Lexical Structure¶

The lexical structure of EasyLang defines how your code is broken into tokens.
Tokens are the smallest meaningful pieces of a program — words, numbers, symbols, keywords, operators, etc.

This section explains how EasyLang reads your source code before parsing it into statements and expressions.

Overview¶

EasyLang’s lexer (tokenizer) splits code into:

Keywords
Identifiers
Numbers
Strings
Operators
Symbols (brackets, parentheses, commas…)
Comments
Whitespace (ignored except for newlines)

Everything here is based on the actual token rules used in the lexer inside the interpreter.

Comments¶

Comments are ignored by the interpreter.

Single-line Comments¶

$ this is a comment

These last until the end of the line.

Multi-line Comments¶

$$
    This is a multi-line comment
$$

Useful for explanations or temporarily disabling blocks of code.
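As a rough illustration of the two comment forms, here is a minimal Python sketch that strips them from source text. It is not the interpreter's actual code: the names `strip_comments`, `MULTI_LINE`, and `SINGLE_LINE` are made up for this example, and the sketch assumes a `$` never appears inside a string literal.

```python
import re

# Hypothetical patterns for illustration only.
# "$$ ... $$" may span several lines; "$" runs to the end of the line.
MULTI_LINE = re.compile(r"\$\$.*?\$\$", re.DOTALL)
SINGLE_LINE = re.compile(r"\$[^\n]*")


def strip_comments(source: str) -> str:
    """Remove comments from source text.

    Multi-line comments are removed first so that "$$" is not
    mistaken for the start of a single-line comment.
    Simplification: "$" inside string literals is not protected.
    """
    without_multi = MULTI_LINE.sub("", source)
    return SINGLE_LINE.sub("", without_multi)


program = 'we let x = 10 $ set x\n$$\n    disabled block\n$$\nso print x\n'
print(strip_comments(program))
```

Handling `$$` before `$` is the important design choice here: the longer marker has to win, otherwise every multi-line comment would be cut off at the end of its first line.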

Keywords¶

Keywords are reserved words with special meaning.
You cannot use these as variable names.

we let
so
print
read
true
false
not
equals
not equals
less
greater
plus
minus
mul
div
and
or
if
then
else
else if
repeat
while
from
to
do
define
return
bring
as
open
close
writeline
readline
for
into
with
continue
break

These keywords map directly to tokens in the lexer.
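Some keywords are made of two words (`we let`, `not equals`, `else if`), so a lexer has to try the longer phrase before the shorter one. The Python sketch below shows one way to do that. It covers only a handful of keywords, and every token name except `WELET`, `SO`, and `PRINT` is a placeholder invented for the example.

```python
import re

# Small subset of the keyword list. Token names other than
# WELET, SO and PRINT are hypothetical.
KEYWORDS = {
    "we let": "WELET",
    "so": "SO",
    "print": "PRINT",
    "not equals": "NOTEQUALS",
    "not": "NOT",
    "else if": "ELSEIF",
    "else": "ELSE",
    "if": "IF",
}


def _phrase_pattern(phrase: str) -> str:
    # Allow any run of spaces or tabs between the words of a phrase.
    return r"[ \t]+".join(re.escape(word) for word in phrase.split())


# Longest phrases first, so "not equals" wins over "not".
_ordered = sorted(KEYWORDS, key=len, reverse=True)
KEYWORD_RE = re.compile(
    r"(?:" + "|".join(_phrase_pattern(p) for p in _ordered) + r")\b"
)


def match_keyword(text: str, pos: int = 0):
    """Return (token_name, end_offset) if a keyword starts at pos, else None."""
    m = KEYWORD_RE.match(text, pos)
    if m is None:
        return None
    phrase = " ".join(m.group().split())  # normalise inner whitespace
    return KEYWORDS[phrase], m.end()


print(match_keyword("we let x = 10"))
print(match_keyword("not equals 3"))
print(match_keyword("sofa"))
```

The trailing `\b` keeps a keyword from matching the front of a longer name, so `sofa` is not read as `so` followed by `fa`.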

Identifiers (Variable Names)¶

Identifiers refer to variable names, function names, module aliases, and dictionary keys.

Rules:¶

Must start with a letter (A–Z or a–z) or _
After that, can include letters, digits, and _

Examples:

x
name
user_age
total2
_value

Invalid Identifiers:

2x (starts with a number)
true (keyword)
if (keyword)
minus (keyword)

If a keyword is used as an identifier, the parser produces a friendly error.
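The identifier rules above fit in one regular expression plus a keyword check. This is a small Python sketch of that idea, not the interpreter's code: `is_valid_identifier` is a made-up helper, and `RESERVED` holds only a few of the keywords for brevity.

```python
import re

# First character: letter or underscore. Rest: letters, digits, underscores.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Illustrative subset of the keyword list, not the full set.
RESERVED = {"true", "false", "if", "minus", "so", "print"}


def is_valid_identifier(name: str) -> bool:
    """True if name follows the identifier rules and is not a keyword."""
    return IDENTIFIER_RE.fullmatch(name) is not None and name not in RESERVED


for candidate in ["x", "user_age", "total2", "_value", "2x", "true", "minus"]:
    print(candidate, is_valid_identifier(candidate))
```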

Numbers¶

EasyLang supports integers and floating-point numbers.

Integer examples:¶

10
0
999

Float examples:¶

3.14
0.001
10.0

Numbers are tokenized automatically: a run of digits, with an optional decimal point, becomes a single number token.
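To make the number rule concrete, here is a Python sketch that reads one number from the start of some text. It assumes digits are required on both sides of the decimal point and that there is no sign or exponent syntax (a leading `-` would be the `minus` operator). Those details are guesses based on the examples above, and `read_number` is a hypothetical helper.

```python
import re

# Assumed shape: digits, optionally followed by "." and more digits.
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def read_number(text: str, pos: int = 0):
    """Return (value, end_offset) if a number starts at pos, else None."""
    m = NUMBER_RE.match(text, pos)
    if m is None:
        return None
    lexeme = m.group()
    # A decimal point makes it a float, otherwise it is an integer.
    value = float(lexeme) if "." in lexeme else int(lexeme)
    return value, m.end()


print(read_number("999"))
print(read_number("3.14 plus 1"))
print(read_number("x"))
```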

Strings¶

Strings are always enclosed in double quotes:

"hello"
"EasyLang is cool!"
"123"
"line one\nline two"

Notes:¶

No single-quoted strings
No multi-line strings
The interpreter removes the quotes and gives you the raw text
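The notes above translate into a short pattern: a double quote, any characters except a quote or a line break, then a closing quote. The Python sketch below is only an illustration (`read_string` is a made-up name). It returns the text between the quotes unchanged and does not interpret sequences such as `\n`, since this section does not say how the interpreter treats them.

```python
import re

# Double quotes only, and the string must end on the same line.
STRING_RE = re.compile(r'"([^"\n]*)"')


def read_string(text: str, pos: int = 0):
    """Return (raw_text, end_offset) if a string literal starts at pos, else None."""
    m = STRING_RE.match(text, pos)
    if m is None:
        return None
    return m.group(1), m.end()  # group(1) is the text without the quotes


print(read_string('"EasyLang is cool!"'))
print(read_string("'single quotes'"))
```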

Boolean Literals¶

EasyLang supports:

true
false

These map to Python True and False values at runtime.

Symbols & Operators¶

EasyLang supports two types of symbols:

1. Punctuation / Structural Symbols¶

Symbol	Meaning
`[`	start block / list
`]`	end block / list
`{`	dictionary start
`}`	dictionary end
`(`	function call start
`)`	function call end
`,`	argument separator
`:`	block/function separator
`.`	attribute or method access

Blocks use:

[
statements...
]

2. Operators¶

Operators can be written in English words OR in symbol form.

Arithmetic Operators¶

English form	Symbol	Meaning
`plus`	`+`	addition
`minus`	`-`	subtraction
`mul`	`*`	multiplication
`div`	`/`	division

Examples:

a plus b
x minus 5
y mul 3
value / 10

Comparison Operators¶

English form	Symbol	Meaning
`equals`	`==`	equality
`not equals`	`!=`	inequality
`less`	`<`	less than
`greater`	`>`	greater than
—	`<=`	less-or-equal
—	`>=`	greater-or-equal

Examples:

if x equals 10 then [...]
if name not equals "John" then [...]

Logical Operators¶

Operator	Meaning
`and`	logical AND
`or`	logical OR
`not`	logical NOT

Example:

if is_admin and logged_in then [...]
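Because most arithmetic and comparison operators have both an English form and a symbol form, a lexer can map both spellings to the same token. Here is a Python sketch of such a lookup table. Only `PLUS` is a token name taken from this page; the other names and the `operator_token` helper are placeholders.

```python
# Both spellings of an operator point at the same token name.
# Token names other than PLUS are hypothetical.
OPERATOR_TOKENS = {
    "plus": "PLUS", "+": "PLUS",
    "minus": "MINUS", "-": "MINUS",
    "mul": "MUL", "*": "MUL",
    "div": "DIV", "/": "DIV",
    "equals": "EQ", "==": "EQ",
    "not equals": "NE", "!=": "NE",
    "less": "LT", "<": "LT",
    "greater": "GT", ">": "GT",
    # These two have a symbol form only.
    "<=": "LE",
    ">=": "GE",
}


def operator_token(lexeme: str):
    """Return the token name for an operator spelling, or None if unknown."""
    normalised = " ".join(lexeme.split())  # collapse inner whitespace
    return OPERATOR_TOKENS.get(normalised)


print(operator_token("plus"), operator_token("+"))
print(operator_token("not equals"), operator_token("!="))
```

With a table like this, `a plus b` and `a + b` reach the parser as the same sequence of tokens, so the parser does not need to know which spelling was used.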

Whitespace¶

Whitespace (spaces and tabs) is ignored by the lexer.

Newlines separate statements unless inside blocks.

For example:

we let x = 10
we let y = 20
so print x plus y

Inside a block:

[
we let x = 10
we let y = 20
]

No indentation rules — only brackets matter.

Token Examples¶

Here is how the lexer would tokenize a small program:

we let x = 10
so print x plus 5

Produces tokens like:

WELET     "we let"
ID        "x"
ASSIGN    "="
NUMBER    10
SO        "so"
PRINT     "print"
ID        "x"
PLUS      "plus"
NUMBER    5

This matches the implementation in the lexer of the interpreter.
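For readers curious how such a token list could be produced, here is a small Python sketch of a regex-based lexer that handles just the two-line program above. It is an illustration under stated assumptions, not the interpreter's real lexer: the `TOKEN_SPEC` table, the `tokenize` function, and the `NEWLINE` and `SKIP` names are invented here, and newlines are simply dropped because the token list above does not show them.

```python
import re

# Order matters: keywords are tried before the general ID rule.
TOKEN_SPEC = [
    ("WELET", r"we[ \t]+let\b"),
    ("SO", r"so\b"),
    ("PRINT", r"print\b"),
    ("PLUS", r"plus\b|\+"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("ASSIGN", r"="),
    ("NEWLINE", r"\n"),   # hypothetical name; dropped in this sketch
    ("SKIP", r"[ \t]+"),  # hypothetical name; spaces and tabs are ignored
]
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source: str):
    """Turn source text into a list of (token_name, value) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind in ("SKIP", "NEWLINE"):
            continue
        if kind == "NUMBER":
            value = float(text) if "." in text else int(text)
        elif kind == "WELET":
            value = "we let"  # normalise spacing between the two words
        else:
            value = text
        tokens.append((kind, value))
    return tokens


for kind, value in tokenize("we let x = 10\nso print x plus 5\n"):
    print(kind, repr(value))
```

A real lexer would likely keep newline tokens, since newlines separate statements, and would cover the full keyword, operator, string, and symbol sets described in this section.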

Summary¶

EasyLang's lexical structure is simple, consistent, and heavily English-inspired:

English keywords
Natural operators
No punctuation-heavy syntax
Strings use quotes
Numbers are auto-detected
Blocks use [ and ]
Comments begin with $ or $$...$$

Next Steps¶

Continue to Grammar to learn how these tokens form full programs.