Snail Lexical Structure
Lexical Structure
The lexical structure is designed to be fairly simple to implement. There are some deviations from popular programming languages (e.g., handling of escape sequences); these are typically to reduce the burden of implementing lexical analysis.
Integers
Integers are non-empty strings of digits 0-9. It is a lexer error if the a literal integer constant is too big to be represented as a 64-bit signed integer. 64-bit signed integers range from \(-2^{63}\) to \(2^{63}-1\).
Identifiers
Identifiers are strings (other than keywords) that conform to a modified Unicode’s identifier specification. In particular, snail uses the “XID” identifier specification. In addition, snail allows identifiers to start with an “_” character. (i.e both “_snail” and “_” are valid identifiers)
Both variable names and class names are treated as identifiers in snail. Notably, keywords, are treated separately from identifiers.
String Literals
Strings are enclosed in double quotes (i.e., “…”). Snail only recognizes two
escape sequences, \"
and \\
, during lexing. This allows double quotes to
be embedded in a string literal and a backslash to be the last character in a
string.
"Grace Hopper said, \"A ship in port is safe, but that’s not what ships are built for.\""
Two escape sequences are handled by the IO
module: \n
and \t
. These are
not interpreted or transformed by the lexer in any way.
Note that escape sequences are not converted to a single character. They are
left as a \
followed by the next character. This will simplify lexing and
interpretation, but differs from other languages.
class Main : IO {
main() {
print_string("She said, \"What?\"");
};
};
// outputs: She said, \"What?\"
Strings may not contain the null character (with integer value 0). Newline characters and carriage returns are also not allowed. All other characters are allowed.
Strings must have an opening and closing quote. The lexer must reject source code that contains malformed strings.
Comments
Line comments begin with //
and continue to the next newline (or end of file).
Block comments are also supported using /* ... */
syntax. Block comments may
be nested and must be terminated before the end of file.
Keywords
The following identifiers are treated as keywords in snail:
class
else
if
isvoid
let
new
while
true
false
All keywords in snail are case insensitive. That is, any capitalization of a keyword is still treated as a keyword.
Whitespace
Snail follows the definition of whitespace provided by the Unicode specification.
Operators and Punctuation
Refer to Langauge Basics or SL-LEX Format for additional lexemes in the language.