Interchange Formats

The snail specification provides formats for serializing tokens and abstract syntax trees. These formats designed to be simple to parse either manually or using common tools.

SL-LEX Format

The SL-LEX format is used to save a sequence (stream) of tokens to a file. Files with the .sl-lex suffix follow a simple serialization format.

Each token is represented by a triple (or quadruple) of lines. The first line holds the line number. The second line holds the column number. The third line gives the name of the token. The optional fourth line holds additional information (i.e., the lexeme) for identifiers, integers, and strings.

Line and Column Numbers

The first line in a file is line 1. Each successive newline character (\n) increments the line count. Column numbers begin with column 1 and resets on each newline.

Example

Given the following input:

1
2
Backslash !
   "allowed"

The corresponding SL-LEX output is:

1
1
ident
Backslash
1
11
not
2
5
string
allowed

Note that the lexeme for a string literal does not include the double quotes, and thus the column number is that of the first character of the string itself.

The table below maps tokens to their SL-LEX name.

SL-LEX name Token
at @
assign =
class\(^\dagger\) class
colon :
comma ,
divide /
dot .
else\(^\dagger\) else
equals ==
false\(^\dagger\) false
ident\(^*\) identifier
if\(^\dagger\) if
int\(^*\) integer literal
isvoid\(^\dagger\) isvoid
lbrace {
lbracket [
let\(^\dagger\) let
lparen (
lt <
lte <=
minus -
new\(^\dagger\) new
not !
plus +
rbrace }
rbracket ]
rparen )
semi ;
string\(^*\) string literal
times *
true\(^\dagger\) true
uminus ~
while\(^\dagger\) while

\(^*\) Contains fourth line with contents of lexeme
\(^\dagger\) Case insensitive

SL-AST Format

The SL-AST format is used to save an abstract syntax tree representing a syntactically valid snail program to a file. Files with the .sl-ast suffix are formatted using JSON, a common data interchange format supported by many programming languages.

A snail AST is structured as an array of JSON object elements, each representing a class in the source program. Beyond requiring proper nesting of object (and the correct ordering of arguments and parameters), SL-AST does not specify a particular ordering of parameters in objects, order of features in classes, or order of classes in a program. It is generally good practice to maintain the same order as the source snail program, however.

Formal Specification

SL-AST is formally specified by a JSON schema. The most recent version of the schema is available here. Any number of JSON schema validator tools may be used to verify the conformity of an SL-AST file to this specification.

Informal Specification

The following is an informal discussion of each of the object types found in an SL-AST file. The formal specification should be referenced for precise descriptions.

Classes

The following object structure is used to define a class:

{
    "class_name": < name of class >,
    "inherits": < name of class to inherit from >,
    "members": [ < array of member objects > ],
    "methods": [ < array of method objects > ]
}

The class_name, members, and methods properties are required.

Members

The following object structure is used to define a member variable:

{
    "name": { < identifier object of the variable name > },
    "type": "member",
    "init": { < expression object of the initializer value > }
}

The name and type properties are required.

Methods

The following object structure is used to define a method:

{
    "name": { < identifier object of the method name > },
    "type": "method",
    "parameters": [ < array of identifier objects > ],
    "body": { < expression object of the body > }
}

All properties are required. Note that the body property will always be a block expression.

Identifiers

The following object structure is used to define an identifier:

{
    "line": < integer line number >,
    "col": < integer column number >,
    "value": < string of the identifier name >
}

All properties are required. The line and column numbers are positive and one-indexed. They always refer to the first character in the lexeme associated with the token.

Expressions

The following object structure is used to define an expression:

{
    "line": < integer line number >,
    "col": < integer column number >,
    "value": { < expression_value object > }
}

All properties are required. The line and column numbers are positive and one-indexed. They always refer to the first character of the first lexeme associated with the expression. The value property will be one of several valid objects described below.

Expression Values

The following expression values are supported. All properties are required unless otherwise noted.

  • Assignment Expression (id = exp)
    {
        "type": "assign",
        "lhs": { < identifier object of id > },
        "rhs": { < expression object of exp > }
    }
    
  • Array Assignment Expression (e1[e2] = e3)
    {
        "type": "array-assign",
        "lhs": { < expression object of e1 > },
        "index": { < expression object of e2 > }, 
        "rhs": { < expression object of e3 > }
    }
    
  • Dynamic Dispatch Expression (e1.id(args))
    {
        "type": "dynamic-dispatch",
        "object": { < expression object of e1 > },
        "method": { < identifier object of id > }, 
        "args": [ < array of argument expression objects > ]
    }
    
  • Static Dispatch Expression (e1@id1.id2(args))
    {
        "type": "static-dispatch",
        "object": { < expression object of e1 > },
        "class": { < identifier object of id1 > },
        "method": { < identifier object of id2 > }, 
        "args": [ < array of argument expression objects > ]
    }
    
  • Self Dispatch Expression (id(args))
    {
        "type": "self-dispatch",
        "method": { < identifier object of id > }, 
        "args": [ < array of argument expression objects > ]
    }
    
  • If Expression (if(e1) e2 else e3)
    {
        "type": "if",
        "guard": { < expression object of e1 > },
        "then": { < expression object of e2 > },
        "else": { < expression object of e3 > }
    }
    

    Note that the then and else properties will always be block expressions in snail.

  • While Expression (while(e1) e2)
    {
        "type": "while",
        "guard": { < expression object of e1 > },
        "body": { < expression object of e2 > },
    }
    

    Note that the body property will always be a block expression in snail.

  • Block Expression ({ e1; e2; ... })
    {
        "type": "block",
        "body": [ < array of expression objects > ]
    }
    
  • Let Expression (let id [= exp]?)
    {
        "type": "let",
        "lhs": { < identifier object of id > },
        "rhs": { < expression object of exp > }
    }
    

    Note that property rhs is optional and is only provided if the local variable is initialized.

  • New Expression (new Class)
    {
        "type": "new",
        "class": { < identifier object of Class > }
    }
    
  • New Array Expression (new[exp] Array)
    {
        "type": "new-array",
        "size": { < expression object of exp > }
    }
    
  • Is Void Expression (isvoid(exp))
    {
        "type": "isvoid",
        "body": { < expression object of exp > }
    }
    
  • Addition Expression (e1 + e2)
    {
        "type": "plus",
        "lhs": { < expression object of e1 > },
        "rhs": { < expression object of e2 > }
    }
    
  • Subtraction Expression (e1 - e2)
    {
        "type": "minus",
        "lhs": { < expression object of e1 > },
        "rhs": { < expression object of e2 > }
    }
    
  • Multiplication Expression (e1 * e2)
    {
        "type": "times",
        "lhs": { < expression object of e1 > },
        "rhs": { < expression object of e2 > }
    }
    
  • Division Expression (e1 / e2)
    {
        "type": "divide",
        "lhs": { < expression object of e1 > },
        "rhs": { < expression object of e2 > }
    }
    
  • Equals Comparison Expression (e1 == e2)
    {
        "type": "equals",
        "lhs": { < expression object of e1 > },
        "rhs": { < expression object of e2 > }
    }
    
  • Less-Than Comparison Expression (e1 < e2)
    {
        "type": "lt",
        "lhs": { < expression object of e1 > },
        "rhs": { < expression object of e2 > }
    }
    
  • Less-Than-Or-Equal-To Comparison Expression (e1 <= e2)
    {
        "type": "lte",
        "lhs": { < expression object of e1 > },
        "rhs": { < expression object of e2 > }
    }
    
  • Not Expression (!exp)
    {
        "type": "not",
        "body": { < expression object of exp > }
    }
    
  • Negative Expression (~exp)
    {
        "type": "negate",
        "body": { < expression object of exp > }
    }
    
  • Array Access Expression (e1[e2])
    {
        "type": "array-access",
        "object": { < expression object of e1 > },
        "index": { < expression object of e2 > }
    }
    
  • Identifier Expression (id)
    {
        "type": "identifier",
        "value": { < identifier object of id > }
    }
    
  • Number Expression (int)
    {
        "type": "number",
        "line": < integer line number >,
        "col": < integer column number >,
        "value": < integer value of literal >
    }
    

    Note that line and col properties are non-zero, one-indexed integers representing the line and column of the first digit in the number lexeme.

  • String Expression (string)
    {
        "type": "string",
        "line": < integer line number >,
        "col": < integer column number >,
        "value": < string value of literal >
    }
    

    Note that line and col properties are non-zero, one-indexed integers representing the line and column of the first digit in the string lexeme.

  • Boolean Expression (bool)
    {
        "type": "bool",
        "value": < boolean value of bool >
    }
    

Updated: