The well-known parser generator PEG.js originates from David Majda and is now being continued by Futago-za Ryuu. The generator can also be operated purely online without prior installation.
In the lecture "Fundamentals of Computer Science" at the University of Applied Sciences in Stuttgart, students can use PEG.js to actively familiarise themselves with the syntax (not only) of programming languages and the necessary steps in analysing and processing texts.
A guide explains how to use the pages in this website, their behaviour may be adjusted in the settings.
In a first step, an XML file is analysed and converted into an equivalent JSON file.
In order to serve as a teaching example in the limited time available, the grammar used covers only the most important aspects of XML.
XML = Declaration _ Element _
Declaration = '<?xml' (__ Attribute)* _ '?>'
Element = emptyElement / filledElement
emptyElement = '<' _ Tag (__ Attribute)* _ '/>'
filledElement = '<' _ Tag (__ Attribute)* _ '>' (_ Content)* _ '</' _ Tag _ '>'
Content = Element / Text / CDATA
Attribute = Name (_ '=' _ '"' Value '"')?
CDATA = '<![CDATA[' [^\]]* ']]>'
Tag = [a-zA-Z]+
Name = [a-zA-Z]+
Text = [^<]+
Value = [^"]*
_ = [ \t\n\r]*
__ = [ \t\n\r]+
Syntax diagrams are sometimes easier to understand (especially for beginners) than an abstract (E)BNF or PEG notation.
The well-known Railroad Diagram Generator generates the corresponding syntax diagrams for a given grammar (in a BNF-like notation).
XML ::= Declaration Element
Declaration ::= '<?xml' Attribute* '?>'
Element ::= emptyElement | filledElement
emptyElement ::= '<' Tag Attribute* '/>'
filledElement ::= '<' Tag Attribute* '>' Content* '</' Tag '>'
Content ::= Element | Text | CDATA
Attribute ::= Name ('=' '"' Value '"')?
CDATA ::= '<![CDATA[' [^#x5D]* ']]>'
Tag ::= [a-zA-Z]+
Name ::= [a-zA-Z]+
Text ::= [^<]+
Value ::= [^"]*
Decorated with appropriate JavaScript instructions, the pure XML grammar for PEG.js becomes a converter that generates the corresponding JSON equivalent from a given XML file.
{ function indented (Text) {
return Text.split('\n').map(Line => ' ' + Line).join('\n')
} }
XML = Declaration _ Element:Element _
{ return Element }
Declaration = '<?xml' (__ Attribute)* _ '?>'
Element = emptyElement / filledElement
emptyElement = '<' _ Tag:Tag Attributes:(__ Attribute:Attribute { return Attribute })* _ '/>'
{ return '{\n' + indented(
' "Tag":"' + Tag + '"' + (
Attributes.length > 0
? ',\n "Attributes": {' + Attributes.join(', ') + '}'
: ''
) + '\n}')
}
filledElement = '<' _ StartTag:Tag Attributes:(__ Attribute:Attribute { return Attribute })*
_ '>' Contents:(_ Content:Content {return Content})*
_ '</' _ EndTag:Tag _ '>'
{ if (StartTag !== EndTag) {
error('end tag differs from start tag')
}
return '{\n' + indented(
' "Tag":"' + StartTag + '"' + (
Attributes.length > 0
? ',\n "Attributes": {' + Attributes.join(', ') + '}'
: ''
) + (
Contents.length > 0
? ',\n "Contents": [' + Contents.join(', ') + ']'
: ''
) + '\n}')
}
Content = Element / Text / CDATA
Attribute = Name:Name Value:(_ '=' _ '"' inner:Value '"' { return inner })?
{ return (
Value == null
? Name + ':true'
: Name + ':' + Value
) }
CDATA = '<![CDATA[' Content:[^\]]* ']]>'
{ return '"' + Content.join('').replace(/"/g,'\\"') + '"' }
Tag = [a-zA-Z]+ { return text() }
Name = [a-zA-Z]+ { return '"' + text() + '"' }
Text = [^<]+ { return '"' + text().replace(/"/g,'\\"') + '"' }
Value = [^"]* { return '"' + text() + '"' }
_ = [ \t\n\r]* { return '' }
__ = [ \t\n\r]+ { return '' }
The example file is an article for a fictitious blog.
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Mustermann's Blog</title>
<link href="/atom.xml" rel="self"/>
<author>
<name>Mustermann</name>
<email>Max.Mustermann@gmx.de</email>
</author>
<entry>
<title>Just a simple Blog Entry</title>
<content type="html"><![CDATA[<p>...</p>]]></content>
</entry>
</feed>
A file converter is already quite practical, but computer science students actually thirst for their own programming language.
Thanks to PEG.js, the effort required is very manageable, even for a beginner:
Et voilà, you have specified and implemented your own scripting language!
As an example, let's use "slang", a very simple scripting language, which is primarily intended to demonstrate the basic mechanisms of an interpreter (and less for practical use).
slang only knows the following data types
nil
- represents "nothing", i.e. an undefined value (JavaScript undefined
)In particular, slang does not know any data structures (this makes the grammar a bit simpler)
Functions and blocks are "first-class values", i.e. they can be assigned to variables and used as arguments for function calls.
Values of the above data types are defined in the program text as follows
nil
true
or false
0b
(e.g. 0b01101100
) with up to 32 digits/bits0x
(i.e. 0x0123AFFE
) with up to 8 digits (32 bits)\b
, \f
, \n
, \r
, \t
, \v
, \0
, \'
, \"
, \\
as well as \x##
and \u####
){
...}
(...) => {...}
.native (...) =>
string"Identifiers" are the names of context entries, i.e. global, local or lexical variables and functions.
They begin with one of the letters a-z or A-Z or one of the two characters $ or _, optionally followed by one or more further letters, $ or _ or decimal digits (0-9).
Variables and functions, however, must not be named like a reserved word. Reserved words are global
, local
, it
, native
, nil
, true
, false
, nan
, e
, pi
, div
, mod
, and
, or
, not
, if
, then
, else
, select
, when
, otherwise
, for
, from
, down
, to
, by
, repeat
, while
, until
, continue
, break
and return
.
The case of the identifiers is significant.
Contexts group values and functions together so that they can be addressed from functions and blocks by their name.
There are "outer", "inner" and "lexical" contexts.
The outermost context is the "global context" with all global values and functions.
When a function is started, it creates an "inner" context for call parameters and local variables. When reading, context entries are first searched locally and then globally. When writing, entries are always searched for in the local context only (and created there if necessary).
When a block is started, it also creates a local context, but this time in the "lexical" context of the place where it was defined. When reading, context entries are therefore first searched for in the local context, then (recursively if necessary) in the associated lexical contexts and finally also in the global context. When writing, entries are also searched in the local and lexical contexts - but never in the global context. Found entries are changed where they were found, entries not found are created locally.
Blocks contain sequences of instructions that can (but do not have to) be separated from each other in the program text by one or more semicolons, if desired.
Stand-alone expressions are also considered to be instructions.
Each executed instruction returns a result, this is stored in the local variable it
.
The result of a block is the result of the last executed instruction.
Lambdas are (anonymous) functions with 0, 1 or more "parameters". They are defined by expressions of the form
() => {...}
for parameterless lambdas,(parameter) => {...}
for lambdas with one parameter,(parameter, parameter, ...) => {...}
for lambdas with multiple parameters.The parameters are available within the function as local variables and are initially preassigned with the values of the arguments passed during the call.
Lambdas can be called immediately, passed as arguments to other functions or assigned to a variable.
The result of a function is
return
statement."Native functions" are functions programmed in JavaScript and callable from slang and are used to extend the scripting language. They are defined by expressions of the form
native () =>
string for parameterless functions,native (parameter) =>
string for functions with a parameter,native (parameter, parameter, ...) =>
string for functions with multiple parameters.The source text of the native function is given (within slang) as a single or multi-line slang string.
The parameters of the JavaScript function carry the names specified in the definition; the lexical context in which the native function was called is available as a "this" object.
Expressions consist of
slang supports the following operators (with decreasing precedence)
(
...)
(parenthesis)^
(exponentiation)*
, /
, div
(integer division), mod
(modulo division)+
, -
<
, <=
, =
, >=
, >
and <>
(unequal)not
(logical negation)or
(logical disjunction)and
(logical conjunction):=
(assignment)The proper order of mathematical operations (PEMDAS) is taken into account.
Most operators are "left-associative", only assignments, exponentiations and logical negation are evaluated "right-associative".
Assignments change the value of an existing context entry (usually a variable) or create it.
A "generic assignment" has the form
identifier := expression
and changes an existing entry in the local or lexical context or creates a new entry in the local context.
A "local assignment" has the form
local identifier := expression
and always writes an entry into the local context.
A "global assignment" has the form
global identifier := expression
and writes an entry into the global context. Global assignments are the only way to create or change entries in the global context.
Assignments also return a result (and can thus be used in expressions), namely the value that was assigned.
The following statements are available to control program flow:
if (...) then {...}
if (...) then {...} else {...}
if
is true
, the block noted after then
is executed. If the expression is false
, the block noted after else
is executed instead - if such a block exists, otherwise the statement is simply ignored. The result of the statement is the result of the executed block or nil if no block was executed.select {
when (...) then {...}
otherwise {...}
}
when
in order, stops as soon as the first expression with the value true
is found and then executes the block after the corresponding then
. If no when
expression returns the value true
, the block after otherwise
is executed instead - if one exists, otherwise the instruction is ignored. The result of the statement is the result of the executed block or nil if no block was executed.for identifier from ... to ... by ... repeat {...}
for identifier from ... down to ... by ... repeat {...}
repeat
repeatedly and with different values for the given identifier
, where these values start with the value of the expression after from
and are changed by the value of the expression after by
on each pass. In the version with the keyword down
, the loop ends when the value of the expression after to
falls below it, otherwise when it exceeds it. If the expression after by
has the value nil, +1 (in the version without down
) or -1 (in the version with down
) is assumed instead. The identifier
may denote an already existing variable from the lexical context, otherwise the corresponding variable is created in the context of the loop. Local variables created in the loop block retain their value from one pass to the next. The result of the loop is the result of the last execution of the loop blockwhile (...) repeat {...}
repeat
repeatedly as long as the expression noted after while
returns the value true
. Local variables created in the loop block retain their value from one run to the next, and the termination condition is also executed in the context of the loop block. The result of the loop is the result of the last execution of the loop block.repeat {...} until (...)
repeat
repeatedly until the expression noted after until
returns the value true
. Local variables created in the loop block retain their value from one run to the next, and the termination condition is also executed in the context of the loop block. The result of the loop is the result of the last execution of the loop block.break
continue
return ...
return
- if such an expression is missing, nil
is returnedslang supports single-line and (non-nesting) multi-line comments:
// ... to end of line
/* ...single or multiline, non-nestable */
Just like "real" programming languages, slang also comes with a basic set of "intrinsic" (i.e. built-in) values and functions. These are defined in the global context and thus accessible from everywhere - as long as they are not "shadowed" by functions with the same name in the local context.
nil
- represents "nothing", i.e. an undefined value (JavaScript undefined
)true
- represents the logical value for "true"false
- represents the logical value for "false"pi
- contains the value for the circle number pie
- contains the value for the Euler constantThe logical, mathematical and comparative operators can also be invoked as functions - by overwriting these functions, the behaviour of the associated operators can thus be changed. Incidentally, the parser calls operators in expressions directly from the global context, so that local functions of the same name cannot override operators.
neg(a)
- returns number a
with reversed sign (i.e. -a
)plus(a,b)
- adds two numbers a
and b
or appends strings to each otherminus(a,b)
- subtracts number b
from number a
.times(a,b)
- calculates the product of two numbers a
and b
.through(a,b)
- divides number a
by number b
.div(a,b)
- divides number a
by number b
and returns the integer part of the resultmod(a,b)
- returns the remainder of the division of number a
by number b
.raised(a,b)
- returns the b
th power of number a
.
lt(a,b)
- returns true
if and only if a < b
holds truele(a,b)
- returns true
if and only if a <= b
holds trueeq(a,b)
- returns true
if and only if a = b
holds truege(a,b)
- returns true
if and only if a >= b
holds truegt(a,b)
- returns true
if and only if a > b
holds truene(a,b)
- returns true
if and only if a <> b
holds true
not(a)
- returns the logical negation of the boolean value a
.and(a,b)
- returns the logical conjunction of boolean values a
and b
.or(a,b)
- returns the logical disjunction of the boolean values a
and b
.The following mathematical functions are built into slang:
is_nan(a)
- returns true
if and only if a
has the value NaN.
sqrt(a)
- returns the square root of a
sin(a)
- returns the sine of a
cos(a)
- gives the cosine of a
tan(a)
- returns the tangent of a
rnd(a)
- returns a pseudo-random number in the range 0...a
, where a
itself is excludedThe instructions for controlling the program flow are also present as global functions and can consequently be changed. Incidentally, the parser calls statements directly from the global context, so that local functions of the same name cannot override statements.
if_then_else (condition, then_clause, else_clause)
condition
is true
, block then_clause
is executed; if it is false
, block else_clause
is executed - if such a block exists, otherwise the statement is simply ignored. The result of the function is the result of the executed block or nil
if no block was executed.select_when (otherwise_clause, condition, block,...)
condition
blocks or (logical) values of the given condition-block
pairs in order, stops as soon as the first condition
with the result true
is found and then executes the corresponding block
. If no condition
returns the value true
, the otherwise_clause
is executed instead. The result of the function is the result of the executed blockfor_repeat (identifier, downwards, start_value, stop_value, step_value, loop_body)
loop_body
repeatedly with different values for the given identifier
, these values start with start_value
and are incremented by step_value
on each pass. If the boolean argument downwards
has the value true
, the loop ends when the value of identifier
falls below stop_value
, otherwise when it exceeds that limit. If step_value
has the value nil
, +1
(if downwards = false
) or -1
(if downwards = true
) is assumed instead. The identifier
may denote an already existing variable from the lexical context, otherwise the variable is created in the context of the loop. Local variables created in loop_body
retain their value from one run to the next. The result of the loop is the result of the last execution of loop_body
.while_repeat (condition, loop_body)
loop_body
repeatedly as long as an execution of the block condition
returns the value true
. Local variables created in loop_body
retain their value from one run to the next. The result of the loop is the result of the last execution of loop_body
.repeat_until (loop_body, condition)
loop_body
repeatedly until an execution of the block condition
returns the value true
. Local variables created in loop_body
retain their value from one run to the next. The result of the loop is the result of the last execution of loop_body
.break (value)
value
.continue
return (value)
value
.assert (value)
value
has the value true
- if so, the program execution will continue, otherwise it will be aborted with an error messageassert_failure (block)
block
- if so, the error is ignored and program execution is continued, otherwise it is aborted with an error messagelog (value)
value
to the browser consoleThe grammar for slang in PEG.js notation does not require a separate "lexer" (as usual) and is nevertheless surprisingly compact.
script = statement_list? _
block = '{' statement_list _ '}'
statement_list = (_ statement / _ ';' )*
statement = if_statement / select_statement
/ for_statement / while_statement / until_statement
/ continue_statement / break_statement / return_statement
/ expression
if_statement = 'if' _ '(' _ expression _ ')' _ 'then' _ block (_ 'else' _ block)?
select_statement = 'select' _ '{' _ when_clause* otherwise_clause? '}'
when_clause = 'when' _ '(' _ expression _ ')' _ 'then' _ block _
otherwise_clause = 'otherwise' (__ 'do')? _ block _
for_statement = 'for' __ identifier _ 'from' alone _ expression _
('down' _)? 'to' alone _ expression _ ('by' alone _ expression)? _
'repeat' _ block
while_statement = 'while' _ '(' _ expression _ ')' _ 'repeat' _ block
until_statement = 'repeat' _ block _ 'until' _ '(' _ expression _ ')'
continue_statement = 'continue' alone
break_statement = 'break' alone
return_statement = 'return' alone _ expression?
expression = assignment / or_term
or_term = (and_term _ 'or' alone _)* and_term
and_term = (not_term _ 'and' alone _)* not_term
not_term = ('not' alone _)* comparison
comparison = (additive_term _ ('<' ![=>] / '<=' / '=' !'>' / '>=' / '>' !'=' / '<>') _)? additive_term
additive_term = (multiplicative_term _ ('+' / '-') _)* multiplicative_term
multiplicative_term = (exponential_term _ ('*' / '/' / 'div' alone / 'mod' alone) _)* exponential_term
exponential_term = invocation (_ '^' _ exponential_term)?
invocation = primary ( _ '(' _ argument_list? _ ')')?
primary = literal / identifier / '(' _ expression _ ')' ! (_ '=>')
/ lambda_definition / native_definition
assignment = global_assignment / local_assignment / generic_assignment
global_assignment = 'global' __ identifier _ ':=' _ expression
local_assignment = 'local' __ identifier _ ':=' _ expression
generic_assignment = identifier _ ':=' _ expression
lambda_definition = '(' _ parameter_list? _ ')' _ '=>' _ block
native_definition = 'native' _ '(' _ parameter_list? _ ')' _ '=>' _ string
parameter_list = identifier (_ ',' _ identifier)*
argument_list = expression (_ ',' _ expression)*
literal = 'nil' alone / boolean / number / string
boolean = 'true' alone / 'false' alone
number = integer / floating_point / 'pi' alone / 'e' alone / 'nan' alone
integer = binary / decimal / hexadecimal
binary = '0b' [01]+
decimal = [+-]? digit+ ! '.'
hexadecimal = '0x' hex_digit+
floating_point = mantissa exponent?
mantissa = [+-]? (digit+ '.' digit* / '.' digit+)
exponent = [eE] [+-]? digit+
string = single_quoted_text / double_quoted_text /
single_quoted_string / double_quoted_string
single_quoted_string = "'" (escape_sequence / & no_control_char [^\'])* "'"
double_quoted_string = '"' (escape_sequence / & no_control_char [^\"])* '"'
single_quoted_text = "'''" (escape_sequence / !"'''" [\'] / [^\'])* "'''"
double_quoted_text = '"""' (escape_sequence / !'"""' [\"] / [^\"])* '"""'
escape_sequence = ('\\' [bfnrtv0'"\\]) / ('\\x' hex_digit{2}) / ('\\u' hex_digit{4})
no_control_char = [^\x00-\x1F\x7F-\x9F\u200B\u2028\u2029\u2060]
digit = [0-9]
hex_digit = [0-9a-fA-F]
identifier = ! (reserved_word alone) identifier_start identifier_part*
identifier_start = [$_a-zA-Z]
identifier_part = [$_a-zA-Z0-9]
alone = ! identifier_part
reserved_word = 'global' / 'local' / 'it' / 'native'
/ 'nil' / 'true' / 'false' / 'nan' / 'e' / 'pi'
/ 'div' / 'mod' / 'and' / 'or' / 'not'
/ 'if' / 'then' / 'else' / 'select' / 'when' / 'otherwise'
/ 'for' / 'from' / 'down' / 'to' / 'by' / 'repeat'
/ 'while' / 'until' / 'continue' / 'break' / 'return'
comment = (line_comment / block_comment)
line_comment = '//' [^\n\r]* [\n\r]*
block_comment = '/*' (!'*/' .)* '*/'
_ = ([ \t\n\r] / comment)*
__ = ([ \t\n\r] / comment)+
/**** syntax smoke tests ****/
/* block comment */
// line comment
/**** literals ****/
nil
true false
e pi nan; +123; -123. 123.456 .456 .456e+78 .456e-78 .456e78
'' 'single-quoted string' 'escape sequences \b\f\n\r\t\v\0\'\"\\ \x12 \u1234 '
"" "double-quoted string" "escape sequences \b\f\n\r\t\v\0\'\"\\ \x12 \u1234 "
'''
single-quoted text
with ' " and escape sequences \b\f\n\r\t\v\0\'\"\\ \x12 \u1234
'''
"""
double-quoted text
with ' " and escape sequences \b\f\n\r\t\v\0\'\"\\ \x12 \u1234
"""
/**** variable definition and access ****/
a b := true local c := nil global d := pi
a := () => {}
b := (a) => { a }
c := (a,b,c) => { a+b+c }
a := native () => ''
b := native (a) => "return a"
c := native (a,b,c) => "return a+b+c"
/**** statements ****/
if (a = b) then { true }
if (a = b) then { true } else { false }
select {
when (a > b) then { 'greater than' }
when (a = b) then { 'equal' }
otherwise { 'smaller than' }
}
for i from 0 to 9 repeat { i }
for i from 9 down to 0 by -1 repeat { i }
i := 0
while (i < 9) repeat { i := i+1 }
repeat { i := i-1 } until (i = 0)
break continue return
/**** expressions ****/
a + b - c * d / e div f mod g ^ h
a < b b <= c c = d d >= e e > f f <> g
not a or b and not c
With a few additional JavaScript decorations, this grammar is able to output an "abstract syntax tree" (AST).
{
let whitespace = {}
function without_whitespace (list) {
let result = []
for (let i = 0, l = list.length; i < l; i++) {
switch (true) {
case (list[i] === whitespace):
continue
case Array.isArray(list[i]):
result.push(without_whitespace(list[i]))
break
default:
result.push(list[i])
}
}
return result
}
let name_of = Object.assign(Object.create(null),{
'and':'and', 'or':'or', 'not':'not',
'<':'lt', '<=':'le', '=':'eq', '>=':'ge', '>':'gt',
'+':'plus', '-':'minus', '*':'times', '/':'through', 'div':'div', 'mod':'mod',
'^':'raised'
})
function prefixed (operators, operand) {
return (
operators == null
? operand
: operators.reduceRight(
(result,operator) => ['#call',name_of[operator],result], operand
)
)
}
function left_associative (left_operand, operations) {
return (
operations == null
? left_operand
: without_whitespace(operations).reduce(
(result,operation) => ['#call',name_of[operation[0]],result,operation[1]],
left_operand
)
)
}
function right_associative (left_operand, operation) {
return (
operation == null
? left_operand
: (
operation = without_whitespace(operation),
['#call',name_of[operation[0]],left_operand,operation[1]]
)
)
}
function unescaped (char_list) {
return char_list.join('').replace(/\\[bfnrtv0'"\\]|\\x[0-9]{2}|\\u[0-9a-f]{4}/gi,(match) => {
switch (match.charAt(1)) {
case 'b': return '\b'; case 'f': return '\\f'
case 'n': return '\n'; case 'r': return '\\r'
case 't': return '\t'; case 'v': return '\\v'
case '0': return '\0'; case "'": return "'"
case '"': return '"'; case '\\': return '\\'
case 'x':
case 'u': return String.fromCharCode(parseInt(match.slice(2),16))
}
})
}
}
script = statements:statement_list? _ { return statements }
block = '{' statements:statement_list _ '}'
{ return ['#block',statements] }
statement_list = statements:(_ statement / _ ';' )*
{ return statements.filter((stmt) => stmt[1] !== ';').map((stmt) => stmt[1]) }
statement = if_statement / select_statement
/ for_statement / while_statement / until_statement
/ continue_statement / break_statement / return_statement
/ expression
if_statement = 'if' _ '(' _ condition:expression _ ')' _ 'then' _ then_clause:block
else_clause:(_ 'else' _ block:block { return block })?
{ return ['#if_then_else', condition, then_clause, else_clause] }
select_statement = 'select' _ '{' _ when_clauses:when_clause* otherwise_clause:otherwise_clause? '}'
{ return ['#select_when', otherwise_clause].concat(when_clauses) }
when_clause = 'when' _ '(' _ condition:expression _ ')' _ 'then' _ block:block _
{ return [condition,block] }
otherwise_clause = 'otherwise' (__ 'do')? _ block:block _ { return block }
for_statement = 'for' __ identifier:identifier _ 'from' alone _ start_value:expression _
downwards:('down' _ { return true })? 'to' alone _ stop_value:expression _
step_value:('by' alone _ value:expression { return value })? _
'repeat' _ loop_body:block
{ return ['#for_repeat', identifier, ['#value',!!downwards], start_value, stop_value, step_value, loop_body] }
while_statement = 'while' _ '(' _ expression:expression _ ')' _ 'repeat' _ block:block
{ return ['#while_repeat',['#block',expression],block] }
until_statement = 'repeat' _ block:block _ 'until' _ '(' _ expression:expression _ ')'
{ return ['#repeat_until',block,['#block',expression]] }
continue_statement = 'continue' alone { return ['#continue'] }
break_statement = 'break' alone { return ['#break'] }
return_statement = 'return' alone _ argument:expression? { return ['#return'].concat(argument || []) }
expression = assignment / or_term
or_term = left_operand:and_term operations:(_ 'or' alone _ and_term)*
{ return left_associative(left_operand,operations) }
and_term = left_operand:not_term operations:(_ 'and' alone _ not_term)*
{ return left_associative(left_operand,operations) }
not_term = operators:('not' alone _)* operand:comparison
{ return prefixed(operators,operand) }
comparison = left_operand:additive_term operations:(_ ('<' ![=>] / '<=' / '=' !'>' / '>=' / '>' !'=' / '<>') _ additive_term)?
{ return left_associative(left_operand,operations) }
additive_term = left_operand:multiplicative_term operations:(_ ('+' / '-') _ multiplicative_term)*
{ return left_associative(left_operand,operations) }
multiplicative_term = left_operand:exponential_term operations:(_ ('*' / '/' / 'div' alone / 'mod' alone) _ exponential_term)*
{ return left_associative(left_operand,operations) }
exponential_term = left_operand:invocation operation:(_ '^' _ exponential_term)?
{ return right_associative(left_operand,operation) }
invocation = callee:primary args:( _ '(' _ argument_list? _ ')')?
{ return (args == null ? callee : ['#call',callee].concat(args[3] || [])) }
primary = literal
/ 'global' __ identifier:identifier { return ['#get-global',identifier] }
/ 'local' __ identifier:identifier { return ['#get-local',identifier] }
/ identifier:identifier { return ['#get-var',identifier] }
/ '(' _ expression:expression _ ')' ! (_ '=>') { return expression }
/ lambda_definition / native_definition
assignment = global_assignment / local_assignment / generic_assignment
global_assignment = 'global' __ key:identifier _ ':=' _ value:expression
{ return ['#set-global',key,value] }
local_assignment = 'local' __ key:identifier _ ':=' _ value:expression
{ return ['#set-local',key,value] }
generic_assignment = key:identifier _ ':=' _ value:expression
{ return ['#set-var',key,value] }
lambda_definition = '(' _ parameters:parameter_list? _ ')' _ '=>' _ body:block
{ return ['#lambda',parameters,body] }
native_definition = 'native' _ '(' _ parameters:parameter_list? _ ')' _ '=>' _ body:string
{ return ['#native',parameters,body] }
parameter_list = identifier:identifier identifiers:(_ ',' _ identifier)*
{ return [identifier].concat((identifiers || []).map((list) => list[3])) }
argument_list = expression:expression expressions:(_ ',' _ expression)*
{ return [expression].concat((expressions || []).map((list) => list[3])) }
literal = value:('nil' alone { return undefined } / boolean / number / string)
{ return ['#value',value] }
boolean = 'true' alone { return true }
/ 'false' alone { return false }
number = integer / floating_point / 'nan' alone { return NaN }
integer = binary { return parseInt(text().slice(2),2) }
/ decimal { return parseFloat(text()) }
/ hexadecimal { return parseInt(text().slice(2),16) }
binary = '0b' [01]+
decimal = [+-]? digit+ ! '.'
hexadecimal = '0x' hex_digit+
floating_point = mantissa exponent?
mantissa = [+-]? (digit+ '.' digit* / '.' digit+)
exponent = [eE] [+-]? digit+
string = single_quoted_text / double_quoted_text /
single_quoted_string / double_quoted_string
single_quoted_string = "'" content:(escape_sequence / & no_control_char [^\'])* "'" { return unescaped(content) }
double_quoted_string = '"' content:(escape_sequence / & no_control_char [^\"])* '"' { return unescaped(content) }
single_quoted_text = "'''" content:(escape_sequence / !"'''" [\'] / [^\'])* "'''" { return unescaped(content) }
double_quoted_text = '"""' content:(escape_sequence / !'"""' [\"] / [^\"])* '"""' { return unescaped(content) }
escape_sequence = ('\\' [bfnrtv0'"\\]) / ('\\x' hex_digit{2}) / ('\\u' hex_digit{4})
no_control_char = [^\x00-\x1F\x7F-\x9F\u200B\u2028\u2029\u2060]
digit = [0-9]
hex_digit = [0-9a-fA-F]
identifier = ! (reserved_word alone) identifier_start identifier_part*
{ return ['#value',text()] } /* makes control statements callable */
identifier_start = [$_a-zA-Z]
identifier_part = [$_a-zA-Z0-9]
alone = ! identifier_part
reserved_word = 'global' / 'local' / 'native'
/ 'nil' / 'true' / 'false' / 'nan'
/ 'div' / 'mod' / 'and' / 'or' / 'not'
/ 'if' / 'then' / 'else' / 'select' / 'when' / 'otherwise'
/ 'for' / 'from' / 'down' / 'to' / 'by' / 'repeat'
/ 'while' / 'until' / 'continue' / 'break' / 'return'
comment = (line_comment / block_comment) { return whitespace }
line_comment = '//' [^\n\r]* [\n\r]*
block_comment = '/*' (!'*/' .)* '*/'
_ = ([ \t\n\r] / comment)* { return whitespace }
__ = ([ \t\n\r] / comment)+ { return whitespace }
It is not far from the AST output to a full-fledged interpreter - however, some JavaScript code must be written for this, so the domain of pure syntax specification is left behind.
{
let whitespace = {}
function without_whitespace (list) {
let result = []
for (let i = 0, l = list.length; i < l; i++) {
switch (true) {
case (list[i] === whitespace):
case (list[i] === undefined):
continue
case Array.isArray(list[i]):
result.push(without_whitespace(list[i]))
break
default:
result.push(list[i])
}
}
return result
}
/**** constructor functions ****/
function form (type, items) {
Object.assign(this,{ type,items })
}
function block (statement_list, context) {
Object.assign(this,{ statement_list,context })
}
function lambda (parameter_list, statement_list) {
Object.assign(this,{ parameter_list, statement_list })
}
function new_native (parameter_list, native_body) {
return (
parameter_list.length === 0
? new Function(native_body)
: new Function(parameter_list.join(),native_body)
)
}
/**** operator handling ****/
let name_of = Object.assign(Object.create(null),{
'and':'and', 'or':'or', 'not':'not',
'<':'lt', '<=':'le', '=':'eq', '>=':'ge', '>':'gt', '<>':'ne',
'+':'plus', '-':'minus', '*':'times', '/':'through', 'div':'div', 'mod':'mod',
'neg':'neg', '^':'raised'
})
function prefixed (operators, operand) {
return (
operators.length === 0
? operand
: without_whitespace(operators).reduceRight(
(result,operator) => new form('#call-global',[name_of[operator],result]), operand
)
)
}
function left_associative (left_operand, operations) {
return (
operations.length === 0
? left_operand
: without_whitespace(operations).reduce(
(result,operation) => new form('#call-global',[name_of[operation[0]],result,operation[1]]),
left_operand
)
)
}
function right_associative (left_operand, operation) {
return (
operation == null
? left_operand
: (
operation = without_whitespace(operation),
new form('#call-global',[name_of[operation[0]],left_operand,operation[1]])
)
)
}
function unescaped (text) {
return text.replace(/\\[bfnrtv0'"\\]|\\x[0-9]{2}|\\u[0-9a-f]{4}/gi,(match) => {
switch (match.charAt(1)) {
case 'b': return '\b'; case 'f': return '\\f'
case 'n': return '\n'; case 'r': return '\\r'
case 't': return '\t'; case 'v': return '\\v'
case '0': return '\0'; case "'": return "'"
case '"': return '"'; case '\\': return '\\'
case 'x':
case 'u': return String.fromCharCode(parseInt(match.slice(2),16))
}
})
}
function flattened (list) {
return list.reduce((result,sublist) => result.concat(sublist),[])
}
/**** script evaluation ****/
function evaluated_script (statement_list) {
let context = Object.create(global_context)
let callee = new block(statement_list,context)
activation_stack.push({ callee,context })
let result
try {
result = executed_block(callee,[],context)
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
throw new Error('no loop to be continued')
case (signal instanceof loop_termination):
throw new Error('no loop to be terminated')
case (signal instanceof function_termination):
throw new Error('cannot "return" from script')
case (signal instanceof function_replacement):
throw new Error('cannot "return" from script')
default:
throw signal
}
}
activation_stack.pop()
console.log('script result',result)
return result
}
/**** runtime system ****/
let global_context = Object.create(null)
let activation_stack = []
function context_contains (context, var_name) {
return Object.prototype.hasOwnProperty.call(context,var_name)
}
function evaluated (value, context) {
return (value instanceof form ? evaluated_form(value,context) : value)
}
function evaluated_form (form, context) {
let callee, value_list
switch (form.type) {
case '#block': return new block(form.items,context)
case '#get-var': return context[form.items[0]]
case '#set-var': return set_var(context,form.items[0],evaluated(form.items[1],context))
case '#get-global': return global_context[form.items[0]]
case '#set-global': return global_context[form.items[0]] = evaluated(form.items[1],context)
case '#get-local': return (context_contains(context,form.items[0]) ? context[form.items[0]] : undefined)
case '#set-local': return context[form.items[0]] = evaluated(form.items[1],context)
case '#call':
value_list = form.items.map((value) => evaluated(value,context))
return executed(value_list[0],value_list.slice(1))
case '#call_tce':
value_list = form.items.map((value) => evaluated(value,context))
throw new function_replacement(value_list[0],value_list.slice(1))
case '#call-global':
callee = global_context[form.items[0]]
value_list = form.items.slice(1).map((value) => evaluated(value,context))
return executed(callee,value_list)
default: throw new Error('unforeseen form type "' + form.type + '"')
}
}
function executed (callee, argument_list) {
for (;;) {
let context, stack_depth, result
try {
switch (true) {
case (typeof callee === 'function'):
return callee.apply(activation_stack,argument_list)
case (callee instanceof block):
context = Object.create(callee.context)
stack_depth = activation_stack.length
activation_stack.push({ callee,context })
result = executed_block(callee,argument_list,context)
activation_stack.pop()
return result
case (callee instanceof lambda):
context = Object.create(global_context)
stack_depth = activation_stack.length
activation_stack.push({ callee, context })
result = executed_block(callee,argument_list,context)
activation_stack.pop()
return result
default:
console.log('cannot execute',callee,'(',argument_list,')')
throw new Error('cannot execute value of type "' + typeof callee + '"')
}
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
if (callee instanceof lambda) {
throw new Error('no loop to be continued')
}
throw signal
case (signal instanceof loop_termination):
if (callee instanceof lambda) {
throw new Error('no loop to be terminated')
}
throw signal
case (signal instanceof function_termination):
if (callee instanceof lambda) {
activation_stack.length = stack_depth
return signal.value
}
throw signal
case (signal instanceof function_replacement):
if (callee instanceof lambda) {
activation_stack.length = stack_depth
callee = signal.callee; argument_list = signal.argument_list
continue
}
throw signal
default: throw signal
}
}
}
}
function executed_block (callee, argument_list, context) {
if (callee.parameter_list != null) {
callee.parameter_list.forEach((parameter,index) => {
context[parameter] = argument_list[index]
})
}
let statement_list = callee.statement_list
for (let i = 0, l = statement_list.length; i < l; i++) {
context.it = evaluated(statement_list[i],context)
}
return context.it
}
function set_var (context, var_name, var_value) {
let original_context = context
while (context !== global_context) {
if (context_contains(context,var_name)) { return context[var_name] = var_value }
context = Object.getPrototypeOf(context)
}
return original_context[var_name] = var_value
}
/**** special exceptions ****/
function loop_continuation () { /* nop */ }
function loop_termination (value) { this.value = value }
function function_termination (value) { this.value = value }
function function_replacement (callee, argument_list) {
Object.assign(this,{ callee,argument_list })
}
/**** auxiliary functions ****/
function active_context () {
return activation_stack[activation_stack.length-1].context
}/**** runtime environment ****/
function throwError (message) { throw new Error(message) }
function boolean_type (a) {
return (typeof a === 'boolean') ||
throwError('boolean value expected (got ' + typeof a + ')')
}
function boolean_types (a,b) {
return boolean_type(a) || boolean_type(b)
}
function numeric_type (a) {
return (typeof a === 'number') ||
throwError('numeric value expected (got ' + typeof a + ')')
}
function numeric_types (a,b) {
return numeric_type(a) || numeric_type(b)
}
function block_type (a) {
return (a instanceof block) ||
throwError('statement block expected')
}
function identifier_type (a) {
return (typeof a === 'string') && /^[$_a-z][$_a-z0-9]*$/i.test(a) ||
throwError('valid identifier expected (got ' + typeof a + ')')
}
function same_types (a,b) {
return (typeof a === typeof b) ||
throwError('values are of different types (' + typeof a + ' <> ' + typeof b + ')')
}
function addable_types (a,b) {
return (
((typeof a === 'number') || (typeof a === 'string')) &&
((typeof b === 'number') || (typeof b === 'string')) ||
throwError('numeric or literal values expected')
)
}
let assertion_counter = 0
Object.assign(global_context,{
nil:undefined,
true:true, false:false,
plus: (a,b) => addable_types(a,b) && (a+b),
minus: (a,b) => numeric_types(a,b) && (a-b),
times: (a,b) => numeric_types(a,b) && (a*b),
through: (a,b) => numeric_types(a,b) && (a/b),
div: (a,b) => numeric_types(a,b) && Math.trunc(a/b),
mod: (a,b) => numeric_types(a,b) && (a%b),
raised: (a,b) => numeric_types(a,b) && Math.pow(a,b),
neg: (a) => numeric_type(a) && (-a),
lt: (a,b) => same_types(a,b) && (a < b),
le: (a,b) => same_types(a,b) && (a <= b),
eq: (a,b) => (a === b),
ge: (a,b) => same_types(a,b) && (a >= b),
gt: (a,b) => same_types(a,b) && (a > b),
ne: (a,b) => (a !== b),
and: (a,b) => boolean_types(a,b) && (a && b),
or: (a,b) => boolean_types(a,b) && (a || b),
not: (a) => boolean_type(a) && (! a),
pi: Math.PI,
e: Math.E,
is_nan: isNaN,
sqrt: (a) => numeric_type(a) && Math.sqrt(a),
sin: (a) => numeric_type(a) && Math.sin(a),
cos: (a) => numeric_type(a) && Math.cos(a),
tan: (a) => numeric_type(a) && Math.tan(a),
rnd: (a) => numeric_type(a) && (Math.random() * a),
if_then_else: (condition, then_clause, else_clause) => {
switch (condition) {
case true: return block_type(then_clause) && executed(then_clause,[])
case false: return (
else_clause == null
? undefined
: block_type(else_clause) && executed(else_clause,[])
)
default: throw new Error('boolean value expected')
}
},
select_when: function (otherwise_clause /* condition,block... */) {
let clause_list = Array.prototype.slice.call(arguments,1)
for (let i = 0, l = clause_list.length; i < l; i += 2) {
let condition = clause_list[i]
if (
boolean_type(
condition instanceof block
? condition = executed(condition,[])
: condition
) && (condition == true)
) {
return executed(clause_list[i+1],[])
}
}
return otherwise_clause == null ? undefined : executed(otherwise_clause,[])
},
for_repeat: (identifier, downwards, start_value, stop_value, step_value, loop_body) => {
identifier_type(identifier) && boolean_type(downwards) &&
numeric_type(start_value) && numeric_type(stop_value) &&
((step_value == null) || numeric_type(step_value)) &&
block_type(loop_body)
if (step_value === 0) { throw new Error('step value must not be 0') }
if (step_value == null) { step_value = (downwards ? -1 : 1) }
let callee = global_context.for_repeat
let context = Object.create(active_context())
activation_stack.push({ callee,context }); let stack_depth = activation_stack.length
let i, to_be_continued = () => (downwards ? i >= stop_value : i <= stop_value)
for (i = start_value; to_be_continued(); i += step_value) {
context[identifier] = i
try {
executed_block(loop_body,[],context)
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
activation_stack.length = stack_depth
break
case (signal instanceof loop_termination):
activation_stack.length = stack_depth-1
return signal.value
default: throw signal
}
}
}
activation_stack.pop()
return context.it
},
while_repeat: (condition, loop_body) => {
block_type(condition) && block_type(loop_body)
let callee = global_context.while_repeat
let context = Object.create(active_context())
activation_stack.push({ callee,context }); let stack_depth = activation_stack.length
for (;;) {
try {
let condition_value = executed_block(condition,[],context)
if (boolean_type(condition_value) && (condition_value == false)) { break }
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
throw new Error('must not "continue" in loop condition')
case (signal instanceof loop_termination):
throw new Error('must not "break" in loop condition')
default: throw signal
}
}
try {
executed_block(loop_body,[],context)
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
activation_stack.length = stack_depth
continue
case (signal instanceof loop_termination):
activation_stack.length = stack_depth-1
return signal.value
default: throw signal
}
}
}
activation_stack.pop()
return context.it
},
repeat_until: (loop_body, condition) => {
block_type(loop_body)
let callee = global_context.while_repeat
let context = Object.create(active_context())
activation_stack.push({ callee,context }); let stack_depth = activation_stack.length
for (;;) {
try {
executed_block(loop_body,[],context)
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
activation_stack.length = stack_depth
break
case (signal instanceof loop_termination):
activation_stack.length = stack_depth-1
return signal.value
default: throw signal
}
}
try {
let condition_value = executed_block(condition,[],context)
if (boolean_type(condition_value) && (condition_value == false)) { break }
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
throw new Error('must not "continue" in loop condition')
case (signal instanceof loop_termination):
throw new Error('must not "break" in loop condition')
default: throw signal
}
}
}
activation_stack.pop()
return context.it
},
'break': (value) => {
throw new loop_termination( value == null ? undefined : value )
},
'continue': () => {
throw new loop_continuation()
},
'return': (value) => {
throw new function_termination( value == null ? undefined : value )
},
assert: (value) => {
assertion_counter++
if (value === true) {
return true
} else {
throw new Error('assertion #' + assertion_counter + ' failed')
}
},
assert_failure: (block) => {
block_type(block)
assertion_counter++
try {
executed(block,[])
} catch (signal) { return true }
throw new Error(
'assertion #' + assertion_counter + ' failed (block did not fail)'
)
},
log: (value) => {
console.log(value)
}
})
}
script = statements:statement_list _
{ console.log('statements',statements); return evaluated_script(statements) }
block = '{' statements:statement_list _ '}'
{ return new form('#block',statements) }
statement_list = statements:(_ statement / _ ';' )*
{ return statements.filter((stmt) => stmt[1] !== ';').map((stmt) => stmt[1]) }
statement = if_statement / select_statement
/ for_statement / while_statement / until_statement
/ continue_statement / break_statement / return_statement
/ expression
if_statement = 'if' _ '(' _ condition:expression _ ')' _ 'then' _ then_clause:block
else_clause:(_ 'else' _ block:block { return block })?
{ return new form('#call-global',['if_then_else',condition,then_clause,else_clause]) }
select_statement = 'select' _ '{' _ when_clauses:when_clause* otherwise_clause:otherwise_clause? '}'
{ return new form('#call-global',['select_when',otherwise_clause].concat(flattened(when_clauses))) }
when_clause = 'when' _ '(' _ condition:expression _ ')' _ 'then' _ block:block _
{ return [condition,block] }
otherwise_clause = 'otherwise' (__ 'do')? _ block:block _ { return block }
for_statement = 'for' __ identifier:identifier _ 'from' alone _ start_value:expression _
downwards:('down' _ { return true })? 'to' alone _ stop_value:expression _
step_value:('by' alone _ value:expression { return value })? _
'repeat' _ loop_body:block
{ return new form('#call-global',['for_repeat',identifier,!!downwards,start_value,stop_value,step_value,loop_body]) }
while_statement = 'while' _ '(' _ expression:expression _ ')' _ 'repeat' _ loop_body:block
{ return new form('#call-global',['while_repeat',new form('#block',[expression]),loop_body]) }
until_statement = 'repeat' _ loop_body:block _ 'until' _ '(' _ expression:expression _ ')'
{ return new form('#call-global',['repeat_until',loop_body,new form('#block',[expression])]) }
continue_statement = 'continue' alone { return new form('#call-global',['continue']) }
break_statement = 'break' alone (_ result:expression { return result })?
{ return new form('#call-global',['break',result]) }
return_statement = 'return' alone _ result:expression?
{
if ((result instanceof form) && (result.type === '#call')) {
result.type = '#call_tce'
}
return new form('#call-global',['return',result])
}
expression = assignment / or_term
or_term = left_operand:and_term operations:(_ 'or' alone _ and_term)*
{ return left_associative(left_operand,operations) }
and_term = left_operand:not_term operations:(_ 'and' alone _ not_term)*
{ return left_associative(left_operand,operations) }
not_term = operators:('not' alone _)* operand:comparison
{ return prefixed(operators,operand) }
comparison = left_operand:additive_term operation:(_ ('<' ![=>] { return '<' } / '<=' / '=' !'>' { return '=' } / '>=' / '>' !'=' { return '>' } / '<>') _ additive_term)?
{ return left_associative(left_operand,operation == null ? [] : [operation]) }
additive_term = left_operand:multiplicative_term operations:(_ ('+' / '-') _ multiplicative_term)*
{ return left_associative(left_operand,operations) }
multiplicative_term = left_operand:negation_term operations:(_ ('*' / '/' / 'div' alone { return 'div' } / 'mod' alone { return 'mod' }) _ negation_term)*
{ return left_associative(left_operand,operations) }
negation_term = operators:('-' ! (_ digit) _ { return 'neg' })* operand:exponential_term
{ return prefixed(operators,operand) }
exponential_term = left_operand:invocation operation:(_ '^' _ exponential_term)?
{ return right_associative(left_operand,operation) }
invocation = callee:primary args:( _ '(' _ argument_list? _ ')')?
{ return (args == null ? callee : new form('#call',[callee].concat(args[3] || []))) }
primary = literal
/ 'global' __ identifier:identifier { return new form('#get-global',[identifier]) }
/ 'local' __ identifier:identifier { return new form('#get-local',[identifier]) }
/ identifier:identifier { return new form('#get-var',[identifier]) }
/ '(' _ expression:expression _ ')' ! (_ '=>') { return expression }
/ lambda_definition / native_definition
assignment = global_assignment / local_assignment / generic_assignment
global_assignment = 'global' __ key:identifier _ ':=' _ value:expression
{ return new form('#set-global',[key,value]) }
local_assignment = 'local' __ key:identifier _ ':=' _ value:expression
{ return new form('#set-local',[key,value]) }
generic_assignment = key:identifier _ ':=' _ value:expression
{ return new form('#set-var',[key,value]) }
lambda_definition = '(' _ parameters:parameter_list? _ ')' _ '=>' _ body:block
{ return new lambda(parameters || [],body.items) }
native_definition = 'native' _ '(' _ parameters:parameter_list? _ ')' _ '=>' _ body:string
{ return new_native(parameters || [],body) }
parameter_list = identifier:identifier identifiers:(_ ',' _ identifier)*
{ return [identifier].concat((identifiers || []).map((list) => list[3])) }
argument_list = expression:expression expressions:(_ ',' _ expression)*
{ return [expression].concat((expressions || []).map((list) => list[3])) }
literal = 'nil' alone { return undefined }
/ boolean / number / string
/ block
boolean = 'true' alone { return true }
/ 'false' alone { return false }
number = binary / hexadecimal / decimal / 'nan' alone { return NaN }
binary = '0b' [01]+ { return parseInt(text().slice(2),2) }
hexadecimal = '0x' hex_digit+ { return parseInt(text().slice(2),16) }
decimal = mantissa exponent? { return parseFloat(text()) }
mantissa = [+-]? (digit+ ('.' digit*)? / '.' digit+)
exponent = [eE] [+-]? digit+
string = single_quoted_text / double_quoted_text /
single_quoted_string / double_quoted_string
single_quoted_string = "'" content:(escape_sequence / & no_control_char char:[^\'] { return char })* "'" { return unescaped(content.join('')) }
double_quoted_string = '"' content:(escape_sequence / & no_control_char char:[^\"] { return char })* '"' { return unescaped(content.join('')) }
single_quoted_text = "'''" content:(escape_sequence / !"'''" char:[\'] { return char } / [^\'])* "'''" { return unescaped(content.join('')) }
double_quoted_text = '"""' content:(escape_sequence / !'"""' char:[\"] { return char } / [^\"])* '"""' { return unescaped(content.join('')) }
escape_sequence = (('\\' [bfnrtv0'"\\]) / ('\\x' hex_digit{2}) / ('\\u' hex_digit{4}))
{ return text() }
no_control_char = [^\x00-\x1F\x7F-\x9F\u200B\u2028\u2029\u2060]
digit = [0-9]
hex_digit = [0-9a-fA-F]
identifier = ! (reserved_word alone) identifier_start identifier_part*
{ return text() }
identifier_start = [$_a-zA-Z]
identifier_part = [$_a-zA-Z0-9]
alone = ! identifier_part
reserved_word = 'global' / 'local' / 'native'
/ 'nil' / 'true' / 'false' / 'nan'
/ 'div' / 'mod' / 'and' / 'or' / 'not'
/ 'if' / 'then' / 'else' / 'select' / 'when' / 'otherwise'
/ 'for' / 'from' / 'down' / 'to' / 'by' / 'repeat'
/ 'while' / 'until' / 'continue' / 'break' / 'return'
comment = (line_comment / block_comment) { return whitespace }
line_comment = '//' [^\n\r]* [\n\r]*
block_comment = '/*' (!'*/' .)* '*/'
_ = ([ \t\n\r] / comment)* { return whitespace }
__ = ([ \t\n\r] / comment)+ { return whitespace }
The following examples show the use of different values, operators and statements - and at the same time serve as a kind of "smoke test" for the interpreter
assert(nil = nil)
assert(true)
assert(not false)
assert(pi > 3)
assert(e > 2)
assert(is_nan(nan))
assert(0b1001 = 9)
assert(0x11 = 17)
assert(-123 < 0)
assert(+123.456 > 123)
assert(.123e3 = 123)
assert('Test' = "Test")
assert('''
Test
''' = """
Test
""")
assert('''
Test
''' = '\n Test\n ')
global test := 'Test ' + (0b0001+2*0x03)
assert(test = 'Test 7')
assert((true and false) = false)
assert((true or false) = true)
assert((not true) = false)
assert(0 < 1)
assert(0 <= 0)
assert(1 = 1)
assert(1 >= 1)
assert(1 > 0)
assert(1 <> 0)
assert(1 + 1 = 2)
assert(1 - 1 = 0)
assert(1 * 1 = 1)
assert(1 / 1 = 1)
assert(3 div 2 = 1)
assert(3 mod 2 = 1)
assert(-(1) < 0)
assert(2 ^ 3 = 8)
global a := 'global a'
a := 'a'
{
assert(a = 'a')
assert(global a = 'global a')
a := 'context a'
assert(a = 'context a')
assert(global a = 'global a')
local a := 'local a'
assert(a = 'local a')
assert(global a = 'global a')
}()
assert(a = 'context a')
a := 'test'
block := { b := 'function ' b + a }
assert (block() = 'function test')
fn := (a,b) => { return a + 'function ' + b }
assert (fn('this is a ',a) = 'this is a function test')
fn := native (a,b) => "return a + 'function ' + b"
assert (fn('this is a ',a) = 'this is a function test')
a := 0
if (a = 0) then { 'zero' } else { 'not zero' }
assert(it = 'zero')
if (a < 0) then { 'negative' } else { 'not negative' }
assert(it = 'not negative')
select {
when (a < 0) then { 'negative' }
when (a = 0) then { 'zero' }
otherwise { 'positive' }
}
assert(it = 'zero')
test := ''
for i from 1 to 10 by 2 repeat { test := test + i }
assert(test = '13579')
test := ''
for i from 9 down to 1 by -2 repeat { test := test + i }
assert(test = '97531')
test := '' i := 1
while (i < 10) repeat {
if (i mod 2 = 0) then { test := test + i }
i := i + 1
}
assert(test = '2468')
test := '' i := 1
repeat {
if (i mod 2 = 0) then { test := test + i }
i := i + 1
} until (i < 10)
assert(test = '2468')
test := ''
for i from 1 to 10 repeat {
if (i mod 2 = 0) then { continue }
test := test + i
}
assert(test = '13579')
test := '' i := 1
while (i < 100) repeat {
if (i mod 2 = 0) then { test := test + i }
i := i + 1
if (i > 9) then { break } else { continue }
}
assert(test = '2468')
One of the special features of "slang" is the elimination of "tail calls" and especially "tail recursions" - in this way, loops can be safely replaced by recursions (similar to some strictly functional programming languages).
Among the classic examples of functions that can be rewritten for tail recursion elimination are the mathematical factorial and the calculation of the Fibonacci series - even though modern languages (such as JavaScript) are nowadays more likely to reach the limit of the numerical value range on powerful computers than to encounter problems with recursion.
native_factorial := native (n) => '''
function factorial (n) {
return (n <= 1 ? 1 : n * factorial(n-1))
}
return factorial(n)
'''
global factorial := (n) => {
if (n <= 1) then { 1 } else { n * factorial(n-1) }
}
assert(native_factorial(170) = factorial(170))
native_fibonacci := native (n) => '''
function fibonacci (n, aux1, aux2) {
if (n <= 1) {
return aux1 + aux2
} else {
return fibonacci(n-1, aux1 + aux2, aux1)
}
}
return fibonacci(n-1,1,0)
'''
global fibonacci := (n, aux1, aux2) => {
if ((aux1 = nil) or (aux2 = nil)) then {
return fibonacci(n-1,1,0)
} else {
if (n <= 1) then {
return aux1 + aux2
} else {
return fibonacci(n-1, aux1 + aux2, aux1)
}
}
}
assert(native_fibonacci(1476) = fibonacci(1476))
If you want to bring a modern computer to its knees, you have to resort to more "violent" methods - e.g. using recursions instead of loops.
native_sum_up := native (n, sum) => '''
function sum_up (n, sum) {
return (n <= 0 ? sum : sum_up(n-1,sum+n))
}
return sum_up(n,sum)
'''
native_sum_up(100000,0)
// Maximum call stack size exceeded
global sum_up := (n, sum) => {
if (n <= 0) then { sum } else { return sum_up(n-1,sum+n) }
}
sum_up(100000,0)
This web page uses the following third-party libraries, assets or StackOverflow answers:
The author would like to thank the developers and authors of the above-mentioned contributions for their effort and willingness to make their works available to the general public.