Snowflake Cheat Sheet by Mackenzie High
What is It?
Syntax
Named And-Predicate Rule
name = & item;
Named Character Rule
name = character-class;
Named Choice Rule
name = option1 / option2 / … / option n;
Named Not-Predicate Rule
name = ! operand;
Named Option Rule
name = operand ?;
Named Plus Rule
name = operand +;
Named Repetition Rule
name = operand { minimum , maximum };
Named Sequence Rule
name = element1 , element2 , … , element n;
Named Star Rule
name = operand *;
Named String Rule
name = "text";
Anonymous And-Predicate Rule
( & item )
Anonymous Character Rule
( character-class )
Anonymous Choice Rule
( option1 / option2 / … / option n )
Anonymous Not-Predicate Rule
( ! operand )
Anonymous Option Rule
( operand ? )
Anonymous Plus Rule
( operand + )
Anonymous Repetition Rule
( operand { minimum , maximum } )
Anonymous Sequence Rule
( element1 , element2 , … , element n )
Anonymous Star Rule
( operand * )
Anonymous String Rule
"text"
End of Input Predicate
END
Character Class - Combination
[ class1 class2 … class n ]
Character Class - Exclusion
[ includes ^ excludes ]
Character Class - Negation
[ ^ class1 class2 … class n ]
Character Class - Range
minimum - maximum
Character Class - Single
'character1' OR digitsC
Directive - Root (required)
%root name;
Directive - Hide
%hide;
Directive - Package
%package "Fully Qualified Name of Package";
Directive - Parser
%parser "Name of Parser Class";
Directive - Trace
%trace digits;
Directive - Visitor
%visitor "Name of Visitor Class";
Directive - Export Parser
%export-parser "filepath";
Directive - Export Visitor
%export-visitor "filepath";
Example Character Classes
What does it match?
Character Class
any uppercase letter
'A' - 'Z'
any lowercase letter
'a' - 'z'
any letter or digit
[ 'A'-'Z' 'a'-'z' '0'-'9' ]
line feed (i.e. '\n')
10C
carriage return (i.e. '\r')
13C
space character
32C
tab character
9C
whitespace
[ 9C 10C 13C 32C ]
any uppercase letter except 'X' & 'Y'
[ 'A'-'Z' ^ 'X'-'Y' ]
any character
MIN-MAX
any character except 'M'
[^ 'M']
single character 'A'
'A'
single character 'A'
65C
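The character class forms above can be modelled as predicates over a single character. The following is a minimal Python sketch, not Snowflake's own implementation; every helper name (`single`, `char_range`, `combination`, `exclusion`, `negation`) is hypothetical and chosen only to mirror the names in the syntax table.

```python
# Hypothetical helpers that mirror the character class forms.
# Each helper returns a predicate: a function from one character to bool.

def single(ch):
    """Single class: accepts 'A' or a numeric code such as 65 (written 65C)."""
    code = ch if isinstance(ch, int) else ord(ch)
    return lambda c: ord(c) == code

def char_range(lo, hi):
    """Range class, e.g. 'A' - 'Z' (both ends inclusive)."""
    return lambda c: ord(lo) <= ord(c) <= ord(hi)

def combination(*classes):
    """Combination class [ c1 c2 ... ]: matches if any member matches."""
    return lambda c: any(cls(c) for cls in classes)

def exclusion(includes, excludes):
    """Exclusion class [ includes ^ excludes ]."""
    return lambda c: includes(c) and not excludes(c)

def negation(*classes):
    """Negation class [ ^ c1 c2 ... ]: matches if no member matches."""
    return lambda c: not any(cls(c) for cls in classes)

# Counterparts of three rows from the table above.
whitespace = combination(single(9), single(10), single(13), single(32))
upper_not_xy = exclusion(char_range('A', 'Z'), char_range('X', 'Y'))
not_m = negation(single('M'))
```

For instance, `whitespace('\t')` is true, `upper_not_xy('X')` is false, and `single(65)` accepts the same character as `single('A')`.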
Common Rules
What does it match?
Rule
whitespace
WS = [ 9C 10C 13C 32C ] *;
signed integer
INT = ('-' ?) , ('0'-'9' +);
signed float, no exponent
FLOAT = ('-' ?) , ('0'-'9' +) , '.' , ('0'-'9' +);
string literal
STRING = '"' , ([^ '"'] *) , '"';
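For readers more familiar with regular expressions, the four common rules above have close counterparts in Python's `re` module. This is only a rough comparison under the assumption that the rules are read as plain patterns: regular expressions backtrack while PEG repetitions do not, although for these particular rules the results should coincide. The `matches` helper is hypothetical.

```python
import re

# Assumed regular-expression counterparts of WS, INT, FLOAT and STRING.
WS = re.compile(r"[\t\n\r ]*")            # codes 9, 10, 13 and 32
INT = re.compile(r"-?[0-9]+")
FLOAT = re.compile(r"-?[0-9]+\.[0-9]+")
STRING = re.compile(r'"[^"]*"')

def matches(pattern, text):
    """Return True when the pattern matches the whole text."""
    return pattern.fullmatch(text) is not None
```

Under this reading, `matches(INT, "-42")` and `matches(FLOAT, "-3.14")` hold, while `matches(FLOAT, "3.")` does not, because the rule requires at least one digit after the point.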
About the Directives
Directive
What does it do?
%root name;
Required. This directive tells the parser which grammar rule is the root.
%hide;
Optional. This directive causes the generated parser and generated visitor to have package-private access, rather than public access.
%package "Fully Qualified Name of Package";
Optional. This directive specifies the package that the generated parser and generated visitor will be located within.
%parser "Name of Parser Class";
Optional. This directive specifies the name of the generated parser class.
%visitor "Name of Visitor Class";
Optional. This directive specifies the name of the generated visitor class.
%trace digits;
Optional. This directive specifies the depth of the debugging back-trace. This directive is usually not very useful to end users.
%export-parser "filepath";
Optional. This directive causes the generated parser to be written to a file.
%export-visitor "filepath";
Optional. This directive causes the generated visitor to be written to a file.
Important Reminders
• PEG-based parsers succeed fast. In other words, if a prefix of the input matches the grammar but the rest of the input does not, the parser succeeds without consuming all of the input. The fix is simple: use the END predicate rule, which succeeds only when the entire input has been matched.
• The GUI expects the input to contain line-feed (i.e. '\n') style newlines, and it is designed to ensure this across platforms. Thus, if you design your grammar specifically for Windows newlines, the grammar may appear to fail to match the input. It is recommended that you design your grammar to accept the newline styles of multiple platforms.
• To use the generated parser and generated visitor, “snowflake.jar” must be on the classpath of the program that uses them.
• In a PEG-based parser, the lexer is integrated with the parser. As a result, you must use whitespace rules in your grammar, unless your input will never contain any whitespace.
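The first two reminders can be illustrated with a small hand-written matcher. This is a Python sketch of the general PEG behaviour, not code generated by Snowflake; all function names are hypothetical. Each matcher takes the input and a position and returns the new position, or None on failure.

```python
def match_digits(text, pos):
    """Match one or more ASCII digits at pos."""
    end = pos
    while end < len(text) and '0' <= text[end] <= '9':
        end += 1
    return end if end > pos else None

def match_newline(text, pos):
    """Match a CR LF pair, a lone LF, or a lone CR, tried in that order."""
    for newline in ("\r\n", "\n", "\r"):
        if text.startswith(newline, pos):
            return pos + len(newline)
    return None

def match_end(text, pos):
    """End-of-input check: succeeds only when nothing remains; consumes nothing."""
    return pos if pos == len(text) else None

def parse_prefix(text):
    """Like a root rule without END: a matching prefix is enough."""
    return match_digits(text, 0) is not None

def parse_all(text):
    """Like a root rule that ends with END: the whole input must match."""
    pos = match_digits(text, 0)
    return pos is not None and match_end(text, pos) is not None
```

Here `parse_prefix("123abc")` succeeds even though `abc` is never consumed, while `parse_all("123abc")` fails. The newline matcher accepts Windows, Unix, and old Mac line endings alike, which is the kind of tolerance the second reminder recommends.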
Example Grammar
# The input must be a phrase and the input must be fully consumed.
# A phrase can be either a greeting or parting phrase.
# A greeting phrase is the word "Hello" followed by the name of a terrestrial planet.
# A parting phrase is the word "Goodbye" followed by the name of a terrestrial planet.
%root input;
input = phrase , END;
phrase = greeting_phrase / parting_phrase;
greeting_phrase = WS , "Hello" , WS , planet , WS;
parting_phrase = WS , "Goodbye" , WS , planet , WS;
planet = "Mercury" / "Venus" / "Earth" / "Mars";
# Whitespace: Zero-Or-More Tabs, Line-Feeds, Carriage-Returns, and/or Spaces
WS = [9C 10C 13C 32C] *;
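To make the behaviour of the example grammar concrete, here is a hand-written Python recognizer that follows the same rules one for one. It is a sketch for illustration only and is not the parser that Snowflake generates; the function names are hypothetical. Positions are returned on success and None on failure.

```python
PLANETS = ("Mercury", "Venus", "Earth", "Mars")
WHITESPACE = "\t\n\r "  # codes 9, 10, 13 and 32

def skip_ws(text, pos):
    """WS: zero or more whitespace characters; always succeeds."""
    while pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    return pos

def match_literal(text, pos, literal):
    """String rule: the literal must appear at pos."""
    return pos + len(literal) if text.startswith(literal, pos) else None

def match_planet(text, pos):
    """planet: ordered choice, the first alternative that matches wins."""
    for name in PLANETS:
        end = match_literal(text, pos, name)
        if end is not None:
            return end
    return None

def match_phrase_with(text, pos, word):
    """Sequence: WS , word , WS , planet , WS"""
    pos = match_literal(text, skip_ws(text, pos), word)
    if pos is None:
        return None
    pos = match_planet(text, skip_ws(text, pos))
    if pos is None:
        return None
    return skip_ws(text, pos)

def match_input(text):
    """input = phrase , END; the greeting is tried before the parting."""
    for word in ("Hello", "Goodbye"):
        end = match_phrase_with(text, 0, word)
        if end is not None:
            # The choice is committed; END then requires all input consumed.
            return end == len(text)
    return False
```

Because WS matches zero or more characters, `match_input("HelloEarth")` succeeds just as `match_input("Hello Earth")` does, whereas `match_input("Hello Earth!")` fails at the END check.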
About the Rules Type of Rule
When does it successfully match?
Character Rule
The given character class matches the next input character.
String Rule
The given string of N characters matches the next N input characters.
Sequence Rule
Each rule in the sequence matches, one after another, in the order given.
Choice Rule
One of the options must match. Remember, the options are tried in the order in which they appear in the rule, and the first option that matches is taken. This ordered choice is one of the major enhancements of PEG grammars.
Option Rule
An option rule makes another rule optional: the operand rule may match once or not at all.
Star Rule
The operand rule must be repeated zero-or-more times.
Plus Rule
The operand rule must be repeated one-or-more times.
Repetition Rule
The operand rule must be repeated between minimum and maximum times.
And Predicate
The operand rule must match at the current position, but no input is consumed.
Not Predicate
The operand rule must not match at the current position, and no input is consumed.
End of Input
The input must have already been fully consumed.
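The matching conditions in the table above can be expressed as a handful of small parser combinators. The following Python sketch shows the general PEG semantics under the assumption that a parser is a function from (text, position) to a new position or None; it is not Snowflake's implementation, and all names are hypothetical.

```python
def string(literal):
    return lambda text, pos: pos + len(literal) if text.startswith(literal, pos) else None

def char(predicate):
    return lambda text, pos: pos + 1 if pos < len(text) and predicate(text[pos]) else None

def sequence(*parsers):
    def parse(text, pos):
        for parser in parsers:
            pos = parser(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    def parse(text, pos):
        for parser in parsers:      # options are tried in order
            end = parser(text, pos)
            if end is not None:
                return end          # first match wins
        return None
    return parse

def repetition(parser, minimum, maximum=None):
    def parse(text, pos):
        count = 0
        while maximum is None or count < maximum:
            end = parser(text, pos)
            if end is None:
                break
            if end == pos:
                return pos          # empty match: stop rather than loop forever
            pos, count = end, count + 1
        return pos if count >= minimum else None
    return parse

def star(parser):
    return repetition(parser, 0)

def plus(parser):
    return repetition(parser, 1)

def option(parser):
    return repetition(parser, 0, 1)

def and_predicate(parser):
    return lambda text, pos: pos if parser(text, pos) is not None else None

def not_predicate(parser):
    return lambda text, pos: pos if parser(text, pos) is None else None

def end_of_input(text, pos):
    return pos if pos == len(text) else None

digit = char(lambda c: '0' <= c <= '9')
a_first = sequence(choice(string("a"), string("ab")), end_of_input)
ab_first = sequence(choice(string("ab"), string("a")), end_of_input)
```

The last two lines show why option order matters: on the input `ab`, `a_first` fails because the choice commits to `a` and the end-of-input check then sees a leftover `b`, while `ab_first` succeeds.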