Home
This document assumes some familiarity with context-free grammars. For background, consult a textbook or another resource, such as Wikipedia's article on the subject.
Grammars are written similarly to the usual presentation in textbooks. Each rule consists of a nonterminal symbol, an arrow, the symbols the rule produces, and then a period. Alternative productions for a nonterminal can be written using multiple statements or written together using a vertical bar.
A -> a B .
A -> a .
B -> A | b .
A rule can span multiple lines -- the amount of whitespace between symbols, including line breaks, doesn't matter.
A -> b
|
A b .
Symbols are written like programming language identifiers. The first character can be an ASCII letter, an underscore, or a dollar sign. The remaining characters can be any of these, plus digits. For example:
A -> A_1 a .
A_1 -> $b _i18n .
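The identifier rule above can be sketched as a regular expression. This is only an illustration in Python, not Grammophone's actual lexer; the names `SYMBOL` and `is_plain_symbol` are hypothetical, and the pattern assumes "numbers" means the ASCII digits 0-9.

```python
import re

# Assumed pattern for unquoted symbols, based on the description above:
# the first character is an ASCII letter, underscore, or dollar sign;
# later characters may also be ASCII digits.
SYMBOL = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def is_plain_symbol(text: str) -> bool:
    """Return True if text could be written as an unquoted symbol."""
    return SYMBOL.fullmatch(text) is not None
```

Under these assumptions, `A_1`, `$b`, and `_i18n` are accepted, while `1A` is rejected because it starts with a digit.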
Also, any quoted string can be a symbol. This is a valid grammar:
"音楽" -> "♫" "音楽" | .
Any symbol can be used as a nonterminal -- it just needs to be in the nonterminal position of a rule. (In other words, it's not required to capitalize a symbol for it to be a nonterminal.)
The first nonterminal is always the grammar's start symbol.
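As a rough sketch of these two points, suppose a grammar has already been read into a list of (head, body) pairs. That representation, and the names `rules` and `classify`, are assumptions made for this illustration and are not part of Grammophone. A symbol is then a nonterminal exactly when it appears as the head of some rule, and the start symbol is the head of the first rule.

```python
# Hypothetical in-memory form of the grammar
#   A -> a B .   A -> a .   B -> A | b .
# Each production is a (head, body) pair.
rules = [
    ("A", ["a", "B"]),
    ("A", ["a"]),
    ("B", ["A"]),
    ("B", ["b"]),
]


def classify(rules):
    """Return (start symbol, nonterminals, terminals) for a rule list."""
    heads = [head for head, _ in rules]
    # A symbol is a nonterminal if it is the head of at least one rule.
    nonterminals = set(heads)
    # Every other symbol that appears in a body is a terminal.
    body_symbols = {symbol for _, body in rules for symbol in body}
    terminals = body_symbols - nonterminals
    # The head of the first rule is the start symbol.
    start = heads[0]
    return start, nonterminals, terminals


start, nonterminals, terminals = classify(rules)
```

Note that capitalization plays no role here: only the position of a symbol decides whether it is a nonterminal.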
A rule that produces the empty string (sometimes written as A → ε) is just written with no symbols between the arrow and period. Grammophone will show this with the symbol ε.
A -> .
A symbol can't be an empty string. Grammophone won't accept this as a valid grammar:
# Invalid
A -> "" .
Instead of the arrow and period, you can use colon and semicolon, like in Bison. These are the same grammar:
A -> a | b .
A : a | b ;
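To make the equivalence concrete, here is a minimal reader for the notation described so far, written in Python. It is a sketch under stated assumptions, not Grammophone's own parser: the names `TOKEN` and `parse` are hypothetical, quoted symbols are assumed to use double quotes with no escape sequences, and, as a simplification, it does not check that `->` is paired with `.` or `:` with `;`.

```python
import re

# Assumed token shapes (not Grammophone's actual lexer).
TOKEN = re.compile(r'''
    \#[^\n]*                            # comment, runs to end of line
  | "(?P<quoted>[^"]+)"                 # non-empty quoted symbol
  | (?P<ident>[A-Za-z_$][A-Za-z0-9_$]*) # identifier-like symbol
  | (?P<punct>->|[:|.;])                # arrow, colon, bar, terminators
  | \s+                                 # whitespace is not significant
''', re.VERBOSE)


def parse(text):
    """Return the grammar as a list of (head, body) productions."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        pos = match.end()
        if match.lastgroup == "punct":
            tokens.append(("punct", match.group("punct")))
        elif match.lastgroup in ("quoted", "ident"):
            tokens.append(("sym", match.group(match.lastgroup)))
        # Comments and whitespace produce no token.

    rules = []
    i = 0
    while i < len(tokens):
        kind, head = tokens[i]
        arrow = tokens[i + 1] if i + 1 < len(tokens) else None
        if kind != "sym" or arrow not in (("punct", "->"), ("punct", ":")):
            raise ValueError("expected a symbol followed by '->' or ':'")
        i += 2
        body = []
        while True:
            if i >= len(tokens):
                raise ValueError("rule is missing its terminator")
            kind, value = tokens[i]
            i += 1
            if kind == "sym":
                body.append(value)
            elif value == "|":
                rules.append((head, body))  # close this alternative
                body = []
            elif value in (".", ";"):
                rules.append((head, body))  # close the rule
                break
            else:
                raise ValueError(f"unexpected {value!r} inside a rule")
    return rules
```

With this sketch, `parse("A -> a | b .")` and `parse("A : a | b ;")` both return `[("A", ["a"]), ("A", ["b"])]`. An empty alternative becomes an empty body, and an empty quoted string such as `""` is rejected because the quoted-symbol pattern requires at least one character.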
You can add comments using the # character:
# This grammar is LL(1)
A -> b A | .
In Grammophone, the symbol Grammar.END is reserved for internal use.