mdaines edited this page Mar 4, 2023 · 11 revisions

How to Write Grammars

This page assumes some familiarity with context-free grammars. If you need a refresher, refer to a textbook or another resource, such as Wikipedia's article on the subject.

Grammars are written similarly to the usual presentation in textbooks. Each rule consists of a nonterminal symbol, an arrow, the symbols the rule produces, and then a period. Alternative productions for a nonterminal can be written using multiple statements or written together using a vertical bar.

A -> a B .
A -> a .
B -> A | b .
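To illustrate that separate statements and vertical bars are two spellings of the same thing, here is a minimal Python sketch that takes productions as (head, body) pairs and prints them with one statement per nonterminal. The function name `format_grammar` and the pair representation are assumptions made for this example, not part of Grammophone, and the sketch assumes the symbols need no quoting.

```python
def format_grammar(rules):
    """Render (head, body) pairs, merging alternatives for a head with '|'."""
    grouped = {}
    for head, body in rules:
        # Dictionaries keep insertion order, so heads appear in first-seen order.
        grouped.setdefault(head, []).append(" ".join(body))
    lines = []
    for head, alternatives in grouped.items():
        lines.append(f"{head} -> {' | '.join(alternatives)} .")
    return "\n".join(lines)


# The three statements from the example above, one pair per production.
rules = [
    ("A", ["a", "B"]),
    ("A", ["a"]),
    ("B", ["A"]),
    ("B", ["b"]),
]
print(format_grammar(rules))
```

The two productions for A, written above as separate statements, come out as a single statement joined by a vertical bar.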

You can write rules over multiple lines -- spacing doesn't matter.

A -> b
     |
     A b .

Symbols are written like programming language identifiers. A symbol can start with an ASCII letter, an underscore, or a dollar sign, followed by any number of the same characters or digits. For example:

A -> A_1 a .
A_1 -> $b _i18n .
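The identifier rule above can be approximated with a regular expression. This is a sketch of the rule as worded on this page, not Grammophone's actual lexer; the pattern and the name `is_identifier_symbol` are assumptions made for illustration.

```python
import re

# Assumed pattern: an ASCII letter, underscore, or dollar sign, followed by
# any number of the same characters or ASCII digits.
IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def is_identifier_symbol(text):
    """Return True if text has the unquoted symbol form described above."""
    return IDENTIFIER.fullmatch(text) is not None


for candidate in ["A_1", "$b", "_i18n", "1a", "a-b"]:
    print(candidate, is_identifier_symbol(candidate))
```

The first three candidates come from the example grammar and pass the check. The last two fail because they start with a digit or contain a hyphen.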

Any non-empty quoted string can also be a symbol. This is a valid grammar:

"音楽" -> "♫" "音楽" | .

Any symbol can be used as a nonterminal -- it just needs to be in the nonterminal position of a rule. (In other words, it's not required to capitalize a symbol for it to be a nonterminal.)

The first nonterminal is always the grammar's start symbol.
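The two statements above (a symbol is a nonterminal when it appears in the nonterminal position of a rule, and the first nonterminal is the start symbol) can be sketched in Python as follows. The (head, body) representation and the name `classify` are hypothetical and chosen only for this example. Treating every remaining symbol as a terminal is the usual convention, assumed here.

```python
def classify(rules):
    """Split the symbols of a grammar into nonterminals and terminals.

    rules is a list of (head, body) pairs, in the order they were written.
    """
    # A symbol is a nonterminal if it appears as the head of some rule.
    nonterminals = []
    for head, _ in rules:
        if head not in nonterminals:
            nonterminals.append(head)

    # Assumption: every other symbol that appears in a body is a terminal.
    terminals = []
    for _, body in rules:
        for symbol in body:
            if symbol not in nonterminals and symbol not in terminals:
                terminals.append(symbol)

    # The head of the first rule is the start symbol.
    start = nonterminals[0] if nonterminals else None
    return start, nonterminals, terminals


# Lowercase heads work too: capitalization plays no role in the classification.
rules = [
    ("expr", ["expr", "plus", "term"]),
    ("expr", ["term"]),
    ("term", ["id"]),
]
start, nonterminals, terminals = classify(rules)
print(start, nonterminals, terminals)
```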

A rule that produces the empty string (sometimes written as A → ε) is just written with no symbols between the arrow and period. Grammophone will show this with the symbol ε.

A -> .

A symbol can't be an empty string. Grammophone won't accept this as a valid grammar:

# Invalid
A -> "" .

Instead of the arrow and period, you can use a colon and a semicolon, as in Bison. These two lines describe the same grammar:

A -> a | b .
A : a | b ;

You can add comments using the # character:

# This grammar is LL(1)

A -> b A | .
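Putting the pieces of this page together, the notation can be read by a small hand-written parser. The Python sketch below is an illustration under stated assumptions, not Grammophone's implementation: it assumes quoted symbols use double quotes with no escape sequences, that a quoted symbol and an identifier with the same text are the same symbol, and that the two arrow and terminator styles can be mixed freely. The names `tokenize` and `parse` are hypothetical.

```python
import re

TOKEN = re.compile(r'''
    \s+                                     # whitespace (ignored)
  | \#[^\n]*                                # comment to end of line (ignored)
  | (?P<arrow>->|:)                         # arrow, or colon as in Bison
  | (?P<end>[.;])                           # period, or semicolon as in Bison
  | (?P<bar>\|)                             # separates alternatives
  | "(?P<quoted>[^"]*)"                     # quoted symbol (assumed: no escapes)
  | (?P<ident>[A-Za-z_$][A-Za-z0-9_$]*)     # identifier-like symbol
''', re.VERBOSE)


def tokenize(text):
    """Turn grammar text into a list of (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind is None:
            continue  # whitespace or a comment
        if kind == "quoted" and match.group("quoted") == "":
            raise ValueError("a symbol can't be an empty string")
        if kind in ("quoted", "ident"):
            tokens.append(("symbol", match.group(kind)))
        else:
            tokens.append((kind, match.group(kind)))
    return tokens


def parse(text):
    """Return the grammar as a list of (head, body) pairs, one per production."""
    tokens = tokenize(text)
    rules = []
    i = 0
    while i < len(tokens):
        kind, head = tokens[i]
        if kind != "symbol" or i + 1 >= len(tokens) or tokens[i + 1][0] != "arrow":
            raise ValueError("expected a nonterminal followed by an arrow or colon")
        i += 2
        body = []
        while True:
            if i >= len(tokens):
                raise ValueError("rule is missing its final period or semicolon")
            kind, value = tokens[i]
            i += 1
            if kind == "symbol":
                body.append(value)
            elif kind == "bar":
                rules.append((head, body))  # one alternative is complete
                body = []
            elif kind == "end":
                rules.append((head, body))  # an empty body is an empty production
                break
            else:
                raise ValueError(f"unexpected {value!r} inside a rule")
    return rules


arrow_style = parse("A -> a | b .")
bison_style = parse("A : a | b ;")
print(arrow_style == bison_style)
print(parse("# This grammar is LL(1)\n\nA -> b A | ."))
```

An empty production is represented as an empty body list, and the comment line is dropped during tokenizing, so the last call returns only the two productions for A.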

In Grammophone, the symbol Grammar.END is reserved for internal use.
