Showing how a grammar driven parser can be implemented

FransFaase/RawParser

RawParser

This repository contains a single C-file, which, by example, shows how a grammar driven, scannerless parser can be implemented.

Grammar driven means that this is not a compiler that reads a grammar and generates either code that implements the parser or a set of tables that drive a parsing algorithm. An example of the latter is yacc/bison. Instead, the parsing algorithm operates directly on the grammar specification, which makes it possible to implement a rich grammar with optional, sequential, and grouped grammar elements. This specification is represented by structs that point to each other. The construction of the grammar is aided by some clever defines to increase readability.
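To make the idea of a parser that operates directly on grammar structs more concrete, here is a minimal sketch. It is not the code from this repository: the `element_t` struct, its fields, and the functions `parse_seq` and `matches` are hypothetical names chosen for illustration, and the sketch matches optional elements greedily, without the backtracking a full parser would need. It encodes the grammar `'a' ('b' 'c')? 'd'` as structs that point to each other and interprets them directly.

```c
#include <stddef.h>

/* Hypothetical grammar element (not the struct used in RawParser). */
typedef struct element element_t;
struct element {
    char ch;                 /* terminal character, used when group is NULL */
    const element_t *group;  /* non-NULL: a nested sequence (grouping)      */
    int optional;            /* non-zero: element may be skipped            */
    const element_t *next;   /* next element in the sequence, NULL at end   */
};

/* Match the sequence starting at elem against text at *pos.
   Returns 1 and advances *pos on success; returns 0 and leaves
   *pos unchanged on failure. */
static int parse_seq(const element_t *elem, const char *text, size_t *pos)
{
    size_t p = *pos;
    for (; elem != NULL; elem = elem->next) {
        size_t start = p;
        int ok;
        if (elem->group != NULL) {
            ok = parse_seq(elem->group, text, &p);
        } else {
            ok = elem->ch != '\0' && text[p] == elem->ch;
            if (ok)
                p++;
        }
        if (!ok) {
            if (!elem->optional)
                return 0;
            p = start; /* skip the optional element */
        }
    }
    *pos = p;
    return 1;
}

/* The grammar 'a' ('b' 'c')? 'd', built from structs pointing to each other. */
static const element_t el_c   = { 'c', NULL,  0, NULL };
static const element_t el_b   = { 'b', NULL,  0, &el_c };
static const element_t el_d   = { 'd', NULL,  0, NULL };
static const element_t el_opt = { 0,   &el_b, 1, &el_d };
static const element_t el_a   = { 'a', NULL,  0, &el_opt };

/* Returns 1 if the whole text matches the grammar, 0 otherwise. */
int matches(const char *text)
{
    size_t pos = 0;
    return parse_seq(&el_a, text, &pos) && text[pos] == '\0';
}
```

With this sketch, `matches("abcd")` and `matches("ad")` succeed, while `matches("abd")` fails. No code or tables are generated: changing the static structs changes the language that is accepted.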

Scannerless means that the scanner specification is an integral part of the grammar specification.
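As an illustration of what scannerless can look like in practice, the sketch below describes an identifier at the character level with character sets, instead of receiving an identifier token from a separate scanner. The names `char_set_t`, `char_set_add_range`, `char_set_contains`, and `match_ident` are assumptions made for this example and are not taken from the repository.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical character set: one bit per possible byte value. */
typedef struct {
    unsigned char bits[32];
} char_set_t;

/* Add all characters in the inclusive range first..last to the set. */
static void char_set_add_range(char_set_t *set, unsigned char first, unsigned char last)
{
    for (unsigned int ch = first; ch <= last; ch++)
        set->bits[ch / 8] |= (unsigned char)(1u << (ch % 8));
}

/* Returns 1 if ch is in the set, 0 otherwise. */
static int char_set_contains(const char_set_t *set, unsigned char ch)
{
    return (set->bits[ch / 8] >> (ch % 8)) & 1;
}

/* Character-level rule: [a-zA-Z_][a-zA-Z0-9_]*
   Returns the number of characters matched at the start of text,
   or 0 if text does not start with an identifier. */
size_t match_ident(const char *text)
{
    char_set_t first, rest;
    size_t n;

    memset(&first, 0, sizeof first);
    char_set_add_range(&first, 'a', 'z');
    char_set_add_range(&first, 'A', 'Z');
    char_set_add_range(&first, '_', '_');

    rest = first;
    char_set_add_range(&rest, '0', '9');

    if (!char_set_contains(&first, (unsigned char)text[0]))
        return 0;
    n = 1;
    while (char_set_contains(&rest, (unsigned char)text[n]))
        n++;
    return n;
}
```

In a scannerless grammar, a rule like this sits next to the rules for statements and expressions, so there is only one specification to maintain.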

It is based on ideas I got from developing IParse. The first idea was to return to C. (The first versions of IParse were written in C. See: http://www.iwriteiam.nl/MM.html) The second idea was to make it scannerless, whereas IParse uses hand-coded scanners. IParse does contain some low-level scanners and introduced the concept of character sets. A problem is that IParse has a built-in mechanism for creating abstract syntax trees (called abstract parse trees in the software) that does not allow characters to be combined into basic values (as the hard-coded scanners can do). The idea is to abstract from this in the parser by using void pointers to the data being constructed during parsing, and by using pointers to functions in the grammar that perform the construction at various points of the parsing process.
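The combination of void pointers and function pointers can be sketched as follows. This is a simplified illustration, not the interface used in this repository: `add_char_fn`, `char_rule_t`, `parse_chars`, `add_digit`, and `parse_number` are hypothetical names. The parsing function never looks at the type of the data; only the callback stored in the grammar rule knows that the characters are being combined into an `int`.

```c
#include <stddef.h>

/* Hypothetical callback type: folds one matched character into the
   value under construction, which the parser only sees as void*. */
typedef void (*add_char_fn)(void *data, char ch);

/* Hypothetical grammar rule: one or more characters from a range,
   with a function that builds the result. */
typedef struct {
    char first, last;      /* inclusive range of accepted characters */
    add_char_fn add_char;  /* called for every matched character     */
} char_rule_t;

/* Generic parsing step: knows nothing about the type behind data.
   Returns the number of characters consumed. */
size_t parse_chars(const char_rule_t *rule, const char *text, void *data)
{
    size_t n = 0;
    while (text[n] >= rule->first && text[n] <= rule->last) {
        rule->add_char(data, text[n]);
        n++;
    }
    return n;
}

/* Construction function for decimal numbers: data points to an int. */
static void add_digit(void *data, char ch)
{
    int *value = data;
    *value = *value * 10 + (ch - '0');
}

/* Parses a decimal number at the start of text into *value.
   Returns 1 if at least one digit was found, 0 otherwise. */
int parse_number(const char *text, int *value)
{
    static const char_rule_t digits = { '0', '9', add_digit };
    *value = 0;
    return parse_chars(&digits, text, value) > 0;
}
```

Calling `parse_number("123abc", &v)` leaves 123 in `v`. Swapping `add_digit` for another function would build a different kind of value from the same rule without touching `parse_chars`.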

I am also structuring the code and adding a narrative in comments in order to make it more accessible and to explain the various aspects of parsing a complex data structure from a textual representation. Examples of usage are given throughout the code, which aims at implementing a complete parser for a C-like language. Attempting this seems to have improved the quality of the code as well. As with every attempt to write software, there are still many ad hoc decisions that are debatable.

Documentation

I have also started to document the code in a literate programming style with fragments of code that are extended in steps as described in Literate programming with Markdown. See:

Processing the documentation

The documentation can be processed with MarkDownC into C programs. (Follow the link for instructions on building MarkDownC.) In the commands below, prefix the call to MarkDownC with the path where it can be found (if needed). Replace gcc with your preferred C compiler.

The test documentation pages contain instructions for the command line arguments for the calls to MarkDownC.
