[Feature Request] Output rules used during parsing #72

Open
d4l3k opened this issue Jul 28, 2018 · 6 comments

Comments

@d4l3k
Contributor

d4l3k commented Jul 28, 2018

Sometimes with really large grammars (such as the wikitext/parsoid peg definitions) it's really hard to debug which rules are actually being used to output the final result. It'd be really nice if pigeon could return a tree of which rules are being used to generate the final result.

@d4l3k
Contributor Author

d4l3k commented Jul 28, 2018

I wrote some code to modify the grammar and wrap all "run" functions with context information. Might be able to adapt something like this into a more general form.

https://github.com/d4l3k/wikigopher/blob/master/wikitext/debug.go

@breml
Collaborator

breml commented Aug 6, 2018

@d4l3k I do not yet understand what particular functionality you are looking for. How is the tree you are requesting different from (or related to) the output you get when the Debug option is set for the parser?
Is your request only about the actions where Go code is executed? In your debug.go you only modify actionExpr, but you seem to ignore the other code expressions (like AndCodeExpr, NotCodeExpr, StateCodeExpr).
Are you looking for a way to overwrite (or provide your own version of) the callFuncTemplates (see https://github.com/mna/pigeon/blob/master/builder/builder.go#L35)?

@mna
Owner

mna commented Aug 6, 2018

@breml I don't want to be speaking for the OP, but the way I understand it, this is to get an output of just the matched rules (the final tree of PEG rules used to generate the results), whereas the Debug option is very verbose and prints every rule attempted and backtracked (IIRC).
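The difference between "every rule attempted" and "only the rules that matched" can be sketched with a small recorder. Everything here is hypothetical (`node`, `recorder`, `enter`, `exit`); it is not part of pigeon, and it assumes the parser can call a hook when a rule attempt starts and ends.

```go
package main

import "fmt"

// node is one matched rule with the input offsets it covered.
type node struct {
	rule       string
	start, end int
	children   []*node
}

// recorder keeps a stack of the rules currently being attempted.
type recorder struct {
	stack []*node
}

func newRecorder() *recorder {
	return &recorder{stack: []*node{{rule: "root"}}}
}

// enter is called when a rule attempt starts at offset pos.
func (r *recorder) enter(rule string, pos int) {
	r.stack = append(r.stack, &node{rule: rule, start: pos})
}

// exit is called when the attempt ends at offset pos. A matched node is
// attached to its parent; a failed one is dropped together with its subtree.
func (r *recorder) exit(pos int, matched bool) {
	n := r.stack[len(r.stack)-1]
	r.stack = r.stack[:len(r.stack)-1]
	if !matched {
		return
	}
	n.end = pos
	parent := r.stack[len(r.stack)-1]
	parent.children = append(parent.children, n)
}

func main() {
	r := newRecorder()
	r.enter("Document", 0)
	r.enter("Heading", 0)
	r.exit(0, false) // attempted, then backtracked
	r.enter("Paragraph", 0)
	r.exit(5, true)
	r.exit(5, true)
	doc := r.stack[0].children[0]
	fmt.Println(doc.rule, len(doc.children), doc.children[0].rule)
}
```

This sketch only discards whole failed rule attempts. A real implementation would probably also need to roll back children recorded by a failed alternative inside a rule that later succeeds through a different alternative.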

This does sound interesting and useful to me, especially if we could print out the matched input along with the rules (start-end index, and maybe skipping some middle runes when too big). It would still be quite verbose for big grammars, but I can see how it could help debugging a PEG that matches but doesn't give the expected result. I don't believe it could help with a PEG that doesn't match, since there are no matching rules to print out; the Debug option would be better in this case.
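One possible shape for such a line, with the middle of long matches elided, is sketched below. The helper names `preview` and `formatMatch`, the `[start-end]` layout, and the use of byte offsets are assumptions made for illustration.

```go
package main

import "fmt"

// preview shortens s to at most max runes by keeping the head and the tail
// and replacing the middle with "...". Values of max below 4 are raised to 4
// so that at least one rune of the input survives next to the ellipsis.
func preview(s string, max int) string {
	if max < 4 {
		max = 4
	}
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	keep := max - 3 // runes left after the three dots
	head := (keep + 1) / 2
	tail := keep - head
	return string(r[:head]) + "..." + string(r[len(r)-tail:])
}

// formatMatch renders one matched rule with its byte offsets into input and
// a quoted, possibly elided, copy of the matched text.
func formatMatch(rule string, start, end int, input string, max int) string {
	return fmt.Sprintf("%s [%d-%d] %q", rule, start, end, preview(input[start:end], max))
}

func main() {
	input := "== A very long heading that keeps going =="
	fmt.Println(formatMatch("Heading", 0, len(input), input, 16))
}
```

Eliding by runes rather than bytes avoids cutting a multi-byte character in half, even though the offsets themselves are byte positions.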

Might be interesting to look at what other PEG generators do to assist in debugging though? This could be a "solved problem" more or less and I wouldn't know.

Martin

@d4l3k
Contributor Author

d4l3k commented Aug 7, 2018

Yep, I was referring to printing the matches. There's another Go PEG library which has a PrintSyntaxTree method which just prints each AST node's name and start/end.

https://github.com/pointlander/peg/blob/master/peg.peg.go#L292-L298

@breml
Collaborator

breml commented Aug 8, 2018

OK, I understand.

What would be needed for this?

  • Introduce a new option for the parser to instruct it to output the matched rules / syntax tree, and maybe allow some configuration, e.g. what should be printed (rule name, start-end index, matched input / runes, name of the called Go function, etc.).
  • Is output to stdout enough, or do we need to provide other options, like accepting a Writer interface so that the user of the parser can redirect the output to stdout/file/buffer/etc.?
  • Define an output format for this: should it be primarily human-readable, or are there other use cases where it would be interesting to process this output with tools (grep/awk), or even as JSON?
  • Should the output format be configurable (template?)
  • Is it only about the action expressions of the matched rules or is it also about the other expressions?

From the example @d4l3k provided I understand that the prettyprint function in the end prints a (colored) line per "node" (matched rule) in the grammar (see https://github.com/pointlander/peg/blob/master/peg.peg.go#L234).

In https://github.com/d4l3k/wikigopher/blob/master/wikitext/debug.go only the name of the called Go function is printed.

I feel like we first have to outline what exactly should be implemented. Then we can work towards a PR. @d4l3k would you be willing to work on such a PR?

@d4l3k
Contributor Author

d4l3k commented Aug 8, 2018 via email
