Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report statement errors #3559

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

phr34k
Copy link

@phr34k phr34k commented Mar 27, 2024

Initial changes to improve error reporting in combination with preprocessor directives. The intent of this changeset is the modify glslang in a way it reports location where the statement origionally begin, so that any syntax errors can report that location primarilly and/or report the type of statement.

All unit tests passed locally

This changeset modifies the glslang in the following ways:

  • modified the standalone glslang to include --report-statements feature toggle to enable functionallity optionally
  • modified the function signature of parseHelper::parserError to add in the location
  • modified the grammar to use bison custom syntax error reporter
  • modified the parserContext to keep track of some extra meta data w.r.t. statement and scope boundaries
  • modified the grammar to track the statement/scope boundaries either trough bison "subroutines" or mid-action rules

Example

#line 200 
layout( .... ) fragColor1
#line 500 
layout( .... ) fragColor2:

See #3544

@CLAassistant
Copy link

CLAassistant commented Mar 27, 2024

CLA assistant check
All committers have signed the CLA.

@arcady-lunarg
Copy link
Contributor

This looks interesting, but it's going to take a little while for me to give this a proper review.

@phr34k
Copy link
Author

phr34k commented Mar 29, 2024

Sounds good, surely there are some corner cases that can have been easy to miss, but I hope it's a small step in the right direction!

@phr34k
Copy link
Author

phr34k commented Aug 14, 2024

@arcady-lunarg did you ever get a chance to review this?

@arcady-lunarg
Copy link
Contributor

I am not sure I like the idea of mark_start/mark_end grammar rules. Maybe we need to update glslang's location type to be a span instead, i.e. having bot a start line/column and an end line/columns? I need to read bison's documentation a bit more as well, and I think the thing I would find the most helpful here is a before/after example of what effect this change has on some real use case.

@phr34k
Copy link
Author

phr34k commented Aug 21, 2024

I am not sure I like the idea of mark_start/mark_end grammar rules. Maybe we need to update glslang's location type to be a span instead, i.e. having bot a start line/column and an end line/columns? I need to read bison's documentation a bit more as well, and I think the thing I would find the most helpful here is a before/after example of what effect this change has on some real use case.

Sure, a second pair of eyes would be great. I mean, people working with more expierence in bison have a cleaner solution that I just didn't know about, so definitely open to ideas.

Here's what I can tell you, what I remember on top of my head:

I did have a play around with %define api.location.type {LocationType} to define a custom location type, and bison's own built-in span location type. I believe one of the problems I had with that was that bison parser stack can reduce at abritary points and it could have already reduced half of the 'entire' statement so whatever was still on parser stack or from the last group never matched the correct locations.

With regards to mark_start/mark_end the biggest problem I ran into were bison imposed restrictions w.r.t. mid rule actions. Most of the grammer have parts where multiple rules are used in small variations i.e. RETURN SEMICOLON | RETURN expression SEMICOLON. One of the problems I had is that bison encounters such an action, according various resources bison commit to that one rule and stops evaluating other rules that could be a better fit.

So if you have RETURN { ... } SEMICOLON | RETURN { ... } expression SEMICOLON despite being identical mid rule actions it commits to the first rule, by seeing a return symbol, and will not evaluate the second rule anymore - throwing errors it's expecting semicolon when expr is used. One of the recommended workarounds around this problem, by placing the mid-rule action in it's own symbol, that's how I ended up making mark_start/mark_end.

The first approach to me felt like I was running into a dead end given that I couldn't have any control over the parser stack so I figured the next best approach was marking statement boundaries manually, and use the look-a-head token to mark the location.


For an real-world use case example, let's agree that shader complexity can outgrow where they are kept in a single file (see for example amd's fiddelity fx cacao). Basically there might be multiple shaders, often they do have shared function definitions that are reused.

In one of the shaders you could have a definition like this, but just forgetting an semi-colon or a curly brace:

attributes.glsl.inc:

#IFDEF CONDITION_1
layout( .... ) fragColor1
#ENDIF

#IFDEF CONDITION_2
void pbr() {
#ENDIF

material-a.glsl:

#define CONDITION_1
#include "attributes.glsl.inc" 
layout( .... ) fragColor2:
void main() {  ....  }

material-b.glsl:

#define CONDITION_2
#include "attributes.glsl.inc" 
layout( .... ) fragColor2:
void main() {  ....  }

So now you have these two materials a and b, when an grammar violation occurs bison reports you an message along the lines of epected symbol semicolon but received layout and then references the location of layout. However to understand where the problem really comes from, you have to understand how functions.glsl.inc got evaluated, and that's where the problem is.

Sometimes you don't really know, by looking at the shader or there might be so many includes it's obfuscated from the reader, and so even if you know the source is functions.glsl.inc the question is 'where in functions.glsl.inc'. In the change set that I prepared, I marked the statement boundaries, so instead of reporting the error on layout it reports the last know location of where the statement begin, so it'll tell you 'attributes.glsl.inc line 1' or 'attributes.glsl.inc line 5' which drastically helps you narrow down your search.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants