-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compilation #1
Compilation #1
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many tests seem to be failing, do we want to adjust them?
I did not finish reading the PR yet, but I thought I would leave some intermediate comments
which tests are you talking about? If I run |
7a13bbd
to
3de6167
Compare
We require two vm instructions 'CheckLookaround' and 'WriteLookaround' to be able to track the state of lookaround expressions at the current position in the haystack. Both instructions access a new 'lookaround' vector of booleans, which contains one entry per lookaround expression in the regex.
These changes implement the compilation of lookaround assertions from HIR to NFA. Subexpressions of lookaround assertions are patched to a top level reverse union. This is necessary so that the NFA will explore the innermost subexpression first and thereby make sure that all subexpression results are available when they need to be checked. I.e. any `WriteLookaround` state must be visited before any `CheckLookaround` state with the same index.
The machinery necessary to perform the parallel lookbehind checking should only be compiled in when there is actually a lookbehind expression in the regex. This restores compilation to the expected outputs for regexes without lookbehind expressions.
This makes it so we don't need to reset the lookaround state on each character advancement.
5b80341
to
2413c2e
Compare
/// look-around sub-expression with the given index evaluates to `positive` | ||
/// at the current position in the haystack. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is outdated now that we use indices
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, we are using indices to track the state of look-arounds but the semantics of the CheckLookAround
instruction is still to provide a conditional epsilon transition on the predicate as described. That is, the transition will be taken if the look-around sub-expression evaluates to positive
at the current position, but the way this is checked is no longer via direct boolean comparison but rather by first evaluating last_true_pos == current_pos
and only then compare the result to positive
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was referring to the point about how this check is now done. But indeed the docs do not have to be this precise about it here
Rename certain enums to be consistent with rest of codebase.
No description provided.