Request: sepBy variant that can backtrack at the final separator (only) #15
Comments
Have you seen my answer to the StackOverflow question? Do you still think that FParsec should have a specialized combinator for that purpose?
Hadn't seen your SO answer yet; just upvoted it. I have an appointment I have to run to right now, so I don't have time to give your question the consideration it deserves. I'll give you an answer once I've had a moment to think about it.
Having thought about it, the backtracking would probably be rather expensive, and the number of use cases is rather small. (Most languages don't allow you to follow a comma-separated list with a comma-plus-something-else, for example.) So I'm inclined to say "No", or at least "Not yet", to this proposal, unless there are some real use cases that I haven't thought of. However, I'm also inclined to leave this issue open for a little while so that anyone who does need this feature can explain their use case. For example, I'll leave a comment on that StackOverflow question to ask the OP to come here and explain his real use case. If it's possible for him to rearrange his parsing grammar a little so that the separator
For what it's worth, I believe I have encountered this issue in practice while solving day 7 of the 2022 "Advent of Code". Background to the problem being solved: Example code: Using nested instances of sepBy that share a delimiter (newline in this case) is problematic, since the inner sepBy won't backtrack by default. (In this example, the outer sepBy is for a list of commands with output, and the inner sepBy is for individual lines of command output.)
This StackOverflow question is what led to this feature request. I don't know the actual use case, but the example use case in that question is a parser setup similar to the following:
This looks like it would successfully parse the string "1,2,3,NNW", but it will fail. The sepBy function will consume the comma after 3, see that the next character is not a valid digit, and fail because sepBy does not allow a final separator not followed by a valid list item.

In the case of this example, it would be possible to parse that string by using sepEndBy and changing the direction parser to a simple manyChars dirChars. But that would then pass when given the input "1,2,3NNW" (note no comma before the direction), and in some parsing scenarios it's likely that the comma between the numbers and the direction would be required, so the "1,2,3NNW" input should actually fail.

Having looked through the sepBy code pretty thoroughly, I think the best way to implement this request (allow sepBy to backtrack to the state just before the last separator if the last separator isn't followed by a valid list item) is to create a new sepBy variant. The existing sepBy implementation would require very little tweaking to implement this (instead of just checking the parser state before consuming a separator, actually save a backtrack point and then restore it if the separator isn't followed by a valid item), but the performance cost could be large (I don't know how much it costs to save a backtracking point that you're most likely going to throw away).

So to implement this feature request might actually require forking the existing Inline.SepBy implementation, which might be more complex than it's worth. But it's worth looking into, at least.
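
For illustration only, here is a minimal sketch of how the requested behavior might be approximated with combinators FParsec already provides, without touching Inline.SepBy. The name sepByBacktrack is hypothetical, and dirChars, direction, and line are assumed stand-ins for the parsers in the example, whose original definitions are not shown here. The sketch relies on the >>? operator, which backtracks to before its left-hand parser when the right-hand parser fails without consuming input.

```fsharp
// Requires the FParsec NuGet package (in a script: #r "nuget: FParsec").
open FParsec

// Hypothetical helper, not part of FParsec.
// Parses zero or more `p` separated by `sep`. Because `sep >>? p` backtracks
// to before `sep` when `p` fails without consuming input, a final separator
// that is not followed by a valid item is left for the next parser.
let sepByBacktrack (p: Parser<'a, 'u>) (sep: Parser<'b, 'u>) : Parser<'a list, 'u> =
    pipe2 p (many (sep >>? p)) (fun head tail -> head :: tail) <|>% []

// Assumed stand-ins for the parsers in the example above.
let dirChars: Parser<char, unit> = anyOf "NSEW"
let direction: Parser<string, unit> = many1Chars dirChars
let comma: Parser<char, unit> = pchar ','

// A list of integers, then a required comma, then a direction.
let numbers = sepByBacktrack pint32 comma
let line = numbers .>> comma .>>. direction .>> eof

// Small helper that reports whether `line` accepts the whole input.
let succeeds (input: string) =
    match run line input with
    | Success _ -> true
    | Failure _ -> false
```

Under these assumptions, succeeds "1,2,3,NNW" should return true and succeeds "1,2,3NNW" should return false, since the comma before the direction stays mandatory. The sketch says nothing about the performance concern raised above: each separator is still parsed under a saved state, which is the cost the issue is worried about.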