-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating and moving the grammar to a separate file #2
base: master
Are you sure you want to change the base?
Changes from all commits
d291d11
5bf09b7
987a660
d5d3cac
da2b1b0
8698eb1
8b4be8a
7a82416
d75585e
aaed9c2
c4e7b70
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
expr = ws ( terminal / non-terminal ) ws | ||
root-expr = non-test / not / comparison | ||
non-test = subexpr / index / flatten / wildcard-index / filter / | ||
identifier / current-node / raw-string / literal / | ||
root-multi-list / multi-hash / function / group / slice | ||
group = begin-group expr end-group | ||
|
||
; Insignificant whitespace is allowed around expr nodes and around the | ||
; following structural tokens. | ||
begin-array = ws %x5B ws ; [ left square bracket | ||
begin-object = ws %x7B ws ; { left curly bracket | ||
end-array = ws %x5D ws ; ] right square bracket | ||
end-object = ws %x7D ws ; } right curly bracket | ||
begin-group = ws %x28 ws ; ( left parenthesis | ||
end-group = ws %x29 ws ; ) right parenthesis | ||
name-separator = ws %x3A ws ; : colon | ||
value-separator = ws %x2C ws ; , comma | ||
decimal-point = ws %x2E ws ; . | ||
pipe-separator = ws "|" ws | ||
or-separator = ws "||" ws | ||
and-separator = ws "&&" ws | ||
lt = ws "<" ws | ||
lte = ws "<=" ws | ||
gt = ws ">" ws | ||
gte = ws ">=" ws | ||
eq = ws "==" ws | ||
ne = ws "!=" ws | ||
expref-token = "&" ws | ||
not-token = "!" ws | ||
begin-filter = "[?" ws | ||
flatten = ws "[]" ws | ||
|
||
; Insignificant whitespace | ||
ws = *(%x20 / ; Space | ||
%x09 / ; Horizontal tab | ||
%x0A / ; Line feed or New line | ||
%x0D) ; Carriage return | ||
|
||
; "&&" binds more tightly than "||" | ||
; "||" binds more tightly than "|". | ||
terminal = pipe / or / and | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just curious, why is this called terminal? I'm thinking of terminals as the leaf nodes in the parsed tree, i.e it can't be replaced with anything. But everything on the RHS can be broken down into more nodes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe it isn't a good name. I named it this because it owns the entire side. Do you have a suggestion? |
||
non-terminal = root-expr / wildcard-values | ||
pipe = expr pipe-separator ( non-terminal / or / and ) | ||
or = ( non-terminal / and ) or-separator ( non-terminal / or / and ) | ||
and = non-terminal and-separator ( non-terminal / and ) | ||
|
||
; subexpr handles most that descend into nodes. | ||
subexpr = object-subexpr / array-subexpr | ||
object-subexpr = object-subexpr-lhs decimal-point object-subexpr-rhs | ||
object-subexpr-lhs = subexpr / index / flatten / wildcard-index / | ||
filter / identifier / current-node / literal / | ||
multi-hash / function / group / slice / wildcard-values | ||
object-subexpr-rhs = identifier / multi-list / multi-hash / function / | ||
wildcard-values | ||
array-subexpr = array-subexpr-lhs array-subexpr-rhs | ||
array-subexpr-lhs = subexpr / index / flatten / wildcard-index / filter / | ||
identifier / current-node / literal / root-multi-list / | ||
function / group / slice / wildcard-values / literal | ||
array-subexpr-rhs = index / slice / wildcard-index / flatten / filter | ||
|
||
; Array related rules | ||
index = begin-array number end-array | ||
slice = begin-array [number] name-separator | ||
[number] [name-separator [number] ] | ||
end-array | ||
wildcard-index = begin-array "*" end-array | ||
filter = begin-filter filter-condition end-array | ||
filter-condition = not / non-test / terminal / comparison | ||
|
||
not = not-token ( non-test / not ) | ||
comparison = non-terminal comparator ( non-test / not / wildcard-values ) | ||
comparator = lt / lte / gt / gte / eq / ne | ||
|
||
root-multi-list = begin-array | ||
( root-expr / multiple-values / terminal ) | ||
end-array | ||
multi-list = begin-array ( expr / multiple-values ) end-array | ||
multiple-values = expr 1*( value-separator expr ) | ||
|
||
multi-hash = begin-object ( keyval *( value-separator keyval ) ) end-object | ||
keyval = identifier name-separator expr | ||
|
||
function = unquoted-string arg-list | ||
arg-list = begin-group [ arg *( value-separator arg ) ] end-group | ||
arg = expr / expref | ||
|
||
expref = expref-token expr | ||
wildcard-values = "*" | ||
current-node = "@" | ||
|
||
number = ["-"] int | ||
|
||
; Strings, identifiers, and characters. | ||
raw-string = "'" *raw-string-char "'" | ||
raw-string-char = (%x20-26 / %x28-5B / %x5D-10FFFF) / raw-string-escape | ||
raw-string-escape = escape ["'"] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just a heads up, I synced with this latest version that includes raw string literals and abnfstress caught this. The |
||
|
||
identifier = unquoted-string / quoted-string | ||
unquoted-string = ( ALPHA / "_" ) *( DIGIT / ALPHA / "_" ) | ||
quoted-string = DQUOTE *char DQUOTE | ||
char = unescaped-char / escaped-char | ||
unescaped-char = %x20-21 / %x23-5B / %x5D-10FFFF | ||
escape = %x5C ; "\" | ||
escaped-char = escape ( | ||
%x22 / ; " quotation mark U+0022 | ||
%x5C / ; \ reverse solidus U+005C | ||
%x2F / ; / solidus U+002F | ||
%x62 / ; b backspace U+0008 | ||
%x66 / ; f form feed U+000C | ||
%x6E / ; n line feed U+000A | ||
%x72 / ; r carriage return U+000D | ||
%x74 / ; t tab U+0009 | ||
%x75 4HEXDIG ) ; uXXXX U+XXXX | ||
|
||
; Literal rules (e.g., "`[]`") | ||
literal = "`" json-value "`" | ||
literal-char = unescaped-literal / escaped-literal | ||
unescaped-literal = %x20-5f / %x61-10FFFF ; Any character except "`" | ||
escaped-literal = escaped-char / ( escape "`" ) | ||
|
||
; JSON related grammar (for literal and string values) | ||
json-value = false / null / true / object / array | ||
json-value =/ json-number / json-string | ||
json-string = DQUOTE *literal-char DQUOTE | ||
false = %x66.61.6c.73.65 ; false | ||
null = %x6e.75.6c.6c ; null | ||
true = %x74.72.75.65 ; true | ||
object = begin-object [ member *( value-separator member ) ] end-object | ||
member = json-string name-separator json-value | ||
array = begin-array [ json-value *( value-separator json-value ) ] end-array | ||
json-number = ["-"] int [frac] [exp] | ||
digit1-9 = %x31-39 ; 1-9 | ||
int = "0" / ( digit1-9 *DIGIT ) | ||
exp = "e" [ "-" / "+" ] 1*DIGIT | ||
frac = decimal-point 1*DIGIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, what's a
non-test
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to break the grammar into smaller reusable parts, but has the side effect of an awkward node. These nodes can be used in the filter-condition, not and comparator rules without ambiguity. Maybe there's a better name or perhaps duplication is better.