This came to my mind when preparing the presentation for Amsterdam.JS:
Currently there is a special type="dot" for things like /./. Per se there is nothing wrong with this, but the type feels very similar to type="characterClassEscape". How do you feel about merging the types characterClassEscape and dot? Maybe into specialCharacterClass?
Or, an alternative idea: similar to how different types got merged into type=value, merge dot, characterClassEscape and the existing characterClass into characterClass and add a new kind entry? I like this, as it gets rid not only of the type dot but also of the type characterClassEscape, which sounds similar to characterClass but is actually quite different. Like:
    {
      type: "characterClass",
      kind: "range",
      body: [{type: "characterClassRange", ...}]
    }

    {
      type: "characterClass",
      kind: "singleChar",
      char: "d"
      // The body is not needed here
      // body: [ ]
    }
This looks interesting to me, but I dislike the inconsistency of using body in one case and char in the other to encode the "meaning" of the characterClass. In the case of value, all the different kinds have a codePoint entry. A possible way to achieve similar consistency here would be to store the ranges that are actually matched on the body of the type: "characterClass" node when it has kind: "singleChar". E.g. in the case of /\d/:
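A minimal sketch of what such a node might look like for /\d/, assuming each characterClassRange carries min and max entries that hold value nodes with a codePoint. Those field names, the digitClass constant and the matchesCharacterClass helper are assumptions for illustration, not part of the parser's current output:

```javascript
// Hypothetical AST node for /\d/ under the proposal above.
// The `min`/`max` field names are assumptions, not confirmed parser output.
const digitClass = {
  type: "characterClass",
  kind: "singleChar",
  char: "d", // kept here so the original escape can still be recovered (assumption)
  body: [
    {
      type: "characterClassRange",
      min: { type: "value", codePoint: 0x30 }, // "0"
      max: { type: "value", codePoint: 0x39 }  // "9"
    }
  ]
};

// Hypothetical helper: with the ranges stored on `body`, a consumer can test
// any characterClass node the same way, regardless of its `kind`.
function matchesCharacterClass(node, codePoint) {
  return node.body.some(
    (range) =>
      codePoint >= range.min.codePoint && codePoint <= range.max.codePoint
  );
}
```

With this shape, matchesCharacterClass(digitClass, "5".codePointAt(0)) is true and matchesCharacterClass(digitClass, "a".codePointAt(0)) is false, without the consumer needing to know what \d means.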
Looks nice, but encoding /\s/ this way will result in a very large body :/ Here are the two functions used in RegExp.JS to test whether a character matches /\s/:
    function isWhiteSpace(ch) {
        return (ch === 32) ||  // space
            (ch === 9) ||      // tab
            (ch === 0xB) ||
            (ch === 0xC) ||
            (ch === 0xA0) ||
            (ch >= 0x1680 && '\u1680\u180E\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A\u202F\u205F\u3000\uFEFF'.indexOf(String.fromCharCode(ch)) > 0);
    }

    // 7.3 Line Terminators

    function isLineTerminator(ch) {
        return (ch === 10) || (ch === 13) || (ch === 0x2028) || (ch === 0x2029);
    }
Personally, I am not sure if the consistency is worth the larger AST output here.
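To get a feeling for the size, here is a rough sketch that collapses the code points named in the two functions above into contiguous ranges. The whitespaceCodePoints list and the toRanges helper are made up for this illustration, and the { min, max } shape is a simplification of whatever a real characterClassRange node would carry:

```javascript
// Code points named in isWhiteSpace and isLineTerminator above.
const whitespaceCodePoints = [
  0x9, 0xA, 0xB, 0xC, 0xD, 0x20, 0xA0, 0x1680, 0x180E,
  0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005,
  0x2006, 0x2007, 0x2008, 0x2009, 0x200A,
  0x2028, 0x2029, 0x202F, 0x205F, 0x3000, 0xFEFF
];

// Hypothetical helper: merge a list of code points into contiguous ranges.
function toRanges(codePoints) {
  const sorted = [...new Set(codePoints)].sort((a, b) => a - b);
  const ranges = [];
  for (const cp of sorted) {
    const last = ranges[ranges.length - 1];
    if (last && cp === last.max + 1) {
      last.max = cp; // extends the current range by one code point
    } else {
      ranges.push({ min: cp, max: cp }); // starts a new range
    }
  }
  return ranges;
}

const whitespaceRanges = toRanges(whitespaceCodePoints);
console.log(whitespaceRanges.length); // 11
```

So the body for /\s/ would need 11 range entries (most of them covering a single code point), compared to one entry for /\d/.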
So, maybe go with specialCharacterClass and characterClass? Any thoughts? Or do you think merging dot into a different type is not worth the effort and this issue should be closed right away ;)?