NONFUNCTIONAL - No minification happens yet!

This mostly tokenizes shell scripts, but the tokenization is not yet fully correct. After tokenization, some level of parsing is needed to switch contexts properly. From there, I believe the best approach is to pass the data to another function that minifies the files.

Tokenizing a shell script consumes at least 50x (untested) the memory of simply holding the script in memory, because every token carries a lot of metadata. For example, take the classic fork bomb.

:(){:|:;};:

After tokenization, there's an array that looks like this example. The actual tokens may change in the future.

tokens=(
    'WORD,SOURCE.@1:1 :'
    'EMPTY_LIST,SOURCE.ARG.@1:2 ()'
    'BRACE_OPEN,SOURCE.ARG.@1:4 {'
    'WORD,BLOCK.@1:5 :'
    'UNKNOWN,BLOCK.ARG.@1:6 |'
    'WORD,BLOCK.ARG.@1:7 :;'
    'BRACE_CLOSE,BLOCK.ARG.@1:9 }'
    'WORD,SOURCE.ARG.@1:10 ;:'
    'EOL,SOURCE.ARG.@1:12 '$'\n'
)

Splitting each entry on its delimiters (the first comma, the at sign, the colon, and the first space) yields the following fields.

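| Token | Context and Flags | Line | Col | Data |
|---|---|---|---|---|
| WORD | SOURCE. | 1 | 1 | : |
| EMPTY_LIST | SOURCE.ARG. | 1 | 2 | () |
| BRACE_OPEN | SOURCE.ARG. | 1 | 4 | { |
| WORD | BLOCK. | 1 | 5 | : |
| UNKNOWN | BLOCK.ARG. | 1 | 6 | \| |
| WORD | BLOCK.ARG. | 1 | 7 | :; |
| BRACE_CLOSE | BLOCK.ARG. | 1 | 9 | } |
| WORD | SOURCE.ARG. | 1 | 10 | ;: |
| EOL | SOURCE.ARG. | 1 | 12 | $'\n' |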
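
The split described above can be sketched with plain parameter expansion. This is only an illustration, not code from this project: the function name `split_token` and the `token_*` variables are hypothetical, and it assumes the metadata before the first space never contains a space and that the context never contains an at sign.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of this project): split one token string
# into its five fields and store them in global token_* variables.
split_token() {
    local token=$1 meta position

    # Everything before the first space is metadata; the rest is the data.
    meta=${token%% *}
    token_data=${token#* }

    # The first comma separates the token type from the context.
    token_type=${meta%%,*}
    meta=${meta#*,}

    # The at sign separates the context and flags from the position.
    token_context=${meta%%@*}
    position=${meta#*@}

    # The colon separates the line number from the column number.
    token_line=${position%%:*}
    token_col=${position#*:}
}

split_token 'EMPTY_LIST,SOURCE.ARG.@1:2 ()'
printf '%s|%s|%s|%s|%s\n' \
    "$token_type" "$token_context" "$token_line" "$token_col" "$token_data"
# prints: EMPTY_LIST|SOURCE.ARG.|1|2|()
```

Isolating the metadata at the first space before looking for the comma, at sign, and colon means that data such as `:;` or `;:` cannot be mistaken for a delimiter.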