Feature Request: Word List Substitution #718
Comments
This could surely be done, but I don't quite understand why we would do this. Can you elaborate?
This reminds me of #593, which was designed to make it a little less trivial to detect that a binary was built with garble. I'm fine with those kinds of changes in general, as long as they don't have downsides like noticeably bigger binaries. Right now, names get replaced by hashes, and we have enough bits that collisions are extremely unlikely. This means we don't need any book-keeping of how we obfuscated each name; we simply hash again as needed. My only worry with this approach is that, with a word list, we would need to pick many words to get enough bits using the same mechanism. And since some words can be long, this could make names very long, and binaries noticeably bigger as well. Maybe this is OK if the word list is long enough and we aggressively abbreviate some of the longer words (without causing duplicates). We'd have to experiment a bit.
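To put rough numbers on the "enough bits" concern, here is a minimal sketch. It assumes each word is picked uniformly from the list, so one word carries log2(list size) bits. The 64-bit target and the `wordsNeeded` helper are illustrative assumptions, not values or code taken from garble.

```go
package main

import (
	"fmt"
	"math"
)

// wordsNeeded is a hypothetical helper: it returns how many words must be
// concatenated from a list of listSize words to carry at least targetBits
// bits, assuming every word in the list is equally likely to be picked.
func wordsNeeded(targetBits, listSize int) int {
	bitsPerWord := math.Log2(float64(listSize))
	return int(math.Ceil(float64(targetBits) / bitsPerWord))
}

func main() {
	// 64 bits is an assumed target, chosen only for illustration.
	for _, size := range []int{1024, 4096, 65536} {
		fmt.Printf("list of %d words: %d words per name\n", size, wordsNeeded(64, size))
	}
}
```

Even a 65536-word list would need four words per name under these assumptions, which is why long words or short lists could inflate names quickly.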
We could always add the obfuscated name book-keeping as well, and to some degree we already record what names we did not obfuscate, which is the opposite. This would allow for shorter obfuscated names, but we would need to be very careful to assign names in a deterministic order.
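A minimal sketch of what deterministic book-keeping could look like, assuming all identifiers are known up front and the word list has no duplicates. The `assignNames` function is hypothetical and is not part of garble; sorting is just one way to get an order that does not depend on discovery order or on map iteration.

```go
package main

import (
	"fmt"
	"sort"
)

// assignNames is a hypothetical helper: it maps each distinct identifier to
// one word from the list. Identifiers are sorted first, so the result is the
// same no matter in which order they were discovered.
func assignNames(idents, words []string) (map[string]string, error) {
	sorted := append([]string(nil), idents...) // copy, leave the input untouched
	sort.Strings(sorted)

	table := make(map[string]string, len(sorted))
	next := 0 // index of the next unused word
	for _, id := range sorted {
		if _, seen := table[id]; seen {
			continue // duplicate identifier, keep its first assignment
		}
		if next >= len(words) {
			return nil, fmt.Errorf("word list too short: %d words", len(words))
		}
		table[id] = words[next]
		next++
	}
	return table, nil
}

func main() {
	words := []string{"buf", "conn", "ctx", "err"}
	table, err := assignNames([]string{"secretKey", "loadConfig", "secretKey"}, words)
	if err != nil {
		panic(err)
	}
	fmt.Println(table["loadConfig"], table["secretKey"])
}
```

The trade-off is the one described above: names can be as short as a single word, but the table has to be built and stored, and every build must see the same set of identifiers to reproduce it.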
My attempt to implement this is stuck on the
I seem to have got something usable. Here's an example of how "realistic" naming works, before and after. Names are generated based on scraped identifiers: https://github.com/pagran/go-identifiers-database
You might find https://github.com/mvdan/corpus/blob/master/top-1000.tsv useful in terms of collecting more "top" modules, although it only scrapes GitHub right now.
We already have two large PRs in flight. If you want us to work faster, sponsor us, particularly @pagran in this case :)
To fight entropy-based detection, it would be useful to have Garble combine N words from a provided word list and use the result as the replacement string instead of random characters. It could also accept a maximum string length and trim the result to that length.
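The request could be sketched roughly as follows. This is an illustration only, not garble's implementation: `wordName`, its parameters, and the use of SHA-256 are assumptions, and the word list is assumed to be non-empty and to contain only non-empty ASCII words.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"strings"
)

// wordName is a hypothetical helper: it derives a name from up to n words
// chosen by the hash of the original identifier, joined in camelCase and
// trimmed to maxLen bytes (maxLen <= 0 means no limit). Each word consumes
// 4 bytes of the 32-byte digest, so at most 8 words are used.
func wordName(original string, words []string, n, maxLen int) string {
	sum := sha256.Sum256([]byte(original))
	var b strings.Builder
	for i := 0; i < n && i*4+4 <= len(sum); i++ {
		idx := binary.BigEndian.Uint32(sum[i*4:]) % uint32(len(words))
		w := words[idx]
		if i > 0 {
			w = strings.ToUpper(w[:1]) + w[1:] // capitalize every word but the first
		}
		b.WriteString(w)
	}
	name := b.String()
	if maxLen > 0 && len(name) > maxLen {
		name = name[:maxLen]
	}
	return name
}

func main() {
	words := []string{"buf", "conn", "ctx", "err", "handler", "reader"}
	fmt.Println(wordName("secretKey", words, 3, 12))
}
```

Because the name is recomputed from the hash, no book-keeping is needed, but trimming to a maximum length discards bits and so raises the chance of two identifiers colliding.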