Clean up sentence generation #1

Open
edanaher opened this issue May 15, 2018 · 0 comments

This started out nice and clean:

  • Pick a pivot letter for each word based on what letters need practice
  • Pick letters before and after that letter based on n-gram frequencies, biased somewhat by which letters need practice
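
A minimal sketch of those two steps, to make the original design concrete. Everything here is an assumption for illustration: the `need` and `bigrams` tables, the `bias` parameter, and the function names are hypothetical, and bigrams stand in for whatever n-gram order the real code uses.

```python
import random


def pick_weighted(weights, rng):
    """Pick a key from a {symbol: weight} dict, proportionally to its weight."""
    symbols = list(weights)
    return rng.choices(symbols, weights=[weights[s] for s in symbols], k=1)[0]


def generate_word(need, bigrams, length, rng, bias=0.5):
    """Build a word around a pivot letter.

    need:    {letter: how much practice it needs}, hypothetical scale, > 0
    bigrams: {(a, b): frequency of a followed by b}, hypothetical table
    bias:    how strongly practice need skews the n-gram choice
    """
    # Step 1: the pivot is chosen purely by practice need.
    word = [pick_weighted(need, rng)]
    pivot_index = rng.randrange(length)  # position of the pivot in the word

    def biased(candidates):
        # Scale each n-gram frequency by the candidate letter's practice need.
        return {c: f * (1 + bias * need.get(c, 0)) for c, f in candidates.items()}

    # Step 2a: grow to the left, choosing letters that commonly precede word[0].
    for _ in range(pivot_index):
        candidates = {a: f for (a, b), f in bigrams.items() if b == word[0]}
        if not candidates:
            break
        word.insert(0, pick_weighted(biased(candidates), rng))

    # Step 2b: grow to the right, choosing letters that commonly follow word[-1].
    for _ in range(length - pivot_index - 1):
        candidates = {b: f for (a, b), f in bigrams.items() if a == word[-1]}
        if not candidates:
            break
        word.append(pick_weighted(biased(candidates), rng))

    return "".join(word)
```

If a letter has no known neighbour in the table, the word simply comes out shorter than `length`.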

Adding punctuation added a bit of complexity:

  • Some punctuation should always be at the beginning or end of a word.
  • We now have two sets to choose from (letters and punctuation), and a parameter that weights punctuation to fudge the balance between them.
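
Roughly, the punctuation handling amounts to something like the sketch below. The `LEADING` and `TRAILING` sets, the `punct_weight` parameter, and both function names are made up for illustration and are not the names used in the actual code.

```python
import random

# Hypothetical placement rules: which marks may open or close a word.
LEADING = set("([{")
TRAILING = set(".,;:!?)]}")


def pick_symbol(letter_weights, punct_weights, punct_weight, rng):
    """Choose a symbol from one of the two sets.

    punct_weight in [0, 1] is the fudge factor: the chance of drawing
    from the punctuation set instead of the letter set.
    """
    pool = punct_weights if rng.random() < punct_weight else letter_weights
    symbols = list(pool)
    return rng.choices(symbols, weights=[pool[s] for s in symbols], k=1)[0]


def place_punctuation(core, mark):
    """Attach mark at the end of the word where it is allowed, if any."""
    if mark in TRAILING:
        return core + mark
    if mark in LEADING:
        return mark + core
    return core  # marks with no placement rule are dropped in this sketch
```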

Numbers further complicate the picture:

  • Some words should be just numbers
  • Some numbers should be in words
  • Some numbers should be in specific words (1st, 8th, etc.)

This is pretty ugly now. I think it should work to simply assign each symbol a weight (possibly combined with a class weight and user-tunable per-symbol weights), generate the word, and then adjust it so that things like punctuation fit, possibly accumulating numbers to go into a pure-number word later. More thought is required.
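
One possible shape for this, as a rough sketch rather than a settled design. The class weights, the function names, and the fix-up rule (keep letters in order, keep at most one trailing punctuation mark, pull digits out into a pending list) are all assumptions picked to keep the example small.

```python
import random

# Hypothetical class weights; real values would need tuning.
CLASS_WEIGHT = {"letter": 1.0, "digit": 0.2, "punct": 0.1}


def symbol_class(ch):
    """Classify a single character as letter, digit, or punctuation."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha():
        return "letter"
    return "punct"


def combined_weights(symbols, user_weights):
    """One weight per symbol: class weight times optional per-symbol user weight."""
    return {s: CLASS_WEIGHT[symbol_class(s)] * user_weights.get(s, 1.0) for s in symbols}


def raw_word(weights, length, rng):
    """Generate a word from the single weight table, ignoring placement rules."""
    symbols = list(weights)
    return "".join(rng.choices(symbols, weights=[weights[s] for s in symbols], k=length))


def adjust(word, pending_digits):
    """Fix-up pass: move digits to pending_digits, keep one trailing mark."""
    letters = [c for c in word if c.isalpha()]
    punct = [c for c in word if not c.isalnum()]
    pending_digits.extend(c for c in word if c.isdigit())
    return "".join(letters) + (punct[-1] if punct else "")


def flush_numbers(pending_digits):
    """Emit the accumulated digits as one pure-number word and reset the list."""
    number = "".join(pending_digits)
    pending_digits.clear()
    return number
```

A caller would run `raw_word`, pass the result through `adjust`, and call `flush_numbers` once enough digits have piled up. This does not yet cover numbers that belong in specific words (1st, 8th, etc.), and `adjust` can return an empty string when the raw word was all digits.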
