This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Is handling of singular / plural forms ('sentence' and 'sentences') correct / consistent? #231
In https://derwen.ai/docs/ptr/sample/ in the "Scrubber" section it says
To me this implies that "sentence" and "sentences" should be grouped (lemmatized) into one entry, but in my experiments and in the output shown, the singular and plural forms are listed separately.
Is this correct or wrong behavior? If it is correct, maybe just the tutorial needs to make this clear?
With the bugfix I propose in #232 and the token list I used for scrubbing I get the results
but I would now expect to get only one line covering both "sentence" and "sentences". Am I wrong?
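To make the expectation concrete, here is a minimal sketch of the grouping I have in mind. It is not pytextrank's actual code: `toy_lemma`, `group_by_lemma`, and the phrase counts are all hypothetical, and the "lemmatizer" just strips a trailing "s" so the example runs without any NLP library.

```python
from collections import defaultdict


def toy_lemma(token: str) -> str:
    """Hypothetical stand-in for a real lemmatizer: lowercases and strips one trailing 's'."""
    t = token.lower()
    return t[:-1] if t.endswith("s") and len(t) > 3 else t


def group_by_lemma(phrases):
    """Merge (text, count) pairs whose lemmatized token sequences are identical."""
    groups = defaultdict(lambda: {"count": 0, "forms": []})
    for text, count in phrases:
        # The grouping key is the phrase with every token replaced by its lemma.
        key = " ".join(toy_lemma(tok) for tok in text.split())
        groups[key]["count"] += count
        groups[key]["forms"].append(text)
    return dict(groups)


# Made-up input, only for illustration.
phrases = [("sentences", 3), ("sentence", 2), ("words", 1)]
merged = group_by_lemma(phrases)

for lemma, info in merged.items():
    print(lemma, info["count"], info["forms"])
```

With grouping like this, "sentence" and "sentences" end up under a single key, with both surface forms kept alongside it, which is what I understood the tutorial to describe.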