Setting Stemmer for unlisted languages #139

Open
deeplearning101 opened this issue Sep 9, 2021 · 0 comments


Hello,
I'm interested in using opensemanticsearch to index documents in Norwegian.
However, Norwegian is not listed in the Document Language section of the setup page (http://[yourserver]/search-apps/setup/).

As far as I can tell, opensemanticsearch integrates versions of Solr and Tika that support Norwegian and many other languages that are not among the languages officially supported by opensemanticsearch.

Is it possible to manually edit the configuration files to enable at least stemming (or other grammar-related features) for languages that are supported by Solr but not listed in the opensemanticsearch settings?
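
A minimal sketch of the kind of manual change in question, assuming the index schema can be modified through Solr's Schema API. The field type name `text_no` and the helper `norwegian_field_type` are hypothetical names chosen for illustration, and whether opensemanticsearch would pick up such a field type is an open question, not something confirmed here. The sketch only builds the `add-field-type` command and prints it; it does not contact any server.

```python
import json


def norwegian_field_type(name="text_no"):
    """Build a Solr Schema API 'add-field-type' command for Norwegian text.

    The field type name is a hypothetical choice, not an opensemanticsearch
    convention. The analyzer uses Solr's Snowball stemmer for Norwegian.
    """
    return {
        "add-field-type": {
            "name": name,
            "class": "solr.TextField",
            "positionIncrementGap": "100",
            "analyzer": {
                "tokenizer": {"class": "solr.StandardTokenizerFactory"},
                "filters": [
                    # Lowercase before stemming so matching is case-insensitive.
                    {"class": "solr.LowerCaseFilterFactory"},
                    # Snowball stemmer configured for Norwegian.
                    {
                        "class": "solr.SnowballPorterFilterFactory",
                        "language": "Norwegian",
                    },
                ],
            },
        }
    }


# Print the JSON body that would be sent to the Schema API.
print(json.dumps(norwegian_field_type(), indent=2))
```

The printed JSON would presumably be sent as a POST request to the `/schema` endpoint of the relevant Solr core. The core name and port depend on the installation, and this approach only applies if the core uses a managed schema rather than a hand-edited `schema.xml`.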

I only need to search PDFs and do not need any of the other language-dependent features (e.g. named entity recognition, OCR).

I think this request may be of general interest, since it would open opensemanticsearch to users working with languages that are not listed on the official webpage.

Thank you in advance!
