How to get the best from OCR and text features
Matthias Balmer edited this page Jan 15, 2020 · 8 revisions
This is for version 2.0.x and 2.1.x
to be written ;-)
Just some notes on measures to try when the defaults don't give good results (@RaiMan needs to add some additional explanations :-)):
- Try the legacy engine (OEM 0) together with tessdata from here
- Use tessdata_best models instead of tessdata_fast (this makes recognition considerably slower). Or just use tessdata, which contains integerized (faster) versions of the tessdata_best models without sacrificing too much accuracy.
- Use an appropriate PSM. Especially 7/8/10 if you want to detect a single line/word/character (PSM 9 is "single word in a circle"). Use the convenience methods TextRecognizer.asLine(), TextRecognizer.asWord() or TextRecognizer.asChar() to get preconfigured TextRecognizer instances for these use cases.
- If text is very small (<= 8 px) or very big (> 30 px), give the OCR engine a hint using setFontSize() or setUppercaseXHeight().
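The notes above can be turned into a small checklist. The following is only a sketch in plain Java: it does not call SikuliX or Tesseract, and the names OcrHints, Content, needsSizeHint and settingsToTry are made up for illustration. It assumes the standard Tesseract page segmentation mode numbers (7 = single text line, 8 = single word, 10 = single character) and the pixel thresholds mentioned above. How the resulting settings are passed to TextRecognizer depends on the SikuliX version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of SikuliX): collects the hints from the notes above.
class OcrHints {

    // What the region to be read is expected to contain.
    enum Content {
        LINE(7),   // Tesseract PSM 7: single text line
        WORD(8),   // Tesseract PSM 8: single word
        CHAR(10);  // Tesseract PSM 10: single character

        final int psm;

        Content(int psm) {
            this.psm = psm;
        }
    }

    // Thresholds from the notes: <= 8 px counts as very small, > 30 px as very big.
    static boolean needsSizeHint(int textHeightPx) {
        return textHeightPx <= 8 || textHeightPx > 30;
    }

    // Returns a human-readable list of settings worth trying for one region.
    static List<String> settingsToTry(Content content, int textHeightPx, boolean legacyEngine) {
        List<String> settings = new ArrayList<>();
        if (legacyEngine) {
            settings.add("OEM 0"); // legacy engine, needs tessdata with legacy models
        }
        settings.add("PSM " + content.psm);
        if (needsSizeHint(textHeightPx)) {
            settings.add("font size hint " + textHeightPx + " px");
        }
        return settings;
    }

    public static void main(String[] args) {
        // A tiny single-line label, read with the default engine.
        System.out.println(settingsToTry(Content.LINE, 6, false));
        // A normally sized single character, read with the legacy engine.
        System.out.println(settingsToTry(Content.CHAR, 14, true));
    }
}
```

The size check only decides whether a hint is worthwhile; the actual hint would then go to setFontSize() or setUppercaseXHeight() as described above.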