Are there any scripts to extract the raw text? #9

SefaZeng · 2020-09-12T03:37:29Z

Or have you ever extracted the sentences before? I can only find XML files there.
And thank you for making such a great project.

nevmenandr · 2020-09-12T11:15:10Z

Depends on what you mean by "extract raw text". Extract from what?

SefaZeng · 2020-09-13T03:39:35Z

> Depends on what you mean by "extract raw text". Extract from what?

I mean how to extract the Thai content from the XML files downloaded from http://web-corpora.net/ThaiCorpus/search/

nevmenandr · 2020-09-25T19:29:49Z

If you mean the text collection, these XML files were converted into a specific corpus format that allows indexing and searching. The converter is here.
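
For anyone who only needs the Thai text out of the XML, here is a minimal sketch in Python. It is not the project's converter, and it assumes nothing about the corpus schema: it walks every text node and keeps the runs of characters from the Thai Unicode block (U+0E00 to U+0E7F). The function name `extract_thai_text` and the sample document are hypothetical. If the files store words in attributes rather than in element text, this would need adapting.

```python
import re
import xml.etree.ElementTree as ET

# Thai Unicode block: U+0E00 to U+0E7F.
THAI_RUN = re.compile(r"[\u0E00-\u0E7F]+")


def extract_thai_text(xml_string):
    """Return the Thai-script runs found in the text nodes of an XML document.

    Hypothetical helper: it ignores tag names entirely, so it does not
    depend on the (unknown) corpus schema. Attribute values are not scanned.
    """
    root = ET.fromstring(xml_string)
    runs = []
    # itertext() yields the text and tail of every element in document order.
    for chunk in root.itertext():
        runs.extend(THAI_RUN.findall(chunk))
    return runs


# Made-up sample document, for illustration only.
sample = "<doc><s>สวัสดี hello</s><s>ขอบคุณ</s></doc>"
print(extract_thai_text(sample))
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring(...)`. Note that the result is a list of contiguous Thai runs, not sentences, since Thai is written without spaces between words and sentence boundaries depend on the markup.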
