Doctags miss content within <formula> tags #1008
Formula tags now look like this:
Previously, equations came through in LaTeX format. Can someone help figure this out?
Replies: 1 comment
Hello,

We've updated our system to include a new Formula Model, which converts images of formulas into well-formatted LaTeX representations. This model is disabled by default. To enable it, modify the pipeline options as follows:

    from docling.datamodel.pipeline_options import PdfPipelineOptions

    pipeline_options = PdfPipelineOptions()
    pipeline_options.generate_page_images = True
    pipeline_options.do_formula_enrichment = True

With the Formula Model turned off, we no longer include any formula data in the exports. Before this change, we provided a highly inaccurate representation: the output came straight out of the PDF parser and often lost significant content, such as fractions, special characters, and exponents.

If you believe there is a use case for keeping the old, albeit lossy, representation rather than providing no formula data at all, please share specific examples. That will help us judge whether it is needed and potentially adjust the current implementation.
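To check whether formula enrichment took effect, one option is to scan the exported DocTags string for `<formula>` elements that carry no payload. This is a minimal sketch, not part of docling: the helper names `formula_payloads` and `count_empty_formulas` are hypothetical, and it assumes a formula element has the shape `<formula><loc_N>...LATEX</formula>`, with `<loc_N>` location tokens in front of the text. Adjust the patterns if your export looks different.

```python
import re

# Assumption: a DocTags formula element looks like
#   <formula><loc_12><loc_34><loc_56><loc_78>LATEX</formula>
FORMULA_RE = re.compile(r"<formula>(.*?)</formula>", re.DOTALL)
LOC_RE = re.compile(r"<loc_\d+>")


def formula_payloads(doctags: str) -> list[str]:
    """Return the text of each <formula> element with location tokens removed."""
    return [LOC_RE.sub("", body).strip() for body in FORMULA_RE.findall(doctags)]


def count_empty_formulas(doctags: str) -> int:
    """Count <formula> elements that hold nothing besides location tokens."""
    return sum(1 for payload in formula_payloads(doctags) if not payload)
```

If `count_empty_formulas` returns a non-zero value for your export, the formulas were detected but not decoded, which is what you would expect when `do_formula_enrichment` is left at its default.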