Benchmarking PDF libraries
Updated Oct 31, 2023 - Python
pdfgui_tools is a user interface tool developed in Qt and Python that integrates with poppler-utils and PyPDF2 for PDF document management. It's a simple and user-friendly tool that includes various utilities.
A simple pdftotext conversion tool for Windows 8.1/10/11 and for Fedora-, Ubuntu-, Debian-, and Arch-based Linux distributions, using poppler-utils and Google's tesseract-ocr.
Use of stylometry and machine learning in computer forensics: real tools used in 2019 by the Polish police. Everything is in and for the Polish language.
Create a searchable PDF from a scanned PDF
A Jekyll plugin to generate thumbnails for your PDF files
Matrix Representation reformats images as RDF using natural ⨯ natural coordinates as a Media-Signature-Record / Structured-Data-Description. It is a positive, productive, and pragmatic introduction to semantic-web programming.
Extracts data from PDF files of resumes written in English. Libraries used: spaCy, pdf2image, EasyOCR, poppler-utils.
Extracts information from a sample of randomly web-scraped passports in both PDF and JPG formats.