Production-ready API service for document layout analysis, OCR, and semantic chunking.
Convert PDFs, PPTs, Word docs & images into RAG/LLM-ready chunks.
Layout Analysis | OCR + Bounding Boxes | Structured HTML and markdown | VLM Processing controls
Try it out! · Report Bug · Contact · Discord
Table of Contents
- (Super) Quick Start
- Documentation
- LLM Configuration
- Self-Hosted Deployment Options
- Licensing
- Connect With Us
- Go to chunkr.ai
- Make an account and copy your API key
- Install our Python SDK:
pip install chunkr-ai
- Use the SDK to process your documents:
from chunkr_ai import Chunkr

# Initialize with your API key from chunkr.ai
chunkr = Chunkr(api_key="your_api_key")

# Upload a document (URL or local file path)
url = "https://chunkr-web.s3.us-east-1.amazonaws.com/landing_page/input/science.pdf"
task = chunkr.upload(url)

# Export results in various formats
task.html(output_file="output.html")
task.markdown(output_file="output.md")
task.content(output_file="output.txt")
task.json(output_file="output.json")

# Clean up
chunkr.close()
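The same calls work in a loop if you have more than one document. Here is a minimal sketch that reuses only the methods shown above; the documents/ folder and output names are placeholders:

from pathlib import Path
from chunkr_ai import Chunkr

chunkr = Chunkr(api_key="your_api_key")

try:
    # Convert every PDF in a local folder (folder name is just an example)
    for pdf in Path("documents").glob("*.pdf"):
        task = chunkr.upload(str(pdf))
        # One markdown file per input document
        task.markdown(output_file=f"{pdf.stem}.md")
finally:
    # Release the client even if an upload fails
    chunkr.close()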
Visit our docs for more information and examples.
You can use any OpenAI API-compatible endpoint by setting the following variables in your .env file:
LLM__KEY:
LLM__MODEL:
LLM__URL:
LLM__KEY=your_openai_api_key
LLM__MODEL=gpt-4o
LLM__URL=https://api.openai.com/v1/chat/completions
To get a Google AI Studio API key, see here.
LLM__KEY=your_google_ai_studio_api_key
LLM__MODEL=gemini-2.0-flash-lite
LLM__URL=https://generativelanguage.googleapis.com/v1beta/openai/chat/completions
Check here for available models.
LLM__KEY=your_openrouter_api_key
LLM__MODEL=google/gemini-pro-1.5
LLM__URL=https://openrouter.ai/api/v1/chat/completions
You can use any OpenAI API-compatible endpoint. To host your own LLM, you can use vLLM or Ollama.
LLM__KEY=your_api_key
LLM__MODEL=model_name
LLM__URL=http://localhost:8000/v1
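Since LLM__URL in the hosted examples above points at a full .../chat/completions path, any of these configurations can be sanity-checked with a single request before wiring it into Chunkr. This is an illustrative sketch, not part of Chunkr itself; it assumes the three LLM__ variables are exported in your environment (for a bare /v1 base URL like the local example, append /chat/completions first):

import os
import requests

# Mirror the LLM__* entries from .env
key = os.environ["LLM__KEY"]
model = os.environ["LLM__MODEL"]
url = os.environ["LLM__URL"]  # expected to end in /chat/completions here

# Minimal OpenAI-style chat completion request
resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {key}"},
    json={"model": model, "messages": [{"role": "user", "content": "Say hello."}]},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])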
- Prerequisites:
- Docker and Docker Compose
- NVIDIA Container Toolkit (for GPU support, optional)
- Clone the repo:
git clone https://github.com/lumina-ai-inc/chunkr
cd chunkr
- Set up environment variables:
# Copy the example environment file
cp .env.example .env
# Configure your environment variables
# Required: LLM__KEY (your OpenAI API key or any OpenAI-compatible provider key; see LLM Configuration above)
# Optional: a small check of these values is sketched after these steps
- Start the services:
With GPU:
docker compose up -d
- Access the services:
- Web UI: http://localhost:5173
- API: http://localhost:8000 (a short SDK sketch for the self-hosted API follows these steps)
Important:
- Requires an NVIDIA CUDA GPU
- CPU-only deployment via compose-cpu.yaml is currently in development and not recommended for use
- Stop the services when done:
docker compose down
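Before running docker compose up, it can help to confirm that the values in .env actually parse. This is an optional, illustrative check rather than part of the Chunkr setup; it assumes the python-dotenv package is installed (pip install python-dotenv) and that LLM__KEY is the only strictly required value, per the comment in the setup step above:

from dotenv import dotenv_values

# Read .env without modifying the current process environment
env = dotenv_values(".env")

# LLM__KEY is required; LLM__MODEL and LLM__URL follow the LLM Configuration section above
if not env.get("LLM__KEY"):
    raise SystemExit("LLM__KEY is missing from .env")

print("LLM__KEY is set")
print("LLM__MODEL:", env.get("LLM__MODEL", "(not set)"))
print("LLM__URL:", env.get("LLM__URL", "(not set)"))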
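Once the stack is up, the Python SDK from the Quick Start can be pointed at the self-hosted API instead of chunkr.ai. The sketch below is only an assumption-laden example: the base-URL parameter is shown as url and may be named differently in your SDK version, and whether an API key is enforced depends on your deployment settings, so check the SDK docs if it does not work as written.

from chunkr_ai import Chunkr

# Assumption: the client accepts a base-URL override named `url`;
# consult the SDK docs if your version uses a different parameter
chunkr = Chunkr(url="http://localhost:8000", api_key="your_api_key")

task = chunkr.upload("path/to/document.pdf")
task.markdown(output_file="output.md")
chunkr.close()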
For production environments, we provide a Helm chart and detailed deployment instructions:
- See our detailed guide at kube/README.md
- Includes configurations for high availability and scaling
For enterprise support and deployment assistance, contact us.
The core of this project is dual-licensed:
- GNU Affero General Public License v3.0 (AGPL-3.0)
- Commercial License
To use Chunkr without complying with the AGPL-3.0 license terms, you can contact us or visit our website.
- 📧 Email: [email protected]
- 📅 Schedule a call: Book a 30-minute meeting
- 🌐 Visit our website: chunkr.ai