January 13
API Playground UI
New interactive web playground for testing document extraction:- Output format selection - Choose from Markdown, JSON, CSV/Excel, or HTML output
- Schema Builder - Visual JSON schema editor with support for nested objects, arrays, and enums up to 10 levels deep
- Field List mode - Quick extraction with simple field name arrays
- Metadata options - Enable confidence scores and bounding boxes per field
December 15
Streaming Extraction Endpoint
New/v1/extract/stream endpoint for real-time extraction via Server-Sent Events (SSE):- Streaming mode - Content delivered in small chunks as it’s generated
- Batch mode - Content sent all at once when extraction completes
December 8
v1 Extraction API
New/v1/extract endpoints with a cleaner, more consistent interface:- Sync extraction (
/v1/extract/sync) - Process documents synchronously with immediate results - Async extraction (
/v1/extract/async) - Queue documents for background processing - Batch extraction (
/v1/extract/batch) - Process up to 50 files in a single request - Fetch result (
/v1/extract/results/<record_id>) - Fetch the results for a single record_id - Results endpoints (
/v1/extract/results) - List and retrieve extraction results with pagination
November 27
Multi-Page JSON Extraction with Confidence Scoring
Enhanced JSON extraction now processes multi-page documents and returns responses based on the best confidence score, improving accuracy for complex documents.Bounding Box Extraction API
New/extract-with-bounding-boxes endpoint that returns extracted data with precise coordinate information for each field, enabling document annotation and validation workflows.Response Dimensions
API responses now include dimension metadata (width/height) for processed documents, useful for coordinate calculations and rendering.November 21
Streaming & Partial Results
- New streaming extraction endpoint at
/v1/extract/stream - Partial results API to retrieve in-progress extractions
- Improved delimiter handling for chunked responses
Billing & Usage APIs
- Credit usage reporting integration with Stripe
- Subscription status tracking
- Document processing limits per plan
IP Blocking & Rate Limiting
Enhanced security with IP-based access control middleware and improved rate limiting mechanisms.November 14
On-Premise License APIs
New license management APIs for enterprise on-premise deployments, including activation and validation endpoints.Repetition Detection & Retry
Improved extraction reliability with automatic retry using nucleus sampling when repetition patterns are detected in model outputs.Excel & DOCX Processing
Fixed file processing for Excel spreadsheets and Word documents with improved error handling.October 17
OpenAI-Compatible Chat Completions API
Full OpenAI-compatible/v1/chat/completions endpoint supporting:- PDF and document uploads directly in requests
- All major file types (PDF, Excel, Word, images)
- Drop-in replacement for OpenAI SDK integrations
Hierarchy Extraction API
New API for extracting document hierarchies and structure, including parent-child relationships and table of contents with linked IDs.October 10
Custom Prompt Instructions
Support for custom prompt instructions in markdown format, allowing fine-tuned extraction behavior for specific use cases.Expanded File Type Support
Extended support for additional file formats in chat completions:- PDF documents
- Excel spreadsheets (.xlsx, .xls)
- Word documents (.docx)
- All major image formats
OpenAI SDK Upgrade
Updated to latest OpenAI SDK version for improved compatibility and performance.Coming Soon
Batch Processing
Process multiple documents in a single API call with consolidated results.
Webhooks
Real-time notifications for async extraction completion events.