Queue a document for background processing. Returns a record_id to poll results via GET /api/v1/extract/results/{record_id}.
Recommended for large documents (>50 pages).
API key as Bearer token: Authorization: Bearer YOUR_API_KEY
Output format(s): markdown, html, json, csv. Comma-separate for multiple (e.g., markdown,json).
"markdown"
File to upload (PDF, Word, Excel, PowerPoint, images)
URL to download file from
""
Base64-encoded file content
""
Custom extraction instructions (e.g., Format dates as YYYY-MM-DD)
8000""
append: add to base prompt, replace: use only custom instructions
append, replace JSON extraction options. Values: hierarchy_output, table-of-contents, field list ["field1", "field2"], or JSON schema {...}
""
CSV extraction options (e.g., table)
""
Comma-separated metadata: bounding_boxes, confidence_score
""
Job queued
Job ID for retrieving results
completed, processing, failed Results by format (only requested formats populated)
Time in seconds
Size in bytes