Queue a document for background processing. The response includes a record_id, which you use to poll for results via GET /api/v1/extract/results/{record_id}.
Recommended for large documents (>50 pages).
API key as Bearer token: Authorization: Bearer YOUR_API_KEY
Output format(s): markdown, html, json, csv. Comma-separate for multiple (e.g., markdown,json).
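As a minimal sketch of the comma-separated convention described above, the helper below joins a list of format names into a single output_format value. The function name `build_output_format` is hypothetical and not part of the API; only the four format names come from this page.

```python
# Formats listed on this page for the output_format parameter.
ALLOWED_FORMATS = ("markdown", "html", "json", "csv")


def build_output_format(formats):
    """Join format names into one comma-separated string, e.g. "markdown,json".

    Hypothetical client-side helper: it rejects unknown names and drops
    duplicates while preserving the order in which formats were requested.
    """
    selected = []
    for fmt in formats:
        name = fmt.strip().lower()
        if name not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported output format: {fmt!r}")
        if name not in selected:
            selected.append(name)
    if not selected:
        raise ValueError("at least one output format is required")
    return ",".join(selected)
```

For example, `build_output_format(["markdown", "json"])` yields the string shown in the parameter description, `markdown,json`.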
"markdown"
File to upload (PDF, Word, Excel, PowerPoint, images)
URL to download file from
""
Base64-encoded file content
""
Custom extraction instructions (e.g., Format dates as YYYY-MM-DD)
Maximum length: 8000
""
append: add to base prompt; replace: use only custom instructions
Allowed values: append, replace
JSON extraction options. Values: hierarchy_output, table-of-contents, field list ["field1", "field2"], or JSON schema {...}
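The JSON extraction options accept several shapes (a keyword, a field list, or a JSON schema). The sketch below shows one way a client might turn each shape into a single string for the request. It assumes that field lists and schemas are sent as JSON text, which this page implies but does not state outright; `encode_json_options` is a hypothetical helper name.

```python
import json

# Keyword values listed on this page for the JSON extraction options.
JSON_OPTION_KEYWORDS = ("hierarchy_output", "table-of-contents")


def encode_json_options(options):
    """Encode a keyword, field list, or schema dict as a request string.

    Assumption: lists and schemas are transmitted as JSON-encoded text.
    """
    if isinstance(options, str):
        if options not in JSON_OPTION_KEYWORDS:
            raise ValueError(f"unknown JSON option keyword: {options!r}")
        return options
    if isinstance(options, (list, tuple)):
        if not all(isinstance(name, str) for name in options):
            raise TypeError("field list entries must be strings")
        return json.dumps(list(options))
    if isinstance(options, dict):
        return json.dumps(options)
    raise TypeError("options must be a keyword, a field list, or a schema dict")
```

A field list such as `["field1", "field2"]` passes through as the same JSON array text, while a dict is serialized as a schema object.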
""
CSV extraction options (e.g., table)
""
Comma-separated metadata: bounding_boxes, confidence_score
""
When true, extracts data points from charts, graphs, and labeled technical diagrams as structured tables. Tables are inserted inline after each image in Markdown output.
Allowed values: true, false
""
When true, runs an agentic post-processing pipeline that refines the JSON extraction. Requires output_format=json. Use agentic_steps to pick which steps to run.
Allowed values: true, false
""
Comma-separated list of agentic steps to run. Requires agentic_mode=true. Available steps:
agentic_enhance: Reconciles conflicts that arise when the same information appears in multiple places in a large document.
validate_json_schema: Validates the extracted JSON against your json_options schema and corrects any violations.
"agentic_enhance,validate_json_schema"
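The two dependencies stated above (agentic_mode requires output_format=json, and agentic_steps requires agentic_mode=true) can be checked on the client before sending a request. The following is a minimal sketch, not the service's own validation. `agentic_fields` is a hypothetical helper, and it assumes that booleans are sent as the strings "true"/"false" and that a multi-format value containing json is acceptable, neither of which this page confirms.

```python
# Step names listed on this page for agentic_steps.
KNOWN_STEPS = ("agentic_enhance", "validate_json_schema")


def agentic_fields(output_format, steps=None):
    """Return the agentic request fields, or raise if the rules are violated.

    Because agentic_mode is always set to "true" here, any agentic_steps
    value is only ever emitted alongside agentic_mode=true.
    """
    formats = [part.strip() for part in output_format.split(",")]
    # Assumption: "json" appearing among several formats satisfies the rule.
    if "json" not in formats:
        raise ValueError("agentic_mode requires output_format to include json")

    fields = {"agentic_mode": "true"}
    if steps:
        unknown = [step for step in steps if step not in KNOWN_STEPS]
        if unknown:
            raise ValueError(f"unknown agentic steps: {unknown}")
        fields["agentic_steps"] = ",".join(steps)
    return fields
```

Calling `agentic_fields("json", ["agentic_enhance", "validate_json_schema"])` produces the same agentic_steps string as the example value above.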
Job queued
Job ID for retrieving results
Job status: completed, processing, failed
Results by format (only requested formats populated)
Time in seconds
Size in bytes
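Since a queued job reports one of three statuses, a client typically polls the results endpoint until the status is no longer processing. The sketch below illustrates that loop using only the Python standard library. The base URL is supplied by the caller because this page does not give one, and the response field name `status` is an assumption (the page lists the status values but not the field names).

```python
import json
import time
import urllib.request


def fetch_result(base_url, record_id, api_key):
    """GET /api/v1/extract/results/{record_id} once and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/api/v1/extract/results/{record_id}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_result(fetch, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch() until the job completes, fails, or attempts run out.

    `fetch` is any zero-argument callable returning the decoded response,
    which keeps the loop testable without network access.
    """
    for attempt in range(max_attempts):
        body = fetch()
        status = body.get("status")  # assumed field name
        if status == "completed":
            return body
        if status == "failed":
            raise RuntimeError(f"extraction failed: {body}")
        # Any other status (e.g. "processing") means: wait and try again.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError("job was still processing after the last poll")
```

In use, the two pieces combine as `wait_for_result(lambda: fetch_result(base_url, record_id, api_key))`, where `base_url`, `record_id`, and `api_key` are values you supply. The fixed polling interval is a simplification; a production client might prefer exponential backoff.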