POST /api/v1/extract/stream
Streaming Extraction
curl --request POST \
  --url https://extraction-api.nanonets.com/api/v1/extract/stream \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form output_format=markdown \
  --form file='@example-file'

Authorizations

Authorization
string
header
required

Pass your API key as a Bearer token in the Authorization header: Authorization: Bearer YOUR_API_KEY
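
A minimal sketch of building this header in Python. The helper name auth_headers and the NANONETS_API_KEY environment variable are assumptions made for illustration; neither is defined by the API.

```python
import os


def auth_headers(api_key=None):
    """Return the Authorization header as a dict.

    Falls back to the NANONETS_API_KEY environment variable, which is a
    hypothetical name chosen for this sketch.
    """
    key = api_key or os.environ.get("NANONETS_API_KEY")
    if not key:
        raise ValueError("no API key provided")
    return {"Authorization": f"Bearer {key}"}
```

The returned dict can be passed as the headers argument of whichever HTTP client you use.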

Body

multipart/form-data
output_format
string
required

Output format(s): markdown, html, json, or csv. Separate multiple formats with commas.

Example:

"markdown"
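
As an illustration, the comma-separated value could be assembled and checked on the client side before sending. The function name output_format_value is hypothetical, and the client-side check is only a convenience; the server remains the authority on which values it accepts.

```python
# Formats listed in the output_format description.
ALLOWED_FORMATS = ("markdown", "html", "json", "csv")


def output_format_value(formats):
    """Join one or more format names into the comma-separated field value."""
    cleaned = [f.strip() for f in formats]
    unknown = [f for f in cleaned if f not in ALLOWED_FORMATS]
    if not cleaned or unknown:
        raise ValueError(f"unsupported output formats: {unknown}")
    return ",".join(cleaned)
```

For example, output_format_value(["markdown", "json"]) produces the string markdown,json.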

file
file

File to upload (PDF, Word, Excel, PowerPoint, images)

file_url
string<uri>

URL to download the file from

file_base64
string

Base64-encoded file content
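
A short sketch of producing this value in Python. It emits plain standard Base64 text; whether the API also accepts a data-URI prefix is not stated here, so none is added. The function name is hypothetical.

```python
import base64


def encode_file_base64(data: bytes) -> str:
    """Encode raw file bytes as standard Base64 text for the file_base64 field."""
    return base64.b64encode(data).decode("ascii")


# Typical use: read the document in binary mode, then encode it.
# with open("example.pdf", "rb") as fh:
#     payload = encode_file_base64(fh.read())
```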

enable_streaming
boolean
default:true

Enables real-time streaming. If false, the complete content is returned in SSE batch mode.

custom_instructions
string

Custom extraction instructions

Maximum string length: 8000

prompt_mode
enum<string>
default:append

append adds the custom instructions to the base prompt; replace uses only the custom instructions.

Available options: append, replace

json_options
string

JSON extraction options

csv_options
string

CSV extraction options

include_metadata
string

Comma-separated list of metadata to include: bounding_boxes, confidence_score

extract_charts
enum<string>

When true, extracts data points from charts, graphs, and labeled technical diagrams as structured tables. Tables are inserted inline after each image in Markdown output.

Available options: true, false

agentic_mode
enum<string>

When true, runs an agentic post-processing pipeline that refines the JSON extraction. Requires output_format=json. Use agentic_steps to pick which steps to run.

Available options: true, false

agentic_steps
string

Comma-separated list of agentic steps to run. Requires agentic_mode=true. Available steps:

  • agentic_enhance — Reconciles conflicts that arise when the same information appears in multiple places in a large document.
  • validate_json_schema — Validates the extracted JSON against your json_options schema and corrects any violations.
Example:

"agentic_enhance,validate_json_schema"
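
The dependencies between these fields can be checked before a request is sent. The sketch below is one possible client-side helper, not part of the API: the function name build_form_fields is hypothetical, and it interprets "Requires output_format=json" as "json must be among the requested formats", which is an assumption.

```python
AGENTIC_STEPS = ("agentic_enhance", "validate_json_schema")
PROMPT_MODES = ("append", "replace")
MAX_INSTRUCTIONS_LEN = 8000  # documented maximum for custom_instructions


def build_form_fields(output_format, custom_instructions=None, prompt_mode=None,
                      agentic_mode=False, agentic_steps=None):
    """Return a dict of text form fields after checking the documented constraints."""
    fields = {"output_format": output_format}

    if custom_instructions is not None:
        if len(custom_instructions) > MAX_INSTRUCTIONS_LEN:
            raise ValueError("custom_instructions exceeds 8000 characters")
        fields["custom_instructions"] = custom_instructions

    if prompt_mode is not None:
        if prompt_mode not in PROMPT_MODES:
            raise ValueError("prompt_mode must be 'append' or 'replace'")
        fields["prompt_mode"] = prompt_mode

    if agentic_steps and not agentic_mode:
        raise ValueError("agentic_steps requires agentic_mode=true")

    if agentic_mode:
        # Assumption: json only needs to be one of the requested formats.
        if "json" not in [f.strip() for f in output_format.split(",")]:
            raise ValueError("agentic_mode requires output_format=json")
        fields["agentic_mode"] = "true"
        if agentic_steps:
            unknown = [s for s in agentic_steps if s not in AGENTIC_STEPS]
            if unknown:
                raise ValueError(f"unknown agentic steps: {unknown}")
            fields["agentic_steps"] = ",".join(agentic_steps)

    return fields
```

The resulting dict holds only the text fields; the file itself (file, file_url, or file_base64) still has to be attached separately.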

Response

Server-Sent Events (SSE) stream of extraction events. Each event carries a JSON payload.

Example:

"data: {\"type\": \"content\", \"data\": \"# Document Title\\n\"}\n\ndata: {\"type\": \"done\", \"record_id\": \"12345\", \"processing_time\": 2.5}\n\n"
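
The example above can be decoded with a few lines of Python. This is a minimal sketch that assumes the whole stream is already available as text and that only the content and done event types shown in the example matter; the function names parse_sse and collect_content are hypothetical, and other event types, if any exist, are not handled specially.

```python
import json


def parse_sse(text):
    """Split SSE text into events and decode each event's JSON payload."""
    events = []
    # Events are separated by a blank line; normalize CRLF line endings first.
    for block in text.replace("\r\n", "\n").split("\n\n"):
        data_lines = []
        for line in block.split("\n"):
            if line.startswith("data:"):
                value = line[len("data:"):]
                # Per the SSE format, one leading space after the colon is dropped.
                if value.startswith(" "):
                    value = value[1:]
                data_lines.append(value)
        if data_lines:
            events.append(json.loads("\n".join(data_lines)))
    return events


def collect_content(events):
    """Concatenate the data of all content events into one string."""
    return "".join(e["data"] for e in events if e.get("type") == "content")


# The decoded form of the example stream shown above.
sample = (
    'data: {"type": "content", "data": "# Document Title\\n"}\n\n'
    'data: {"type": "done", "record_id": "12345", "processing_time": 2.5}\n\n'
)
events = parse_sse(sample)
document = collect_content(events)
```

With a live connection the same logic would be applied incrementally, buffering bytes until a blank line marks the end of an event.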