Quickstart

Follow this guide to make your first API call, explore streaming, and try async processing.

Prerequisites

An API key from docstrange.nanonets.com (sign in, then find the key in the top-right menu)
A document to extract (PDF, image, Word, Excel, or PowerPoint)

Make your first extraction

Send a document to the synchronous endpoint and get back structured Markdown:

curl -X POST "https://extraction-api.nanonets.com/api/v1/extract/sync" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "output_format=markdown"

You should receive a JSON response with "success": true and your extracted content in result.markdown.content.
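If you call the endpoint from a script, you will probably want to check the success flag before reading the content. The following is a minimal sketch that assumes the response has already been parsed into a dictionary. The helper name get_markdown and the sample payload are illustrative and not part of the API; only the "success" flag and the result.markdown.content path come from the description above.

```python
def get_markdown(payload: dict) -> str:
    """Return the extracted Markdown from a parsed sync response.

    Raises ValueError when the "success" flag is missing or false.
    """
    if not payload.get("success"):
        raise ValueError(f"Extraction failed: {payload}")
    return payload["result"]["markdown"]["content"]


# Hypothetical payload shaped like the response described above.
sample = {
    "success": True,
    "result": {"markdown": {"content": "# Invoice\n\nTotal: 42"}},
}

print(get_markdown(sample))
```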

Try multiple output formats

Request Markdown and JSON in a single call:

curl -X POST "https://extraction-api.nanonets.com/api/v1/extract/sync" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@invoice.pdf" \
  -F "output_format=markdown,json"

The response includes both result.markdown and result.json.
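When you request several formats, it can help to confirm that each one actually came back before you read it. This sketch assumes only that every requested format appears as a key under result, as stated above. The helper name missing_formats and the sample payloads are hypothetical, and the inner shape of result.json is not assumed.

```python
def missing_formats(requested: str, payload: dict) -> list:
    """List requested formats that are absent from payload["result"].

    `requested` uses the same comma-separated form as the output_format field.
    """
    wanted = [name.strip() for name in requested.split(",") if name.strip()]
    result = payload.get("result", {})
    return [name for name in wanted if name not in result]


# Hypothetical payloads; the per-format bodies are left empty on purpose.
complete = {"success": True, "result": {"markdown": {}, "json": {}}}
partial = {"success": True, "result": {"markdown": {}}}

print(missing_formats("markdown,json", complete))
print(missing_formats("markdown,json", partial))
```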

Stream results in real time

Use the streaming endpoint for real-time content delivery via Server-Sent Events:

Python

import json

import requests

# Upload the document and keep the connection open for Server-Sent Events.
with open("document.pdf", "rb") as f:
    response = requests.post(
        "https://extraction-api.nanonets.com/api/v1/extract/stream",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        files={"file": f},
        data={"output_format": "markdown", "enable_streaming": "true"},
        stream=True,
    )

# Fail fast on HTTP errors instead of trying to parse an error body.
response.raise_for_status()

# Each SSE payload line starts with "data: " followed by a JSON event.
for line in response.iter_lines():
    if line:
        line = line.decode("utf-8")
        if line.startswith("data: "):
            event = json.loads(line[6:])
            if event["type"] == "content":
                # Print content chunks as they arrive.
                print(event["data"], end="", flush=True)
            elif event["type"] == "done":
                print(f"\n\nCompleted in {event['processing_time']:.2f}s")
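If you would rather collect the streamed text than print it, the parsing step can be separated from the network call. The sketch below assumes the same event shape as the example above ("type" of "content" or "done", with the text in "data"). The helper names iter_events and collect_content are illustrative, and the sample lines are made up so that the block runs without a network connection.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield the JSON event carried by each 'data: ' line, skipping other lines."""
    prefix = "data: "
    for raw in lines:
        if not raw:
            continue  # blank lines separate SSE events
        text = raw.decode("utf-8")
        if text.startswith(prefix):
            yield json.loads(text[len(prefix):])


def collect_content(lines: Iterable[bytes]) -> str:
    """Join all content chunks, stopping at the first 'done' event."""
    parts = []
    for event in iter_events(lines):
        if event.get("type") == "content":
            parts.append(event["data"])
        elif event.get("type") == "done":
            break
    return "".join(parts)


# Made-up lines in the format that response.iter_lines() would produce.
sample_lines = [
    b'data: {"type": "content", "data": "Hello, "}',
    b"",
    b'data: {"type": "content", "data": "world"}',
    b'data: {"type": "done", "processing_time": 1.5}',
]

print(collect_content(sample_lines))
```

In a real call, you could pass response.iter_lines() in place of sample_lines.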

Process large documents asynchronously

For documents longer than 5 pages, use the asynchronous endpoint:

curl -X POST "https://extraction-api.nanonets.com/api/v1/extract/async" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@large-report.pdf" \
  -F "output_format=markdown"

Then poll for results using the record_id from the response (12345 below is a placeholder):

curl -X GET "https://extraction-api.nanonets.com/api/v1/extract/results/12345" \
  -H "Authorization: Bearer YOUR_API_KEY"

The results endpoint returns "status": "processing" until the job completes, then returns "status": "completed" along with the full content.
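The polling step can be wrapped in a small loop. This is a sketch under stated assumptions, not an official client: the helper names results_url and wait_for_result, the 2-second interval, and the 30-attempt limit are all illustrative. Treating any status other than "processing" or "completed" as an error is also an assumption, since only those two values are described above. The fetch argument is injected so that the example runs with simulated responses.

```python
import time
from typing import Callable

BASE_URL = "https://extraction-api.nanonets.com/api/v1/extract"


def results_url(record_id) -> str:
    """Build the polling URL for a record, matching the curl example above."""
    return f"{BASE_URL}/results/{record_id}"


def wait_for_result(
    fetch: Callable[[], dict],
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until it reports "completed", pausing between attempts."""
    for _ in range(max_attempts):
        payload = fetch()
        status = payload.get("status")
        if status == "completed":
            return payload
        if status != "processing":
            # Assumption: any other status means the job will not finish.
            raise RuntimeError(f"Unexpected status: {status!r}")
        sleep(interval)
    raise TimeoutError(f"Job still processing after {max_attempts} attempts")


# Simulated responses stand in for real GET requests to results_url(...).
responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed"},
])
final = wait_for_result(lambda: next(responses), interval=0, sleep=lambda _: None)

print(results_url(12345))
print(final["status"])
```

In real use, fetch would send a GET request to results_url(record_id) with your Authorization header and return the parsed JSON body.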

What’s Next

Output Formats

Learn about all output options: Markdown, HTML, JSON schemas, CSV, and more.

Code Examples

Complete examples for every endpoint, including batch processing and React integration.

Python SDK

Type-safe Python client with async support and streaming.

TypeScript SDK

Type-safe TypeScript client for Node.js and browser.
