Asynchronous Extraction

curl --request POST \
  --url https://extraction-api.nanonets.com/api/v1/extract/async \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form output_format=markdown \
  --form file='@example-file'

{
  "success": true,
  "message": "<string>",
  "record_id": "<string>",
  "status": "completed",
  "result": {
    "markdown": {
      "content": "<string>",
      "metadata": {
        "bounding_boxes": {},
        "confidence_score": {}
      }
    },
    "html": {
      "content": "<string>",
      "metadata": {
        "bounding_boxes": {},
        "confidence_score": {}
      }
    },
    "json": {
      "content": "<string>",
      "metadata": {
        "bounding_boxes": {},
        "confidence_score": {}
      }
    },
    "csv": {
      "content": "<string>",
      "metadata": {
        "bounding_boxes": {},
        "confidence_score": {}
      }
    }
  },
  "processing_time": 123,
  "filename": "<string>",
  "output_format": "<string>",
  "file_size": 123,
  "pages_processed": 123,
  "created_at": "2023-11-07T05:31:56Z"
}

POST

api

extract

async

Asynchronous Extraction

curl --request POST \
  --url https://extraction-api.nanonets.com/api/v1/extract/async \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form output_format=markdown \
  --form file='@example-file'

{
  "success": true,
  "message": "<string>",
  "record_id": "<string>",
  "status": "completed",
  "result": {
    "markdown": {
      "content": "<string>",
      "metadata": {
        "bounding_boxes": {},
        "confidence_score": {}
      }
    },
    "html": {
      "content": "<string>",
      "metadata": {
        "bounding_boxes": {},
        "confidence_score": {}
      }
    },
    "json": {
      "content": "<string>",
      "metadata": {
        "bounding_boxes": {},
        "confidence_score": {}
      }
    },
    "csv": {
      "content": "<string>",
      "metadata": {
        "bounding_boxes": {},
        "confidence_score": {}
      }
    }
  },
  "processing_time": 123,
  "filename": "<string>",
  "output_format": "<string>",
  "file_size": 123,
  "pages_processed": 123,
  "created_at": "2023-11-07T05:31:56Z"
}

Authorizations

Authorization

string

header

required

API key as Bearer token: Authorization: Bearer YOUR_API_KEY

Body

multipart/form-data

output_format

string

required

Output format(s): markdown, html, json, csv. Comma-separate for multiple (e.g., markdown,json).

Example:

"markdown"

file

File to upload (PDF, Word, Excel, PowerPoint, images)

file_url

string<uri>

URL to download file from

Example:

""

file_base64

string

Base64-encoded file content

Example:

""

custom_instructions

string

Custom extraction instructions (e.g., Format dates as YYYY-MM-DD)

Maximum string length: 8000

Example:

""

prompt_mode

enum<string>

default:append

append: add to base prompt, replace: use only custom instructions

Available options:

append,

replace

json_options

string

JSON extraction options. Values: hierarchy_output, table-of-contents, field list ["field1", "field2"], or JSON schema {...}

Example:

""

csv_options

string

CSV extraction options (e.g., table)

Example:

""

include_metadata

string

Comma-separated metadata: bounding_boxes, confidence_score

Example:

""

Response

Job queued

success

boolean

required

message

string

required

record_id

string

required

Job ID for retrieving results

status

enum<string>

required

Available options:

completed,

processing,

failed

result

object

Results by format (only requested formats populated)

Show child attributes

processing_time

number

Time in seconds

filename

string

output_format

string

file_size

integer

Size in bytes

pages_processed

integer

created_at

string<date-time>

Synchronous Extraction Streaming Extraction

⌘I

Document Extraction

Document Classification

Results

Asynchronous Extraction

Authorizations

Body

Response