Overview
The docstrange TypeScript SDK provides a type-safe, ergonomic interface for the Nanonets Document Extraction API. It includes full TypeScript types, native async/await, streaming, and automatic pagination.
Key Features
Type Safety
Full TypeScript types for all requests and responses with IntelliSense support.
Async Support
All methods return Promises with native async/await support.
Streaming
Consume SSE streams with for await...of and typed event objects.
File Upload
Upload files from ReadStreams, Buffers, Blobs, or File objects.
Error Handling
Typed exceptions with status codes, messages, and request IDs.
Pagination
Automatic pagination helpers for list endpoints.
Installation
Requires Node.js 18+
Authentication
The SDK reads your API key from the DOCSTRANGE_API_KEY environment variable by default.
Get your API key from the top right menu on docstrange.nanonets.com.
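The lookup order described above can be sketched as a small helper. This is an illustrative sketch, not the SDK's own code: `resolveApiKey` and its `explicitKey` argument are hypothetical names, and the environment is passed in as a plain object so the function is easy to test.

```typescript
// Minimal shape of an environment object such as Node's process.env.
type Env = Record<string, string | undefined>;

// Hypothetical helper: an explicitly passed key wins, otherwise fall back
// to the DOCSTRANGE_API_KEY environment variable named in this section.
function resolveApiKey(env: Env, explicitKey?: string): string {
  const key = (explicitKey ?? env["DOCSTRANGE_API_KEY"] ?? "").trim();
  if (key === "") {
    throw new Error("No API key: set DOCSTRANGE_API_KEY or pass a key explicitly");
  }
  return key;
}
```

In a Node.js program you would typically call `resolveApiKey(process.env)` once at startup, so a missing key fails fast instead of surfacing later as a 401.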
Core Methods
Synchronous Extraction
Extract content from a document and get results immediately. Best for files with 5 pages or fewer.

| Parameter | Type | Required | Description |
|---|---|---|---|
| file | Uploadable | * | File to upload (PDF, Word, Excel, PowerPoint, images). Provide exactly one of file, file_url, or file_base64. |
| file_url | string | * | URL to download the file from. |
| file_base64 | string | * | Base64-encoded file content. |
| output_format | string | Yes | Output format(s): markdown, html, json, csv. Comma-separate for multiple (e.g., "markdown,json"). |
| custom_instructions | string | No | Custom extraction instructions (max 8,000 chars). E.g., "Format dates as YYYY-MM-DD". |
| prompt_mode | string | No | "append" (default) adds to base prompt, "replace" uses only your custom instructions. |
| json_options | string | No | JSON extraction mode. Values: "hierarchy_output", "table-of-contents", field list '["field1", "field2"]', or JSON schema '{...}'. |
| csv_options | string | No | CSV extraction options. E.g., "table". |
| include_metadata | string | No | Comma-separated metadata types: bounding_boxes, bounding_boxes_word, confidence_score. |
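Two rules from the parameter table are easy to get wrong: exactly one file input must be provided, and multiple output formats are sent as one comma-separated string. The sketch below checks both on the client side before a request is made. `pickFileInput` and `joinOutputFormats` are hypothetical helper names, and `unknown` stands in for the SDK's `Uploadable` type.

```typescript
// Only the three file-input parameters from the table; `unknown` is a
// stand-in for the SDK's Uploadable type.
interface FileInput {
  file?: unknown;
  file_url?: string;
  file_base64?: string;
}

// Returns the name of the single input provided, or throws otherwise.
function pickFileInput(input: FileInput): "file" | "file_url" | "file_base64" {
  const provided = (["file", "file_url", "file_base64"] as const).filter(
    (key) => input[key] != null,
  );
  if (provided.length !== 1) {
    throw new Error(
      `Provide exactly one of file, file_url, or file_base64 (got ${provided.length})`,
    );
  }
  return provided[0];
}

// The four output formats listed in the table.
const OUTPUT_FORMATS = ["markdown", "html", "json", "csv"] as const;
type OutputFormat = (typeof OUTPUT_FORMATS)[number];

// Builds the comma-separated output_format value, dropping duplicates.
function joinOutputFormats(formats: OutputFormat[]): string {
  if (formats.length === 0) throw new Error("output_format is required");
  return Array.from(new Set(formats)).join(",");
}
```

For example, `joinOutputFormats(["markdown", "json"])` produces the `"markdown,json"` value shown in the table.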
Provide exactly one file input: file, file_url, or file_base64. The file parameter accepts fs.createReadStream(), Buffer, Blob, or File objects.
Asynchronous Extraction
Queue a document for background processing. Returns a record_id that you use to poll for results. Recommended for large documents (more than 5 pages).
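Polling for a queued job usually means calling a status lookup repeatedly until the job leaves its pending state. The sketch below shows that loop in generic form. It is not SDK code: `pollUntilSettled` is a hypothetical helper, the caller supplies the actual lookup (for example, a call that retrieves results by record_id), and the status strings "pending" and "processing" are assumptions, since this page does not list the real status values.

```typescript
// Anything with a status field; the real response shape is not specified here.
interface JobStatus {
  status: string;
}

interface PollOptions {
  intervalMs?: number;        // wait between attempts
  maxAttempts?: number;       // give up after this many lookups
  pendingStatuses?: string[]; // assumed names for "not finished yet"
}

// Calls fetchStatus until the job is no longer pending, then returns the result.
async function pollUntilSettled<T extends JobStatus>(
  fetchStatus: () => Promise<T>,
  options: PollOptions = {},
): Promise<T> {
  const {
    intervalMs = 2000,
    maxAttempts = 30,
    pendingStatuses = ["pending", "processing"],
  } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchStatus();
    if (!pendingStatuses.includes(result.status)) return result;
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job still pending after ${maxAttempts} attempts`);
}
```

A fixed interval keeps the example short. For long-running jobs, a growing interval reduces load on the API.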
Streaming Extraction
Stream extraction results in real time via Server-Sent Events.

| Parameter | Type | Required | Description |
|---|---|---|---|
| enable_streaming | boolean | No | true (default) for real-time incremental chunks. false for batch mode (complete content in a single SSE event). |

| Event Type | Description |
|---|---|
| content | Incremental content chunk (streaming mode) |
| complete | Full content at once (batch mode, when enable_streaming is false) |
| done | Final event with record_id and processing_time |
| error | Error information |
| async_queued | Large files automatically queued for async processing |
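To show how the event types above fit together with `for await...of`, here is a self-contained sketch that parses raw Server-Sent Events text into typed events and accumulates the content. It does not use the SDK: `parseSse`, `collectText`, and `demoChunks` are hypothetical names, the payload is kept as a raw string because its shape is not specified here, and the parser is simplified (LF line endings only, no `id:` or `retry:` fields).

```typescript
// The five event types listed in the table above.
type StreamEventType = "content" | "complete" | "done" | "error" | "async_queued";

interface StreamEvent {
  type: StreamEventType;
  data: string; // raw payload text
}

const KNOWN_TYPES: readonly string[] = [
  "content", "complete", "done", "error", "async_queued",
];

// Turns raw SSE text chunks into events. Events are separated by a blank
// line, and a chunk boundary may fall anywhere, so unparsed text is buffered.
async function* parseSse(chunks: AsyncIterable<string>): AsyncGenerator<StreamEvent> {
  let buffer = "";
  for await (const chunk of chunks) {
    buffer += chunk;
    let boundary = buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const block = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      let type = "message"; // SSE default when no event: line is present
      const dataLines: string[] = [];
      for (const line of block.split("\n")) {
        if (line.startsWith("event:")) type = line.slice(6).trim();
        else if (line.startsWith("data:")) dataLines.push(line.slice(5).trimStart());
      }
      if (KNOWN_TYPES.includes(type)) {
        yield { type: type as StreamEventType, data: dataLines.join("\n") };
      }
      boundary = buffer.indexOf("\n\n");
    }
  }
}

// Concatenates content/complete payloads and surfaces error events.
async function collectText(chunks: AsyncIterable<string>): Promise<string> {
  let text = "";
  for await (const event of parseSse(chunks)) {
    if (event.type === "content" || event.type === "complete") text += event.data;
    if (event.type === "error") throw new Error(event.data);
  }
  return text;
}

// Simulated network chunks that split an event mid-line.
async function* demoChunks(): AsyncGenerator<string> {
  yield "event: content\ndata: Hel";
  yield "lo\n\nevent: done\n";
  yield 'data: {"record_id":"123"}\n\n';
}
```

Calling `collectText(demoChunks())` resolves to `"Hello"`: the split `content` event is reassembled, and the `done` event is parsed but adds no text.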
Batch Extraction
Process multiple documents in a single request (max 50 files). All files share the same extraction options.

| Parameter | Type | Required | Description |
|---|---|---|---|
| files | Uploadable[] | Yes | List of files to process (max 50). |
| output_format | string | Yes | Output format(s) applied to all files. |
| custom_instructions | string | No | Custom extraction instructions (max 8,000 chars). |
| prompt_mode | string | No | "append" (default) or "replace". |
| json_options | string | No | JSON extraction mode. |
| csv_options | string | No | CSV extraction options. |
| include_metadata | string | No | Comma-separated metadata types. |
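Because a batch request accepts at most 50 files, a larger set has to be split into several requests. A minimal sketch of that split follows. `chunkForBatch` is a hypothetical helper, not part of the SDK, and it works on any array so it can hold paths, Buffers, or File objects.

```typescript
// Limit stated in the table above.
const MAX_BATCH_FILES = 50;

// Splits a list into consecutive groups of at most `size` items.
function chunkForBatch<T>(files: T[], size: number = MAX_BATCH_FILES): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < files.length; i += size) {
    batches.push(files.slice(i, i + size));
  }
  return batches;
}
```

With 120 files this yields three groups of 50, 50, and 20, each of which can be sent as its own batch request with the same extraction options.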
Document Classification
Classify a document into predefined categories. Each page is classified individually with a category, confidence score (0-100), and reasoning.

| Parameter | Type | Required | Description |
|---|---|---|---|
| file | Uploadable | Yes | File to classify (PDF, PNG, JPG, JPEG, TIFF, BMP, WebP). |
| categories | string | Yes | JSON array of category objects: [{"name": "Category", "description": "Optional description"}]. |
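Since `categories` is a string containing a JSON array rather than an array itself, it helps to build it from typed objects and serialize once. The sketch below does that. `serializeCategories` is a hypothetical helper, and rejecting empty lists and blank names is an assumption about what the API would accept.

```typescript
// Shape taken from the example in the table: name plus optional description.
interface Category {
  name: string;
  description?: string;
}

// Produces the JSON string expected by the categories parameter.
function serializeCategories(categories: Category[]): string {
  // Assumption: an empty list is not useful for classification.
  if (categories.length === 0) throw new Error("At least one category is required");
  const cleaned = categories.map((category) => {
    const name = category.name.trim();
    if (name === "") throw new Error("Category name must not be empty");
    // Omit description entirely when it is missing or empty.
    return category.description ? { name, description: category.description } : { name };
  });
  return JSON.stringify(cleaned);
}
```

For instance, `serializeCategories([{ name: "Invoice", description: "Bills" }, { name: "Receipt" }])` returns `[{"name":"Invoice","description":"Bills"},{"name":"Receipt"}]`.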
Batch Classification
Classify multiple documents at once (max 50 files).

| Parameter | Type | Required | Description |
|---|---|---|---|
| files | Uploadable[] | Yes | Files to classify (max 50). |
| categories | string | Yes | JSON array of category objects (max 50 categories). |
Retrieve Results
Get the status and result of a previous extraction by record ID.

| Parameter | Type | Required | Description |
|---|---|---|---|
| record_id | string | Yes | Extraction job ID (numeric string returned by the API). |
| include_content | boolean | No | Include full extracted content (default: true). Set to false to retrieve only status and metadata. |
List Results
List all extraction results for the authenticated user (paginated).

| Parameter | Type | Required | Description |
|---|---|---|---|
| page | number | No | Page number (default: 1, minimum: 1). |
| page_size | number | No | Results per page (default: 20, range: 1-100). |
| sort_by | string | No | Sort field. One of: created_at (default), updated_at, original_filename, file_size, processing_status. |
| sort_order | string | No | Sort direction: "desc" (default) or "asc". |
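The `page` and `page_size` parameters above are enough to walk every result with an async generator. The sketch below shows the idea in generic form. It is not the SDK's pagination helper: `iterateAll` and `FetchPage` are hypothetical names, the caller supplies the function that fetches one page, and stopping on a short page is an assumption, because this page does not describe the list response (for example, whether it reports a total count).

```typescript
// The two paging parameters from the table.
interface ListParams {
  page: number;
  page_size: number;
}

// Caller-supplied function that returns the items of one page.
type FetchPage<T> = (params: ListParams) => Promise<T[]>;

// Yields every item across pages, starting at page 1.
async function* iterateAll<T>(fetchPage: FetchPage<T>, pageSize = 20): AsyncGenerator<T> {
  if (!Number.isInteger(pageSize) || pageSize < 1 || pageSize > 100) {
    throw new RangeError("page_size must be an integer between 1 and 100");
  }
  for (let page = 1; ; page++) {
    const items = await fetchPage({ page, page_size: pageSize });
    yield* items;
    // Assumption: a page shorter than page_size is the last one.
    if (items.length < pageSize) return;
  }
}
```

A consumer can then write `for await (const record of iterateAll(fetchPage)) { ... }` and break out early without fetching the remaining pages.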
Error Handling
The SDK throws typed exceptions for API errors.

| Exception | Description |
|---|---|
| APIConnectionError | Network connectivity issues |
| APITimeoutError | Request exceeded timeout |
| APIStatusError | API returned an error status code |
| AuthenticationError | Invalid or missing API key (401) |
| PermissionDeniedError | Insufficient permissions (403) |
| NotFoundError | Resource not found (404) |
| RateLimitError | Too many requests (429) |
| InternalServerError | Server error (500+) |
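The status-code mapping in the table can be written out as a small lookup, which is handy in logs and tests. This sketch works on exception names only and does not import the SDK's classes. `errorNameForStatus` is a hypothetical helper, and treating `APIStatusError` as the fallback for other 4xx codes is an assumption based on its description.

```typescript
// Names of the status-related exceptions from the table above.
type StatusErrorName =
  | "AuthenticationError"
  | "PermissionDeniedError"
  | "NotFoundError"
  | "RateLimitError"
  | "InternalServerError"
  | "APIStatusError";

// Returns the exception name that corresponds to an HTTP status code,
// or null for codes below 400 (not an error).
function errorNameForStatus(status: number): StatusErrorName | null {
  if (status < 400) return null;
  if (status === 401) return "AuthenticationError";
  if (status === 403) return "PermissionDeniedError";
  if (status === 404) return "NotFoundError";
  if (status === 429) return "RateLimitError";
  if (status >= 500) return "InternalServerError";
  return "APIStatusError"; // assumed fallback for any other 4xx code
}
```

`APIConnectionError` and `APITimeoutError` are absent on purpose: they describe failures where no HTTP status was received.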
Configuration
Custom Base URL
For on-premise deployments, point the client to your own instance.
Timeouts
Request timeouts are configured in milliseconds.
Retries
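Exponential backoff, the retry strategy this section refers to, doubles the wait after each failed attempt up to a cap. The sketch below illustrates the technique in isolation. It is not the SDK's implementation: `backoffDelayMs` and `withRetries` are hypothetical names, the 500 ms base and 8,000 ms cap are arbitrary example values, and jitter is left out for brevity.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `operation`, retrying up to `maxRetries` times after failures.
// `sleep` is injectable so tests can run without real waiting.
async function withRetries<T>(
  operation: () => Promise<T>,
  maxRetries = 2,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Out of retries: surface the last error to the caller.
      if (attempt >= maxRetries) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

With `maxRetries = 2` an operation runs at most three times, waiting 500 ms and then 1,000 ms between attempts. A production version would normally retry only transient failures, such as timeouts or 429 and 5xx responses.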
The SDK automatically retries failed requests with exponential backoff. The maximum number of retries is configurable.
Next Steps
API Reference
Explore the complete API documentation with interactive examples.
Examples
See full code examples for every endpoint and output format.