Document Processing
The Nanonets Node.js SDK provides methods for processing documents. This guide covers each of them, mirroring the Nanonets API: uploading, retrieving, listing, and deleting documents, and fetching their extracted fields, tables, and original files.
Setup
Minimum Node.js version required: 14.0.0
Install the Nanonets Node.js SDK using npm:
npm install nanonets
Document Schema
The document object returned by the API contains the following fields:
{
  "document_id": "string",
  "status": "string", // "success", "pending", "failure"
  "uploaded_at": "string",
  "metadata": "string | object",
  "original_document_name": "string",
  "raw_document_url": "string",
  "verification_status": "string", // "success", "failed"
  "verification_stage": "string",
  "verification_message": "string",
  "assigned_reviewers": ["string"],
  "pages": [
    {
      "page_id": "string",
      "page_number": 1,
      "image_url": "string",
      "data": {
        "fields": {
          "invoice_number": [
            {
              "field_data_id": "string",
              "value": "string",
              "confidence": 0.98,
              "bbox": [100, 200, 300, 250],
              "verification_status": "string",
              "verification_message": "string",
              "is_moderated": false
            }
          ]
        },
        "tables": [
          {
            "table_id": "string",
            "bbox": [100, 300, 800, 600],
            "cells": [
              {
                "cell_id": "string",
                "row": 0,
                "col": 0,
                "header": "item_description",
                "text": "string",
                "bbox": [100, 330, 300, 360],
                "verification_status": "string",
                "verification_message": "string",
                "is_moderated": false
              }
            ]
          }
        ]
      }
    }
  ]
}
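The nested layout above (pages, then named fields, then a list of candidate values per field) can be flattened for easier inspection. The following is a minimal sketch that relies only on the schema shown above; the helper names collectFields and confidentFields, the sample object, and the 0.9 threshold are illustrative assumptions, not part of the SDK.

```javascript
// Sketch: flatten every extracted field value in a document into one list.
// Assumes `doc` follows the document schema shown above.
function collectFields(doc) {
  const out = [];
  for (const page of doc.pages ?? []) {
    const fields = page.data?.fields ?? {};
    for (const [name, entries] of Object.entries(fields)) {
      for (const entry of entries) {
        out.push({
          page: page.page_number,
          name,
          value: entry.value,
          confidence: entry.confidence,
        });
      }
    }
  }
  return out;
}

// Keep only values at or above a caller-chosen confidence threshold.
function confidentFields(doc, minConfidence) {
  return collectFields(doc).filter((f) => f.confidence >= minConfidence);
}

// Hypothetical sample data shaped like the schema, for illustration only.
const sampleDoc = {
  document_id: "document_123",
  status: "success",
  pages: [
    {
      page_number: 1,
      data: {
        fields: {
          invoice_number: [{ value: "INV-001", confidence: 0.98 }],
          total: [{ value: "100.00", confidence: 0.55 }],
        },
        tables: [],
      },
    },
  ],
};

const trusted = confidentFields(sampleDoc, 0.9);
console.log(trusted.map((f) => `${f.name}=${f.value}`));
```

The right threshold depends on the workflow; values below it are candidates for manual review rather than automatic use.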
Upload Document
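Uploads a document for processing. Provide either file (a local path) or url (a remote location). The async option selects asynchronous processing, and metadata attaches custom key-value pairs to the document.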
// Upload from file path
const result = await client.documents.upload("workflow_123", {
  file: "/path/to/document.pdf",
  async: false,
  metadata: {
    customer_id: "12345",
    document_type: "invoice",
    department: "finance"
  }
});
// Upload from URL
const result2 = await client.documents.upload("workflow_123", {
  url: "https://example.com/invoice.pdf",
  async: false,
  metadata: {
    customer_id: "12345",
    document_type: "invoice",
    department: "finance"
  }
});
Get Document
Retrieves the processing results for a specific document.
const document = await client.documents.get("workflow_123", "document_123");
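Because a document's status can be "pending", a caller that uploads asynchronously may need to poll until processing finishes. Here is a minimal sketch of such a loop. The waitForDocument helper, its intervalMs and maxAttempts options, and their defaults are assumptions for illustration; the function takes any caller-supplied async fetcher rather than depending on SDK internals.

```javascript
// Sketch: poll until a document leaves the "pending" state.
// `fetchDocument` is a caller-supplied async function (hypothetical), e.g.
//   () => client.documents.get("workflow_123", "document_123")
async function waitForDocument(
  fetchDocument,
  { intervalMs = 2000, maxAttempts = 30 } = {}
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = await fetchDocument();
    if (doc.status !== "pending") {
      return doc; // status is now "success" or "failure"
    }
    if (attempt < maxAttempts) {
      // Wait before asking again so the API is not hammered.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Document still pending after ${maxAttempts} attempts`);
}
```

The caller still has to inspect the returned status, since the loop stops on "failure" as well as on "success".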
List Documents
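Retrieves a paginated list of the documents in a specific workflow. The page and limit options select which page to return and how many documents it contains.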
const documents = await client.documents.list("workflow_123", { page: 1, limit: 10 });
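To gather every document rather than a single page, the page number can be advanced until a short page comes back. This is a sketch under two loudly stated assumptions: that each call resolves to an array of documents, and that a page shorter than limit marks the end. Neither is confirmed by this guide, so check the actual response shape first. The listAll helper and its fetchPage callback are hypothetical names.

```javascript
// Sketch: walk every page of a paginated listing.
// `fetchPage(page, limit)` is a caller-supplied async function (hypothetical),
// ASSUMED here to resolve to an array, e.g.
//   (page, limit) => client.documents.list("workflow_123", { page, limit })
async function listAll(fetchPage, limit = 10) {
  const all = [];
  for (let page = 1; ; page++) {
    const batch = await fetchPage(page, limit);
    all.push(...batch);
    if (batch.length < limit) {
      break; // short (or empty) page: assumed to be the last one
    }
  }
  return all;
}
```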
Delete Document
Removes a document from the workflow.
await client.documents.delete("workflow_123", "document_123");
Get Document Fields
Retrieves the extracted fields from a document.
const fields = await client.documents.getFields("workflow_123", "document_123");
Get Document Tables
Retrieves the extracted tables from a document.
const tables = await client.documents.getTables("workflow_123", "document_123");
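Each table in the document schema is a flat list of cells carrying row and col indices, so a little reshaping helps before display or export. The sketch below rebuilds a two-dimensional grid from one table object. It assumes the table matches the schema's tables entry; whether getTables returns exactly that shape is not confirmed here, and tableToRows is a hypothetical helper name.

```javascript
// Sketch: turn a table's flat `cells` list into a 2D array of cell text.
// Assumes `table` looks like one entry of `tables` in the document schema.
function tableToRows(table) {
  const cells = table.cells ?? [];
  // Grid size is one more than the largest index seen (0 when there are no cells).
  const rowCount = Math.max(-1, ...cells.map((c) => c.row)) + 1;
  const colCount = Math.max(-1, ...cells.map((c) => c.col)) + 1;
  // Pre-fill with empty strings so missing cells do not leave holes.
  const rows = Array.from({ length: rowCount }, () => Array(colCount).fill(""));
  for (const cell of cells) {
    rows[cell.row][cell.col] = cell.text;
  }
  return rows;
}

// Hypothetical sample table, for illustration only.
const sampleTable = {
  table_id: "table_1",
  cells: [
    { row: 0, col: 0, header: "item_description", text: "Widget" },
    { row: 0, col: 1, header: "quantity", text: "2" },
    { row: 1, col: 1, header: "quantity", text: "5" },
  ],
};

console.log(tableToRows(sampleTable));
```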
Get Document Original File
Downloads the original document file.
const file = await client.documents.getOriginalFile("workflow_123", "document_123");
Error Handling & Common Scenarios
API status codes:
- 200 OK: Request successful
- 201 Created: Document uploaded successfully
- 400 Bad Request: Invalid request parameters or unsupported file type
- 401 Unauthorized: Invalid/missing API key
- 404 Not Found: Workflow or document not found
- 413 Payload Too Large: File size exceeds limit
- 500 Internal Server Error: Server-side error
Common error scenarios:
- File upload issues (unsupported type, too large, corrupted)
- Processing errors (timeout, unreadable content, failure)
- Field/table header issues (invalid/duplicate names)
const { NanonetsError, AuthenticationError, ValidationError } = require('nanonets');
try {
  const result = await client.documents.upload("workflow_123", { /* ... */ });
} catch (error) {
  // Check the more specific error classes first, then the general one.
  if (error instanceof AuthenticationError) {
    console.error("Authentication failed:", error.message);
  } else if (error instanceof ValidationError) {
    console.error("Invalid input:", error.message);
  } else if (error instanceof NanonetsError) {
    console.error("An error occurred:", error.message);
  } else {
    // Not an SDK error: rethrow so it is not silently swallowed.
    throw error;
  }
}
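The status codes listed above can also be kept in a small lookup table for logging or for deciding how to react. This sketch uses only the codes and descriptions from that list. The STATUS_INFO table and describeStatus helper are hypothetical names, and treating only 500 as retryable is an assumption, not documented behavior. How a status code is obtained from an SDK error is not covered by this guide.

```javascript
// Sketch: lookup table built from the status codes listed above.
// The `retryable` flags are an ASSUMPTION (only server-side errors retried).
const STATUS_INFO = {
  200: { meaning: "Request successful", ok: true, retryable: false },
  201: { meaning: "Document uploaded successfully", ok: true, retryable: false },
  400: { meaning: "Invalid request parameters or unsupported file type", ok: false, retryable: false },
  401: { meaning: "Invalid or missing API key", ok: false, retryable: false },
  404: { meaning: "Workflow or document not found", ok: false, retryable: false },
  413: { meaning: "File size exceeds limit", ok: false, retryable: false },
  500: { meaning: "Server-side error", ok: false, retryable: true },
};

// Returns the entry for a code, or a conservative default for unlisted codes.
function describeStatus(code) {
  return STATUS_INFO[code] ?? { meaning: "Unlisted status code", ok: false, retryable: false };
}

console.log(describeStatus(413).meaning);
```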
Best Practices
- Use async for large files or batch processing
- Include relevant metadata for better tracking
- Validate file types before upload
- Check confidence scores before using extracted data
- Handle both sync and async responses appropriately
- Implement retry logic for failed processing
- Delete processed documents when no longer needed
- Monitor storage usage and implement retention policies
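The retry recommendation above can be sketched as a generic wrapper with exponential backoff. The withRetry helper and its retries, baseDelayMs, and shouldRetry options are assumptions for illustration and are not SDK features. The wrapper accepts any async operation, so it makes no claims about SDK internals.

```javascript
// Sketch: retry an async operation with exponential backoff.
// `operation` is any caller-supplied async function (hypothetical), e.g.
//   () => client.documents.upload("workflow_123", { file: "/path/to/document.pdf" })
async function withRetry(
  operation,
  { retries = 3, baseDelayMs = 500, shouldRetry = () => true } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Stop when attempts are exhausted or the caller says not to retry.
      if (attempt === retries || !shouldRetry(error)) {
        break;
      }
      // Delay doubles on each attempt: base, 2x base, 4x base, ...
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

Passing a shouldRetry predicate keeps errors that cannot succeed on a second try, such as invalid input or a bad API key, from being retried needlessly.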
For more detailed information about specific features, see: