Introduction
Welcome to the Nanonets API documentation. This guide will help you integrate Nanonets into your applications.
Quick Links
- API Reference - Authentication
- API Reference - Workflow Management
- API Reference - Document Processing
- Error Handling
What is Nanonets?
Nanonets provides an AI-driven Intelligent Document Processing API that transforms unstructured documents into structured data. Our advanced OCR and document data extraction capabilities enable you to:
- Extract structured data from various document types (invoices, receipts, forms, etc.)
- Convert unstructured text into organized, machine-readable formats
- Process and analyze document content with high accuracy
- Automate document workflows and data entry tasks
Key Features
- Advanced OCR & Data Extraction: Extract text, fields, and tables from documents with high accuracy
- Unstructured to Structured Data: Transform raw document content into organized, structured formats
- Workflow Automation: Approve or reject extracted results and assign files for review
- External Integrations: Seamlessly import documents from various sources and export data to business applications
Getting Started
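The steps below create a workflow, configure the fields and tables to extract, and process a document, first with the Nanonets SDK and then with the REST API directly.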
Prerequisites
- A Nanonets account
- An API key (get it from http://app.nanonets.com/#/keys)
- Basic knowledge of REST APIs
- Your preferred programming language (Python, JavaScript, etc.)
Quick Start with the Nanonets SDK
1. Create Instant Learning Workflow
- Python
- Node.js
from nanonets import Nanonets

# Initialize client
client = Nanonets(api_key='your_api_key')

# Create instant learning workflow
workflow = client.workflows.create(
    name="Custom Document Workflow",
    description="Extract data from custom documents",
    workflow_type=""  # Empty string for instant learning workflow
)
const { Nanonets } = require('nanonets');

// Initialize client
const client = new Nanonets({
  apiKey: 'your_api_key'
});

// Create instant learning workflow
// (await must run inside an async function or an ES module)
const workflow = await client.workflows.create({
  name: "Custom Document Workflow",
  description: "Extract data from custom documents",
  workflowType: "" // Empty string for instant learning workflow
});
2. Configure Fields and Tables to Extract
- Python
- Node.js
# Configure fields to extract
workflow.configure_fields([
    {"name": "Invoice Number", "type": "text"},
    {"name": "Total Amount", "type": "number"},
    {"name": "Invoice Date", "type": "date"},
])

# Configure table headers
workflow.configure_table_headers([
    {"name": "Item Description", "type": "text"},
    {"name": "Quantity", "type": "number"},
    {"name": "Unit Price", "type": "number"},
])
// Configure fields to extract
await workflow.configureFields([
  { name: "Invoice Number", type: "text" },
  { name: "Total Amount", type: "number" },
  { name: "Invoice Date", type: "date" }
]);

// Configure table headers
await workflow.configureTableHeaders([
  { name: "Item Description", type: "text" },
  { name: "Quantity", type: "number" },
  { name: "Unit Price", type: "number" }
]);
3. Process Document
- Python
- Node.js
# Process a document
result = workflow.process_document(
    file_path="invoice.pdf",
    async_mode=True
)

# Get results
if result.status == "completed":
    # Access extracted fields with error handling
    try:
        invoice_number = result.data['fields']['Invoice Number'][0]['value']
    except (KeyError, IndexError):
        invoice_number = None
        print("Invoice Number not found in the document")

    try:
        total_amount = result.data['fields']['Total Amount'][0]['value']
    except (KeyError, IndexError):
        total_amount = None
        print("Total Amount not found in the document")

    try:
        invoice_date = result.data['fields']['Invoice Date'][0]['value']
    except (KeyError, IndexError):
        invoice_date = None
        print("Invoice Date not found in the document")

    # Access extracted tables with error handling
    try:
        for table in result.data['tables']:
            for cell in table['cells']:
                print(f"Row {cell['row']}, Col {cell['col']}: {cell['text']}")
    except (KeyError, AttributeError):
        print("No tables found in the document")
// Process a document
const result = await workflow.processDocument({
  filePath: "invoice.pdf",
  asyncMode: true
});

// Get results
if (result.status === "completed") {
  // Access extracted fields with error handling
  let invoiceNumber, totalAmount, invoiceDate;

  try {
    invoiceNumber = result.data.fields["Invoice Number"][0].value;
  } catch (error) {
    invoiceNumber = null;
    console.log("Invoice Number not found in the document");
  }

  try {
    totalAmount = result.data.fields["Total Amount"][0].value;
  } catch (error) {
    totalAmount = null;
    console.log("Total Amount not found in the document");
  }

  try {
    invoiceDate = result.data.fields["Invoice Date"][0].value;
  } catch (error) {
    invoiceDate = null;
    console.log("Invoice Date not found in the document");
  }

  // Access extracted tables with error handling
  try {
    for (const table of result.data.tables) {
      for (const cell of table.cells) {
        console.log(`Row ${cell.row}, Col ${cell.col}: ${cell.text}`);
      }
    }
  } catch (error) {
    console.log("No tables found in the document");
  }
}
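The repeated try/except blocks in the examples above can be folded into one lookup helper. The following is a minimal sketch, assuming the result data has the shape used in those examples (a `fields` mapping from a label to a list of entries that each carry a `value`). The name `first_field_value` is hypothetical and not part of the Nanonets SDK, and the sample values are invented for illustration.

```python
from typing import Any, Optional


def first_field_value(data: dict, label: str) -> Optional[Any]:
    """Return the first extracted value for `label`, or None if it is missing."""
    entries = (data.get("fields") or {}).get(label) or []
    if not entries:
        return None
    return entries[0].get("value")


# Sample data shaped like the examples above (values are invented).
sample = {"fields": {"Invoice Number": [{"value": "INV-001"}], "Total Amount": []}}

print(first_field_value(sample, "Invoice Number"))  # label present with one entry
print(first_field_value(sample, "Total Amount"))    # label present but empty
print(first_field_value(sample, "Invoice Date"))    # label absent
```

Returning `None` instead of raising keeps the calling code flat: the caller decides once, per field, what a missing value means.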
Quick Start with the REST API
1. Create Instant Learning Workflow
- Python
- Node.js
- cURL
import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
url = "https://app.nanonets.com/api/v4/workflows"

# Create instant learning workflow
payload = {
    "name": "Custom Document Workflow",
    "description": "Extract data from custom documents",
    "workflow_type": ""  # Empty string for instant learning workflow
}

response = requests.post(url, json=payload, auth=HTTPBasicAuth(API_KEY, ''))
response.raise_for_status()  # Stop here if the API returned an error status
workflow = response.json()
print(f"Created workflow with ID: {workflow['id']}")
const axios = require('axios');

const API_KEY = 'YOUR_API_KEY';
const url = "https://app.nanonets.com/api/v4/workflows";

// Create instant learning workflow
const payload = {
  name: "Custom Document Workflow",
  description: "Extract data from custom documents",
  workflow_type: "" // Empty string for instant learning workflow
};

axios.post(url, payload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
  .then(response => {
    const workflow = response.data;
    console.log(`Created workflow with ID: ${workflow.id}`);
  })
  .catch(error => {
    console.error(error);
  });
curl -X POST \
  "https://app.nanonets.com/api/v4/workflows" \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Custom Document Workflow",
    "description": "Extract data from custom documents",
    "workflow_type": ""
  }'
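All three variants above authenticate with HTTP Basic auth, passing the API key as the username and an empty password. As a minimal sketch using only the Python standard library, the header that `-u "YOUR_API_KEY:"` or `HTTPBasicAuth(API_KEY, '')` produces can be built by hand as shown below. `basic_auth_header` is an illustrative helper, not part of any Nanonets library.

```python
import base64


def basic_auth_header(api_key: str) -> dict:
    """Build the Authorization header for Basic auth with an empty password."""
    # Basic auth encodes "username:password"; the password here is empty.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


print(basic_auth_header("YOUR_API_KEY"))
```

This is mainly useful with HTTP clients that have no built-in Basic auth option. Note the trailing colon after the key: without it the encoded credentials are malformed.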
2. Configure Fields and Tables to Extract
- Python
- Node.js
- cURL
import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = workflow['id']  # From previous step
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/fields"

# Configure fields and tables to extract
fields_payload = {
    "fields": [
        {"name": "Invoice Number", "type": "text"},
        {"name": "Total Amount", "type": "number"},
        {"name": "Invoice Date", "type": "date"},
    ],
    "tables": [
        {
            "name": "Line Items",
            "headers": [
                {"name": "Item Description", "type": "text"},
                {"name": "Quantity", "type": "number"},
                {"name": "Unit Price", "type": "number"},
            ],
        }
    ],
}

response = requests.put(url, json=fields_payload, auth=HTTPBasicAuth(API_KEY, ''))
response.raise_for_status()  # Only report success if the API accepted the request
print("Fields and tables configured successfully")
const axios = require('axios');

const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = workflow.id; // From previous step
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/fields`;

// Configure fields and tables to extract
const fieldsPayload = {
  fields: [
    { name: "Invoice Number", type: "text" },
    { name: "Total Amount", type: "number" },
    { name: "Invoice Date", type: "date" }
  ],
  tables: [
    {
      name: "Line Items",
      headers: [
        { name: "Item Description", type: "text" },
        { name: "Quantity", type: "number" },
        { name: "Unit Price", type: "number" }
      ]
    }
  ]
};

axios.put(url, fieldsPayload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
  .then(() => {
    console.log("Fields and tables configured successfully");
  })
  .catch(error => {
    console.error(error);
  });
curl -X PUT \
  "https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/fields" \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "fields": [
      {"name": "Invoice Number", "type": "text"},
      {"name": "Total Amount", "type": "number"},
      {"name": "Invoice Date", "type": "date"}
    ],
    "tables": [
      {
        "name": "Line Items",
        "headers": [
          {"name": "Item Description", "type": "text"},
          {"name": "Quantity", "type": "number"},
          {"name": "Unit Price", "type": "number"}
        ]
      }
    ]
  }'
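The request body above repeats the same `name`/`type` structure for every field and table header. One way to keep longer configurations readable is to build the body from compact pairs. The sketch below assumes only the payload shape shown in the examples above; `build_fields_payload` is a hypothetical helper, not part of the Nanonets API.

```python
from typing import Dict, List, Tuple

Column = Tuple[str, str]  # (name, type), e.g. ("Quantity", "number")


def build_fields_payload(fields: List[Column], tables: Dict[str, List[Column]]) -> dict:
    """Expand (name, type) pairs into the fields/tables structure shown above."""

    def as_dicts(columns: List[Column]) -> List[dict]:
        return [{"name": name, "type": kind} for name, kind in columns]

    return {
        "fields": as_dicts(fields),
        "tables": [
            {"name": table_name, "headers": as_dicts(headers)}
            for table_name, headers in tables.items()
        ],
    }


payload = build_fields_payload(
    fields=[("Invoice Number", "text"), ("Total Amount", "number"), ("Invoice Date", "date")],
    tables={"Line Items": [("Item Description", "text"), ("Quantity", "number"), ("Unit Price", "number")]},
)
print(payload["tables"][0]["headers"][1])
```

The resulting dictionary can be passed as the `json=` argument of the `requests.put` call in the Python example.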
3. Process Document
- Python
- Node.js
- cURL
import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = workflow['id']  # From previous step
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/documents"

# Process a document (the with block closes the file after the upload)
with open('invoice.pdf', 'rb') as document:
    response = requests.post(url, files={'file': document}, auth=HTTPBasicAuth(API_KEY, ''))
response.raise_for_status()  # Stop here if the API returned an error status
result = response.json()

# Get results
if result['status'] == 'completed':
    # Access extracted fields
    invoice_number = result['data']['fields']['Invoice Number'][0]['value']
    total_amount = result['data']['fields']['Total Amount'][0]['value']
    invoice_date = result['data']['fields']['Invoice Date'][0]['value']
    print(f"Invoice Number: {invoice_number}")
    print(f"Total Amount: {total_amount}")
    print(f"Invoice Date: {invoice_date}")

    # Access extracted tables
    for table in result['data']['tables']:
        print(f"\nTable: {table['name']}")
        for cell in table['cells']:
            print(f"Row {cell['row']}, Col {cell['col']}: {cell['text']}")
const axios = require('axios');
const FormData = require('form-data');
const fs = require('fs');

const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = workflow.id; // From previous step
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/documents`;

// Process a document
const formData = new FormData();
formData.append('file', fs.createReadStream('invoice.pdf'));

axios.post(url, formData, {
  auth: {
    username: API_KEY,
    password: ''
  },
  headers: {
    ...formData.getHeaders()
  }
})
  .then(response => {
    const result = response.data;
    if (result.status === 'completed') {
      // Access extracted fields
      const invoiceNumber = result.data.fields['Invoice Number'][0].value;
      const totalAmount = result.data.fields['Total Amount'][0].value;
      const invoiceDate = result.data.fields['Invoice Date'][0].value;
      console.log(`Invoice Number: ${invoiceNumber}`);
      console.log(`Total Amount: ${totalAmount}`);
      console.log(`Invoice Date: ${invoiceDate}`);

      // Access extracted tables
      for (const table of result.data.tables) {
        console.log(`\nTable: ${table.name}`);
        for (const cell of table.cells) {
          console.log(`Row ${cell.row}, Col ${cell.col}: ${cell.text}`);
        }
      }
    }
  })
  .catch(error => {
    console.error(error);
  });
curl -X POST \
  "https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/documents" \
  -u "YOUR_API_KEY:" \
  -F "file=@invoice.pdf"
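The examples above print table results one cell at a time. For downstream use it is often handier to regroup the cells into rows. The sketch below assumes each cell carries `row`, `col`, and `text` keys, as in those examples. Whether the indices start at 0 or 1 is not stated here, so the code only relies on their ordering. `cells_to_rows` is a hypothetical helper and the sample cells are invented.

```python
from collections import defaultdict
from typing import Dict, List


def cells_to_rows(cells: List[dict]) -> List[List[str]]:
    """Group a flat list of cells into rows, ordered by row then column index."""
    by_row: Dict[int, Dict[int, str]] = defaultdict(dict)
    for cell in cells:
        by_row[cell["row"]][cell["col"]] = cell["text"]
    return [
        [columns[col] for col in sorted(columns)]
        for _, columns in sorted(by_row.items())
    ]


# Invented sample cells, deliberately out of order.
cells = [
    {"row": 1, "col": 1, "text": "2"},
    {"row": 0, "col": 0, "text": "Item Description"},
    {"row": 1, "col": 0, "text": "Widget"},
    {"row": 0, "col": 1, "text": "Quantity"},
]

for row in cells_to_rows(cells):
    print(row)
```

Rows built this way can be written directly with `csv.writer(...).writerows(...)` from the standard library.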
Best Practices
Error Handling
- Always check response status codes
- Implement retry logic for rate limits
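A retry loop with exponential backoff is one common way to handle rate limits. The sketch below assumes rate limiting is signalled with HTTP status 429, the standard code for "Too Many Requests". This guide does not state which status Nanonets returns, so treat that as an assumption. `call_with_retry` and `FakeResponse` are hypothetical names used only for illustration.

```python
import time
from typing import Any, Callable


def call_with_retry(send: Callable[[], Any], max_attempts: int = 5,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Any:
    """Call `send` until it returns a non-429 response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:  # Assumed rate-limit status
            return response
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # Delay doubles after each attempt
    return response  # Still rate limited after the final attempt


class FakeResponse:
    """Stand-in for a real HTTP response, used only for this demonstration."""

    def __init__(self, status_code: int) -> None:
        self.status_code = status_code


responses = iter([FakeResponse(429), FakeResponse(200)])
final = call_with_retry(lambda: next(responses), base_delay=0.01)
print(final.status_code)
```

With `requests`, the `send` argument can be a lambda wrapping the actual call, for example `lambda: requests.post(url, json=payload, auth=HTTPBasicAuth(API_KEY, ''))`. The caller should still check the status code of the returned response.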
Security
- Store API keys securely
- Use environment variables
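As a minimal sketch of the environment-variable approach, the key can be read at startup instead of being written into source code. The variable name `NANONETS_API_KEY` is an assumption chosen for this example, not a name the API or SDK is known to require.

```python
import os


def load_api_key(env_var: str = "NANONETS_API_KEY") -> str:
    """Read the API key from the environment and fail loudly if it is missing."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return api_key
```

The returned value can replace the hard-coded `'YOUR_API_KEY'` placeholder, for example `API_KEY = load_api_key()`.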
Performance
- Use async processing for large files
- Monitor API usage
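With async processing, the result is typically not ready when the upload call returns, so the client has to check back later. The sketch below shows a generic polling loop under stated assumptions: `fetch` is a caller-supplied function that returns the latest result as a dictionary, and a finished result reports `"status": "completed"` as in the examples above. How results are actually retrieved is not covered in this guide, so `wait_until_completed` is hypothetical.

```python
import time
from typing import Callable


def wait_until_completed(fetch: Callable[[], dict], timeout: float = 300.0,
                         interval: float = 5.0,
                         sleep: Callable[[float], None] = time.sleep,
                         clock: Callable[[], float] = time.monotonic) -> dict:
    """Poll `fetch` until its result reports status "completed" or time runs out."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result.get("status") == "completed":
            return result
        # Give up if the next wait would end after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError("document was not processed before the timeout")
        sleep(interval)
```

Polling at a fixed interval keeps the number of status requests predictable, which also helps when monitoring API usage.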
Next Steps
- Learn about Authentication
- Explore API Endpoints