Introduction
Welcome to the Nanonets API documentation. This guide will help you integrate Nanonets into your applications.
Quick Links
- API Reference - Authentication
- API Reference - Workflow Management
- API Reference - Document Processing
- Error Handling
What is Nanonets?
Nanonets provides an AI-driven Intelligent Document Processing API that transforms unstructured documents into structured data. Our advanced OCR and document data extraction capabilities enable you to:
- Extract structured data from various document types (invoices, receipts, forms, etc.)
- Convert unstructured text into organized, machine-readable formats
- Process and analyze document content with high accuracy
- Automate document workflows and data entry tasks
Key Features
- Advanced OCR & Data Extraction: Extract text, fields, and tables from documents with high accuracy
- Unstructured to Structured Data: Transform raw document content into organized, structured formats
- Workflow Automation: Approve or reject extracted results and assign files for review
- External Integrations: Seamlessly import documents from various sources and export data to business applications
Getting Started
The steps below create a workflow, configure the fields and tables to extract, and process a first document.
Prerequisites
- A Nanonets account
- An API key (get it from https://app.nanonets.com/#/keys)
- Basic knowledge of REST APIs
- Your preferred programming language (Python, JavaScript, etc.)
Quick Start with the Nanonets SDK
1. Create Instant Learning Workflow
- Python
from nanonets import Nanonets
# Initialize client
client = Nanonets(api_key='your_api_key')
# Create instant learning workflow
workflow = client.workflows.create(
    name="Custom Document Workflow",
    description="Extract data from custom documents",
    workflow_type=""  # Empty string for instant learning workflow
)
- Node.js
const { Nanonets } = require('nanonets');
// Initialize client
const client = new Nanonets({
  apiKey: 'your_api_key'
});
// Create instant learning workflow
// (run inside an async function; CommonJS modules do not support top-level await)
const workflow = await client.workflows.create({
  name: "Custom Document Workflow",
  description: "Extract data from custom documents",
  workflowType: "" // Empty string for instant learning workflow
});
2. Configure Fields and Tables to Extract
- Python
import requests
from requests.auth import HTTPBasicAuth
API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = workflow['id']  # From previous step
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/fields"
payload = {
    "fields": [
        {"name": "invoice_number"},
        {"name": "total_amount"},
        {"name": "invoice_date"}
    ],
    "table_headers": [
        {"name": "item_description"},
        {"name": "quantity"},
        {"name": "unit_price"},
        {"name": "total"}
    ]
}
response = requests.put(
    url,
    json=payload,
    auth=HTTPBasicAuth(API_KEY, '')
)
print(response.json())
- Node.js
const axios = require('axios');
const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = workflow.id; // From previous step
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/fields`;
const payload = {
  fields: [
    { name: "invoice_number" },
    { name: "total_amount" },
    { name: "invoice_date" }
  ],
  table_headers: [
    { name: "item_description" },
    { name: "quantity" },
    { name: "unit_price" },
    { name: "total" }
  ]
};
axios.put(url, payload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
  .then(response => {
    console.log(response.data);
  })
  .catch(error => {
    console.error(error);
  });
- cURL
curl -X PUT \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "fields": [
      { "name": "invoice_number" },
      { "name": "total_amount" },
      { "name": "invoice_date" }
    ],
    "table_headers": [
      { "name": "item_description" },
      { "name": "quantity" },
      { "name": "unit_price" },
      { "name": "total" }
    ]
  }' \
  https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/fields
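The request body above repeats the same `{"name": ...}` shape for every field and table header. A minimal sketch of generating it from plain lists of names follows; `build_fields_payload` is a hypothetical helper written for this guide, not part of the Nanonets SDK.

```python
import json


def build_fields_payload(field_names, table_header_names):
    """Build the fields request body from plain lists of names (hypothetical helper)."""
    return {
        "fields": [{"name": name} for name in field_names],
        "table_headers": [{"name": name} for name in table_header_names],
    }


payload = build_fields_payload(
    ["invoice_number", "total_amount", "invoice_date"],
    ["item_description", "quantity", "unit_price", "total"],
)
# Same structure as the payload shown in the examples above
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed as `json=payload` to the `requests.put` call shown in the Python example.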
3. Process Document
- Python
# Process a document
result = workflow.process_document(
    file_path="invoice.pdf",
    async_mode=True
)
# Get results
if result.status == "completed":
    # Access extracted fields with error handling
    try:
        invoice_number = result.data['fields']['Invoice Number'][0]['value']
    except (KeyError, IndexError):
        invoice_number = None
        print("Invoice Number not found in the document")
    try:
        total_amount = result.data['fields']['Total Amount'][0]['value']
    except (KeyError, IndexError):
        total_amount = None
        print("Total Amount not found in the document")
    try:
        invoice_date = result.data['fields']['Invoice Date'][0]['value']
    except (KeyError, IndexError):
        invoice_date = None
        print("Invoice Date not found in the document")
    # Access extracted tables with error handling
    try:
        for table in result.data['tables']:
            for cell in table['cells']:
                print(f"Row {cell['row']}, Col {cell['col']}: {cell['text']}")
    except (KeyError, AttributeError):
        print("No tables found in the document")
- Node.js
// Process a document
const result = await workflow.processDocument({
  filePath: "invoice.pdf",
  asyncMode: true
});
// Get results
if (result.status === "completed") {
  // Access extracted fields with error handling
  let invoiceNumber, totalAmount, invoiceDate;
  try {
    invoiceNumber = result.data.fields["Invoice Number"][0].value;
  } catch (error) {
    invoiceNumber = null;
    console.log("Invoice Number not found in the document");
  }
  try {
    totalAmount = result.data.fields["Total Amount"][0].value;
  } catch (error) {
    totalAmount = null;
    console.log("Total Amount not found in the document");
  }
  try {
    invoiceDate = result.data.fields["Invoice Date"][0].value;
  } catch (error) {
    invoiceDate = null;
    console.log("Invoice Date not found in the document");
  }
  // Access extracted tables with error handling
  try {
    for (const table of result.data.tables) {
      for (const cell of table.cells) {
        console.log(`Row ${cell.row}, Col ${cell.col}: ${cell.text}`);
      }
    }
  } catch (error) {
    console.log("No tables found in the document");
  }
}
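The examples above print table cells one at a time. When whole rows are more convenient, the flat cell list can be regrouped. The sketch below assumes each cell is a dictionary with `row`, `col`, and `text` keys, as in the examples, and that `row` and `col` are sortable values such as integers; `cells_to_rows` is a hypothetical helper and the sample cells are made up for illustration.

```python
def cells_to_rows(cells):
    """Group a flat list of cells into rows, ordered by row and then column index."""
    rows = {}
    for cell in cells:
        # Collect each cell's text under its row, keyed by column
        rows.setdefault(cell["row"], {})[cell["col"]] = cell["text"]
    return [
        [columns[col] for col in sorted(columns)]
        for _, columns in sorted(rows.items())
    ]


# Made-up cells, deliberately out of order
sample_cells = [
    {"row": 1, "col": 2, "text": "2"},
    {"row": 1, "col": 1, "text": "Widget"},
    {"row": 2, "col": 1, "text": "Gadget"},
    {"row": 2, "col": 2, "text": "5"},
]
table_rows = cells_to_rows(sample_cells)
for row in table_rows:
    print(" | ".join(row))
```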
Quick Start with the REST API
1. Create Instant Learning Workflow
- Python
import requests
from requests.auth import HTTPBasicAuth
API_KEY = 'YOUR_API_KEY'
url = "https://app.nanonets.com/api/v4/workflows"
# Create instant learning workflow (default)
payload = {
    "description": "Extract data from custom documents",
    "workflow_type": ""  # Empty string for instant learning workflow
}
response = requests.post(
    url,
    json=payload,
    auth=HTTPBasicAuth(API_KEY, '')
)
response.raise_for_status()  # Stop on HTTP errors before reading the body
workflow = response.json()  # Keep the response; later steps read workflow['id']
print(workflow)
- Node.js
const axios = require('axios');
const API_KEY = 'YOUR_API_KEY';
const url = "https://app.nanonets.com/api/v4/workflows";
// Create instant learning workflow (default)
const payload = {
  description: "Extract data from custom documents",
  workflow_type: "" // Empty string for instant learning workflow
};
axios.post(url, payload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
  .then(response => {
    console.log(response.data);
  })
  .catch(error => {
    console.error(error);
  });
- cURL
curl -X POST \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Extract data from custom documents",
    "workflow_type": ""
  }' \
  https://app.nanonets.com/api/v4/workflows
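All three examples above authenticate with HTTP Basic auth, using the API key as the username and an empty password (`-u "YOUR_API_KEY:"` in cURL). For HTTP clients without a Basic auth helper, the header can be built by hand. This is a sketch of the standard Basic scheme; `basic_auth_header` is a hypothetical helper name.

```python
import base64


def basic_auth_header(api_key):
    """Return the Authorization header for Basic auth with an empty password."""
    # Basic auth encodes "username:password"; the password is empty here
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("YOUR_API_KEY")
print(headers)
```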
2. Configure Fields and Tables to Extract
- Python
import requests
from requests.auth import HTTPBasicAuth
API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = workflow['id']  # From previous step
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/fields"
payload = {
    "fields": [
        {"name": "invoice_number"},
        {"name": "total_amount"},
        {"name": "invoice_date"}
    ],
    "table_headers": [
        {"name": "item_description"},
        {"name": "quantity"},
        {"name": "unit_price"},
        {"name": "total"}
    ]
}
response = requests.put(
    url,
    json=payload,
    auth=HTTPBasicAuth(API_KEY, '')
)
print(response.json())
- Node.js
const axios = require('axios');
const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = workflow.id; // From previous step
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/fields`;
const payload = {
  fields: [
    { name: "invoice_number" },
    { name: "total_amount" },
    { name: "invoice_date" }
  ],
  table_headers: [
    { name: "item_description" },
    { name: "quantity" },
    { name: "unit_price" },
    { name: "total" }
  ]
};
axios.put(url, payload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
  .then(response => {
    console.log(response.data);
  })
  .catch(error => {
    console.error(error);
  });
- cURL
curl -X PUT \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "fields": [
      { "name": "invoice_number" },
      { "name": "total_amount" },
      { "name": "invoice_date" }
    ],
    "table_headers": [
      { "name": "item_description" },
      { "name": "quantity" },
      { "name": "unit_price" },
      { "name": "total" }
    ]
  }' \
  https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/fields
3. Process Document
- Python
import requests
from requests.auth import HTTPBasicAuth
API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = workflow['id']  # From previous step
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/documents"
# Process a document (the with block closes the file after the upload)
with open('invoice.pdf', 'rb') as document:
    response = requests.post(
        url,
        files={'file': document},
        auth=HTTPBasicAuth(API_KEY, '')
    )
response.raise_for_status()  # Stop on HTTP errors before reading the body
result = response.json()
# Get results
if result['status'] == 'completed':
    # Access extracted fields
    invoice_number = result['data']['fields']['Invoice Number'][0]['value']
    total_amount = result['data']['fields']['Total Amount'][0]['value']
    invoice_date = result['data']['fields']['Invoice Date'][0]['value']
    print(f"Invoice Number: {invoice_number}")
    print(f"Total Amount: {total_amount}")
    print(f"Invoice Date: {invoice_date}")
    # Access extracted tables
    for table in result['data']['tables']:
        print(f"\nTable: {table['name']}")
        for cell in table['cells']:
            print(f"Row {cell['row']}, Col {cell['col']}: {cell['text']}")
- Node.js
const axios = require('axios');
const FormData = require('form-data');
const fs = require('fs');
const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = workflow.id; // From previous step
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/documents`;
// Process a document
const formData = new FormData();
formData.append('file', fs.createReadStream('invoice.pdf'));
axios.post(url, formData, {
  auth: {
    username: API_KEY,
    password: ''
  },
  headers: {
    ...formData.getHeaders()
  }
})
  .then(response => {
    const result = response.data;
    if (result.status === 'completed') {
      // Access extracted fields
      const invoiceNumber = result.data.fields['Invoice Number'][0].value;
      const totalAmount = result.data.fields['Total Amount'][0].value;
      const invoiceDate = result.data.fields['Invoice Date'][0].value;
      console.log(`Invoice Number: ${invoiceNumber}`);
      console.log(`Total Amount: ${totalAmount}`);
      console.log(`Invoice Date: ${invoiceDate}`);
      // Access extracted tables
      for (const table of result.data.tables) {
        console.log(`\nTable: ${table.name}`);
        for (const cell of table.cells) {
          console.log(`Row ${cell.row}, Col ${cell.col}: ${cell.text}`);
        }
      }
    }
  })
  .catch(error => {
    console.error(error);
  });
- cURL
curl -X POST \
  "https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/documents" \
  -u "YOUR_API_KEY:" \
  -F "file=@invoice.pdf"
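The REST examples above index straight into the `fields` dictionary, which raises an exception when a field is missing from the document. A small lookup helper gives them the same tolerance as the SDK examples. The sketch assumes the response shape used in this guide (`data['fields'][name][0]['value']`); `get_field_value` is a hypothetical helper and the sample data is made up.

```python
def get_field_value(data, field_name):
    """Return the first extracted value for field_name, or None if it is absent."""
    try:
        return data["fields"][field_name][0]["value"]
    except (KeyError, IndexError, TypeError):
        # Missing key, empty list, or an unexpected shape: treat as not found
        return None


# Made-up data in the shape used by the examples above
sample_data = {
    "fields": {
        "Invoice Number": [{"value": "INV-001"}],
        "Total Amount": [],
    }
}
invoice_number = get_field_value(sample_data, "Invoice Number")
total_amount = get_field_value(sample_data, "Total Amount")
invoice_date = get_field_value(sample_data, "Invoice Date")
print(invoice_number, total_amount, invoice_date)
```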
Best Practices
Error Handling
- Always check response status codes
- Implement retry logic for rate limits
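One way to combine both points is to retry with exponential backoff whenever a response signals rate limiting. The sketch below assumes rate-limited requests come back as HTTP 429, the standard status code for this case; confirm the actual behavior in the Error Handling reference. `call_with_retry` and `FakeResponse` are hypothetical names, and the fake responses stand in for real `requests` calls so the example runs without network access.

```python
import time


def call_with_retry(send_request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send_request until it returns a non-429 response or attempts run out."""
    for attempt in range(max_attempts):
        response = send_request()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * (2 ** attempt))
    return response  # Still rate limited after the final attempt


class FakeResponse:
    """Stand-in for a requests.Response, used only for this demonstration."""

    def __init__(self, status_code):
        self.status_code = status_code


responses = iter([FakeResponse(429), FakeResponse(429), FakeResponse(200)])
delays = []  # Record the delays instead of actually sleeping
result = call_with_retry(lambda: next(responses), sleep=delays.append)
print(result.status_code, delays)
```

With a real client, `send_request` could be a lambda wrapping the `requests.post` call from the quick start, leaving `sleep` at its default.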
Security
- Store API keys securely
- Use environment variables
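A minimal sketch of reading the key from the environment instead of hard-coding it in source files. The variable name `NANONETS_API_KEY` is an assumption chosen for this example, not an official convention, and `load_api_key` is a hypothetical helper.

```python
import os


def load_api_key(env_var="NANONETS_API_KEY"):
    """Read the API key from an environment variable and fail clearly if it is unset."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling the API")
    return api_key


# Usage, after exporting the variable in your shell:
#   API_KEY = load_api_key()
```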
Performance
- Use async processing for large files
- Monitor API usage
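With async processing, the result is usually not ready when the upload call returns, so the client has to check back later. The sketch below polls until the status is `completed`, the value used in the examples above. How a result is fetched is not covered in this guide, so `fetch_result` is a hypothetical callable you would supply, and the `processing` status in the demonstration is made up.

```python
import time


def wait_for_completion(fetch_result, timeout=300.0, interval=5.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_result() until its status is 'completed' or the timeout expires."""
    deadline = clock() + timeout
    while True:
        result = fetch_result()
        if result.get("status") == "completed":
            return result
        if clock() >= deadline:
            raise TimeoutError("Document was not processed within the timeout")
        sleep(interval)


# Demonstration with made-up results instead of real API calls
fake_results = iter([
    {"status": "processing"},  # hypothetical intermediate status
    {"status": "completed", "data": {}},
])
final = wait_for_completion(lambda: next(fake_results), sleep=lambda seconds: None)
print(final["status"])
```

A production version would also stop early on whatever failure status the API reports, which this sketch does not model.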
Next Steps
- Learn about Authentication
- Explore API Endpoints