Introduction

Welcome to the Nanonets API documentation. This guide will help you integrate Nanonets into your applications.

What is Nanonets?

Nanonets provides an AI-driven Intelligent Document Processing API that transforms unstructured documents into structured data. Our advanced OCR and document data extraction capabilities enable you to:

- Extract structured data from various document types (invoices, receipts, forms, etc.)
- Convert unstructured text into organized, machine-readable formats
- Process and analyze document content with high accuracy
- Automate document workflows and data entry tasks

Key Features

- Advanced OCR & Data Extraction: Extract text, fields, and tables from documents with high accuracy
- Unstructured to Structured Data: Transform raw document content into organized, structured formats
- Workflow Automation: Approve or reject extracted results and assign files for review
- External Integrations: Import documents from various sources and export data to business applications

Getting Started

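This section walks through two quick starts: one using the Nanonets SDK and one calling the REST API directly. Both follow the same three steps: create a workflow, configure the fields and tables to extract, and process a document.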

Prerequisites

- A Nanonets account
- An API key (get it from http://app.nanonets.com/#/keys)
- Basic knowledge of REST APIs
- Your preferred programming language (Python, JavaScript, etc.)

Quick Start with the Nanonets SDK

1. Create Instant Learning Workflow

Python
Node.js

from nanonets import Nanonets

# Initialize client
client = Nanonets(api_key='your_api_key')

# Create instant learning workflow
workflow = client.workflows.create(
    name="Custom Document Workflow",
    description="Extract data from custom documents",
    workflow_type=""  # Empty string for instant learning workflow
)

const { Nanonets } = require('nanonets');

// Initialize client
const client = new Nanonets({
  apiKey: 'your_api_key'
});

// Create instant learning workflow
const workflow = await client.workflows.create({
  name: "Custom Document Workflow",
  description: "Extract data from custom documents",
  workflowType: ""  // Empty string for instant learning workflow
});

2. Configure Fields and Tables to Extract

Python
Node.js
cURL

import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = workflow['id']  # From previous step
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/fields"

payload = {
    "fields": [
        {"name": "invoice_number"},
        {"name": "total_amount"},
        {"name": "invoice_date"}
    ],
    "table_headers": [
        {"name": "item_description"},
        {"name": "quantity"},
        {"name": "unit_price"},
        {"name": "total"}
    ]
}

response = requests.put(
    url,
    json=payload,
    auth=HTTPBasicAuth(API_KEY, '')
)
print(response.json())

const axios = require('axios');

const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = workflow.id;  // From previous step
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/fields`;

const payload = {
  fields: [
    { name: "invoice_number" },
    { name: "total_amount" },
    { name: "invoice_date" }
  ],
  table_headers: [
    { name: "item_description" },
    { name: "quantity" },
    { name: "unit_price" },
    { name: "total" }
  ]
};

axios.put(url, payload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
.then(response => {
  console.log(response.data);
})
.catch(error => {
  console.error(error);
});

curl -X PUT \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "fields": [
      { "name": "invoice_number" },
      { "name": "total_amount" },
      { "name": "invoice_date" }
    ],
    "table_headers": [
      { "name": "item_description" },
      { "name": "quantity" },
      { "name": "unit_price" },
      { "name": "total" }
    ]
  }' \
  https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/fields
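
The request body above has the same shape in all three examples: a "fields" list and a "table_headers" list, each holding objects with a "name" key. As a minimal sketch, the body could be generated from plain lists of names. The helper name `build_fields_payload` and its validation rule are illustrative assumptions, not part of the Nanonets API.

```python
import json


def build_fields_payload(field_names, table_header_names):
    """Build the request body for the fields endpoint from plain name lists."""
    # Hypothetical client-side check: reject empty or non-string names early.
    for name in [*field_names, *table_header_names]:
        if not isinstance(name, str) or not name.strip():
            raise ValueError(f"Field names must be non-empty strings, got {name!r}")
    return {
        "fields": [{"name": name} for name in field_names],
        "table_headers": [{"name": name} for name in table_header_names],
    }


payload = build_fields_payload(
    ["invoice_number", "total_amount", "invoice_date"],
    ["item_description", "quantity", "unit_price", "total"],
)
print(json.dumps(payload, indent=2))
```

The resulting `payload` can be passed as `json=payload` to the `requests.put` call shown in the Python example.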

3. Process Document

Python
Node.js

# Process a document
result = workflow.process_document(
    file_path="invoice.pdf",
    async_mode=True
)

# Get results
if result.status == "completed":
    # Access extracted fields with error handling
    try:
        invoice_number = result.data['fields']['Invoice Number'][0]['value']
    except (KeyError, IndexError):
        invoice_number = None
        print("Invoice Number not found in the document")
    
    try:
        total_amount = result.data['fields']['Total Amount'][0]['value']
    except (KeyError, IndexError):
        total_amount = None
        print("Total Amount not found in the document")
    
    try:
        invoice_date = result.data['fields']['Invoice Date'][0]['value']
    except (KeyError, IndexError):
        invoice_date = None
        print("Invoice Date not found in the document")
    
    # Access extracted tables with error handling
    try:
        for table in result.data['tables']:
            for cell in table['cells']:
                print(f"Row {cell['row']}, Col {cell['col']}: {cell['text']}")
    except (KeyError, AttributeError):
        print("No tables found in the document")

// Process a document
const result = await workflow.processDocument({
    filePath: "invoice.pdf",
    asyncMode: true
});

// Get results
if (result.status === "completed") {
    // Access extracted fields with error handling
    let invoiceNumber, totalAmount, invoiceDate;
    
    try {
        invoiceNumber = result.data.fields["Invoice Number"][0].value;
    } catch (error) {
        invoiceNumber = null;
        console.log("Invoice Number not found in the document");
    }
    
    try {
        totalAmount = result.data.fields["Total Amount"][0].value;
    } catch (error) {
        totalAmount = null;
        console.log("Total Amount not found in the document");
    }
    
    try {
        invoiceDate = result.data.fields["Invoice Date"][0].value;
    } catch (error) {
        invoiceDate = null;
        console.log("Invoice Date not found in the document");
    }
    
    // Access extracted tables with error handling
    try {
        for (const table of result.data.tables) {
            for (const cell of table.cells) {
                console.log(`Row ${cell.row}, Col ${cell.col}: ${cell.text}`);
            }
        }
    } catch (error) {
        console.log("No tables found in the document");
    }
}
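
The three near-identical try/except blocks above can be folded into one helper. The sketch below assumes the result data has the shape those examples read from (a "fields" mapping whose values are lists of objects with a "value" key). The helper name `first_field_value` and the sample values are made up for illustration.

```python
from typing import Any, Optional


def first_field_value(data: dict, label: str) -> Optional[Any]:
    """Return the first extracted value for `label`, or None if it is missing."""
    try:
        return data["fields"][label][0]["value"]
    except (KeyError, IndexError, TypeError):
        # Missing label, empty prediction list, or unexpected structure.
        return None


# Illustrative data shaped like `result.data` above (values are invented).
sample = {
    "fields": {
        "Invoice Number": [{"value": "INV-001"}],
        "Total Amount": [],
    }
}

for label in ("Invoice Number", "Total Amount", "Invoice Date"):
    value = first_field_value(sample, label)
    if value is None:
        print(f"{label} not found in the document")
    else:
        print(f"{label}: {value}")
```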

Quick Start with the REST API

1. Create Instant Learning Workflow

Python
Node.js
cURL

import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
url = "https://app.nanonets.com/api/v4/workflows"

# Create instant learning workflow (default)
payload = {
    "description": "Extract data from custom documents",
    "workflow_type": ""  # Empty string for instant learning workflow
}

response = requests.post(
    url,
    json=payload,
    auth=HTTPBasicAuth(API_KEY, '')
)
workflow = response.json()  # Keep the response; later steps read workflow['id']
print(workflow)

const axios = require('axios');

const API_KEY = 'YOUR_API_KEY';
const url = "https://app.nanonets.com/api/v4/workflows";

// Create instant learning workflow (default)
const payload = {
  description: "Extract data from custom documents",
  workflow_type: ""  // Empty string for instant learning workflow
};

axios.post(url, payload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
.then(response => {
  console.log(response.data);
})
.catch(error => {
  console.error(error);
});

curl -X POST \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Extract data from custom documents",
    "workflow_type": ""
  }' \
  https://app.nanonets.com/api/v4/workflows
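
All of the examples above authenticate with HTTP Basic auth, sending the API key as the username and an empty password (`-u "YOUR_API_KEY:"` in cURL). For HTTP clients without a built-in Basic auth option, the header can be built by hand. This is a sketch of the standard Basic scheme; the helper name `basic_auth_header` is illustrative.

```python
import base64


def basic_auth_header(api_key: str) -> dict:
    """Build the Authorization header for Basic auth with an empty password."""
    # Basic auth encodes "username:password"; the password here is empty.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("YOUR_API_KEY")
print(headers)
```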

2. Configure Fields and Tables to Extract

Python
Node.js
cURL

import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = workflow['id']  # From previous step
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/fields"

payload = {
    "fields": [
        {"name": "invoice_number"},
        {"name": "total_amount"},
        {"name": "invoice_date"}
    ],
    "table_headers": [
        {"name": "item_description"},
        {"name": "quantity"},
        {"name": "unit_price"},
        {"name": "total"}
    ]
}

response = requests.put(
    url,
    json=payload,
    auth=HTTPBasicAuth(API_KEY, '')
)
print(response.json())

const axios = require('axios');

const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = 'YOUR_WORKFLOW_ID';  // The id returned when the workflow was created
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/fields`;

const payload = {
  fields: [
    { name: "invoice_number" },
    { name: "total_amount" },
    { name: "invoice_date" }
  ],
  table_headers: [
    { name: "item_description" },
    { name: "quantity" },
    { name: "unit_price" },
    { name: "total" }
  ]
};

axios.put(url, payload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
.then(response => {
  console.log(response.data);
})
.catch(error => {
  console.error(error);
});

curl -X PUT \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "fields": [
      { "name": "invoice_number" },
      { "name": "total_amount" },
      { "name": "invoice_date" }
    ],
    "table_headers": [
      { "name": "item_description" },
      { "name": "quantity" },
      { "name": "unit_price" },
      { "name": "total" }
    ]
  }' \
  https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/fields

3. Process Document

Python
Node.js
cURL

import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = workflow['id']  # From previous step
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/documents"

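# Process a document; the with-block closes the file once the upload finishes
with open('invoice.pdf', 'rb') as f:
    response = requests.post(url, files={'file': f}, auth=HTTPBasicAuth(API_KEY, ''))
response.raise_for_status()  # Surface HTTP errors before parsing the body
result = response.json()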

# Get results
if result['status'] == 'completed':
    # Access extracted fields
    invoice_number = result['data']['fields']['Invoice Number'][0]['value']
    total_amount = result['data']['fields']['Total Amount'][0]['value']
    invoice_date = result['data']['fields']['Invoice Date'][0]['value']
    print(f"Invoice Number: {invoice_number}")
    print(f"Total Amount: {total_amount}")
    print(f"Invoice Date: {invoice_date}")
    
    # Access extracted tables
    for table in result['data']['tables']:
        print(f"\nTable: {table['name']}")
        for cell in table['cells']:
            print(f"Row {cell['row']}, Col {cell['col']}: {cell['text']}")

const axios = require('axios');
const FormData = require('form-data');
const fs = require('fs');

const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = 'YOUR_WORKFLOW_ID';  // The id returned when the workflow was created
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/documents`;

// Process a document
const formData = new FormData();
formData.append('file', fs.createReadStream('invoice.pdf'));

axios.post(url, formData, {
    auth: {
        username: API_KEY,
        password: ''
    },
    headers: {
        ...formData.getHeaders()
    }
})
.then(response => {
    const result = response.data;
    
    if (result.status === 'completed') {
        // Access extracted fields
        const invoiceNumber = result.data.fields['Invoice Number'][0].value;
        const totalAmount = result.data.fields['Total Amount'][0].value;
        const invoiceDate = result.data.fields['Invoice Date'][0].value;
        console.log(`Invoice Number: ${invoiceNumber}`);
        console.log(`Total Amount: ${totalAmount}`);
        console.log(`Invoice Date: ${invoiceDate}`);
        
        // Access extracted tables
        for (const table of result.data.tables) {
            console.log(`\nTable: ${table.name}`);
            for (const cell of table.cells) {
                console.log(`Row ${cell.row}, Col ${cell.col}: ${cell.text}`);
            }
        }
    }
})
.catch(error => {
    console.error(error);
});

curl -X POST \
  "https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/documents" \
  -u "YOUR_API_KEY:" \
  -F "file=@invoice.pdf"
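
The examples above print table cells one at a time. To work with a table as rows, the flat cell list can be arranged into a grid. The sketch below assumes each cell carries the "row", "col", and "text" keys read above and that "row" and "col" are sortable integers. The helper name `cells_to_rows` and the sample cells are illustrative.

```python
from typing import Dict, List


def cells_to_rows(cells: List[Dict]) -> List[List[str]]:
    """Arrange a flat list of cells into rows, ordered by row and then column."""
    # Map the row/col indexes that actually occur to dense grid positions,
    # so the helper works whether indexing starts at 0 or 1.
    row_ids = sorted({cell["row"] for cell in cells})
    col_ids = sorted({cell["col"] for cell in cells})
    row_pos = {r: i for i, r in enumerate(row_ids)}
    col_pos = {c: j for j, c in enumerate(col_ids)}

    grid = [["" for _ in col_ids] for _ in row_ids]
    for cell in cells:
        grid[row_pos[cell["row"]]][col_pos[cell["col"]]] = cell["text"]
    return grid


# Invented cells for illustration; a missing cell is left as an empty string.
sample_cells = [
    {"row": 1, "col": 1, "text": "item_description"},
    {"row": 1, "col": 2, "text": "quantity"},
    {"row": 2, "col": 2, "text": "3"},
    {"row": 2, "col": 1, "text": "Widget"},
    {"row": 3, "col": 1, "text": "Gadget"},
]

for row in cells_to_rows(sample_cells):
    print(row)
```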

Best Practices

Error Handling
- Always check response status codes
- Implement retry logic for rate limits
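
As a rough sketch of the two points above, the helper below retries a request with exponential backoff while the server answers with HTTP 429. It assumes rate limits are signalled with status 429, which this guide does not confirm. The name `send_with_retry` and its default values are illustrative.

```python
import time


def send_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `send()` until it returns a non-429 response or attempts run out.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute, such as a `requests.Response`.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:  # Assumed rate-limit status code
            return response
        if attempt < max_attempts - 1:
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return response  # Still rate limited after the final attempt


# Usage with the earlier examples (requires the `requests` package):
#   response = send_with_retry(
#       lambda: requests.put(url, json=payload, auth=HTTPBasicAuth(API_KEY, ''))
#   )
#   response.raise_for_status()  # Raises for any remaining 4xx/5xx status
```

Passing the request as a callable keeps the retry loop independent of the HTTP client, so the same helper covers the POST and PUT calls in this guide.
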
Security
- Store API keys securely
- Use environment variables
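
One way to follow both points is to read the key from the environment at startup instead of hard-coding it. This is a minimal sketch; the variable name `NANONETS_API_KEY` and the helper `load_api_key` are assumptions, not names defined by Nanonets.

```python
import os


def load_api_key(env=None, name="NANONETS_API_KEY"):
    """Read the API key from the environment, failing loudly if it is unset."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable to your API key")
    return key


# Usage: API_KEY = load_api_key()
# A plain dict can stand in for the environment, for example in tests.
print(load_api_key({"NANONETS_API_KEY": "example-key"}))
```
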
Performance
- Use async processing for large files
- Monitor API usage

Next Steps

Learn about Authentication
Explore API Endpoints
