API Reference

Complete documentation for the DocuProx API, including endpoints, authentication, and code examples to help you integrate document extraction into your applications.

API Overview

The DocuProx Process API is the core document processing service that extracts structured data from images using predefined templates. It supports both reference-based and reference-free processing modes for various document types.

Core Endpoints

POST /v1/process

Process a document synchronously and get extracted data immediately in a single request.

PUT /v1/process

Upload and process a document as a binary data stream, with the template ID passed as a query parameter.

POST /v1/process-job

Submit documents for async processing. Supports images, PDFs, and ZIP batch uploads.

POST /v1/process-agent

Use AI agents to extract custom data from documents using your own prompts.

GET /v1/job-status

Check the current processing status of an async job using its job ID.

POST /v1/job-results

Retrieve the extracted results from a completed job in JSON or CSV format.

These endpoints enable document processing with various modes including synchronous, asynchronous, and AI-powered agent extraction.

Processing Modes

With Reference Mode

Ideal for standard document formats with consistent layouts. Uses a reference image from your template to guide extraction through label-value pair identification.

VALUE: Identifies target data using actual values from the reference document
RECTANGLE: Defines bounded regions for precise data extraction
PROMPT: Custom prompt instructions for flexible field extraction
STATIC: Returns a predefined static value as default, or uses the value provided in the API payload

Without Reference Mode

Perfect for variable document layouts and multi-page PDFs. Processes documents directly without requiring a reference image, offering flexible extraction capabilities.

FLEXIBLE: Adapts to varying document structures and layouts
MULTI-PAGE: Supports complex PDF documents with multiple pages
STATIC: Returns a predefined static value as default, or uses the value provided in the API payload

Agent Mode

Designed for AI agents that process documents autonomously. Custom prompts let AI systems define extraction fields dynamically via the /v1/process-agent endpoint.

AGENTIC: Built for AI agents to autonomously process documents
PROMPT: Define custom extraction prompts per field
DYNAMIC: Configure document type and instructions at runtime

Supported Formats

Images

  • JPEG
  • PNG
  • GIF, BMP, TIFF

Documents

  • PDF (single page)
  • PDF (multi-page)

Input Methods

  • Base64 encoded
  • File uploads

Rate Limiting

The API includes configurable rate limiting with a 60-second window.

  • Default limits apply per API key
  • Contact support for enterprise limits
  • Rate limit headers included in responses

Process API

The /v1/process endpoint is the core synchronous service for document processing. It accepts images and returns extracted structured data based on your templates. Response format is determined by your template configuration.

Endpoint Details

Method

POST

URL

/v1/process

Content Types

  • application/json
  • multipart/form-data

Required Parameters

These parameters are sent in the request body using either JSON or form data format:

Parameter      Type               Description
actual_image   string/file        Base64-encoded image or file upload
template_id    string             UUID of the template to use
static_values  string (optional)  JSON string of static key-value pairs to include in the response (e.g., { "company_name": "Acme Corp" })

Option 1: JSON Request Example

{
  "actual_image": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD...",
  "template_id": "123e4567-e89b-12d3-a456-426614174000",
  "static_values": "{ \"company_name\": \"Acme Corp\" }"
}
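
Because static_values is a JSON string rather than a nested object, it has to be serialized on its own before the outer body is encoded. The following is a minimal Python sketch of that step using only the standard library; the helper name build_process_body and its mime_type argument are hypothetical, and the data-URI prefix simply mirrors the example above.

```python
import base64
import json


def build_process_body(image_bytes, template_id, static_values=None,
                       mime_type="image/jpeg"):
    """Build the JSON body for POST /v1/process (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    body = {
        "actual_image": f"data:{mime_type};base64,{encoded}",
        "template_id": template_id,
    }
    if static_values is not None:
        # static_values is documented as a JSON *string*, so encode it first.
        body["static_values"] = json.dumps(static_values)
    return json.dumps(body)


# The three bytes below are only a stand-in for real image data.
body = build_process_body(
    b"\xff\xd8\xff",
    "123e4567-e89b-12d3-a456-426614174000",
    {"company_name": "Acme Corp"},
)
```

The resulting string can be sent as the request body with Content-Type: application/json.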

Option 2: Form Data Request

  • actual_image: Image file upload
  • template_id: Template UUID as form field
  • static_values: JSON string as form field (optional)

Authentication Headers

Option    Header         Type    Description
Option 1  x-auth         string  Authentication token required for API access
Option 2  Authorization  string  Bearer token format: Bearer <API_KEY>
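
Either header can be produced by a small helper so the rest of a client does not care which option is in use. This is a sketch only; the function name auth_headers and the use_bearer flag are hypothetical and not part of the API.

```python
def auth_headers(api_key, use_bearer=False):
    """Return the authentication header for one of the two documented options."""
    if use_bearer:
        # Option 2: Authorization header in Bearer token format.
        return {"Authorization": f"Bearer {api_key}"}
    # Option 1: x-auth header carrying the token directly.
    return {"x-auth": api_key}


headers = auth_headers("YOUR_API_KEY")
```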

Template Requirements

Before processing, ensure your template meets these requirements:

Required Fields

  • Template status: active
  • Active current version
  • Valid document_type

Optional Fields

  • reference_image (for reference mode)
  • edited_json (extraction structure)
  • custom_instructions

Process Job API

The /v1/process-job endpoint is an asynchronous document processing service. It immediately returns a job ID, allowing you to check the status and retrieve results later. Supports batch processing via ZIP files containing multiple documents.

Endpoint Details

Method

POST

URL

/v1/process-job

Content Types

  • application/json
  • multipart/form-data

Authentication Headers

Option    Header         Type    Description
Option 1  x-auth         string  Authentication token required for API access
Option 2  Authorization  string  Bearer token format: Bearer <API_KEY>

Request Parameters

Parameter      Type               Description
actual_image   string/file        Base64-encoded image, file upload, or ZIP file containing multiple documents
template_id    string             UUID of the template to use
static_values  string (optional)  JSON string of static key-value pairs

Supported File Formats

Images

  • JPEG, PNG
  • GIF, BMP, TIFF

Documents

  • PDF (single/multi-page)

Batch Processing

  • ZIP (multiple files)
  • Files must be at the root of the ZIP archive, not inside folders
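
One way to satisfy the root-level requirement is to strip the directory part of each path when writing the archive. The sketch below uses Python's zipfile module; the helper name build_flat_zip is hypothetical, and rejecting duplicate file names is a precaution of this sketch rather than a documented API rule.

```python
import io
import os
import zipfile


def build_flat_zip(paths):
    """Return ZIP bytes with every file stored at the archive root."""
    buffer = io.BytesIO()
    seen = set()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            name = os.path.basename(path)
            if name in seen:
                # Two files with the same name would collide at the root.
                raise ValueError(f"duplicate file name at archive root: {name}")
            seen.add(name)
            # arcname drops the directory part so the entry sits at the root.
            archive.write(path, arcname=name)
    return buffer.getvalue()
```

The returned bytes can be uploaded as the actual_image file in a multipart request.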

Request Format Examples

Option 1: JSON Request Example

{
  "actual_image": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD...",
  "template_id": "123e4567-e89b-12d3-a456-426614174000",
  "static_values": "{ \"company_name\": \"Acme Corp\" }"
}

Option 2: Form Data Request

  • actual_image: Image file upload
  • template_id: Template UUID as form field
  • static_values: JSON string as form field (optional)

Response

{
  "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "template_id": "154bb430-c9d9-4154-9377-a3b0466fd836",
  "static_values": { "company_name": "Acme Corp" }
}

cURL Example

curl --location 'https://api.docuprox.com/v1/process-job' \
  --header 'x-auth: YOUR_API_KEY' \
  --form 'actual_image=@"/path/to/document.jpg"' \
  --form 'template_id="154bb430-c9d9-4154-9377-a3b0466fd836"' \
  --form 'static_values="{ \"company_name\": \"Acme Corp\" }"'
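
The same form-data upload can be sketched in Python with only the standard library by assembling the multipart body by hand. The function name build_job_request and the base_url default are assumptions of this sketch (the host is taken from the cURL example), and guessing the file part's content type from the file name is a choice made here, not a documented requirement.

```python
import mimetypes
import urllib.request
import uuid


def build_job_request(api_key, template_id, filename, file_bytes,
                      base_url="https://api.docuprox.com"):
    """Build (but do not send) a multipart POST for /v1/process-job."""
    boundary = uuid.uuid4().hex
    # Guess the part's content type from the file name; fall back to binary.
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="template_id"\r\n\r\n'
        f"{template_id}\r\n"
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="actual_image"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return urllib.request.Request(
        f"{base_url}/v1/process-job",
        data=head + file_bytes + tail,
        method="POST",
        headers={
            "x-auth": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


request = build_job_request(
    "YOUR_API_KEY",
    "154bb430-c9d9-4154-9377-a3b0466fd836",
    "document.jpg",
    b"<file bytes>",
)
```

Passing the request to urllib.request.urlopen(request, timeout=30) sends it; the JSON response can then be decoded to read job_id.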

Alternative: Binary File Upload (PUT Method)

Upload documents directly as a binary data stream. This method is ideal for direct file uploads without base64 encoding or form data.

Endpoint Details

Method

PUT

URL

/v1/process?t_id={template_id}

Content Type

  • application/octet-stream

Authentication Headers

Option    Header         Type    Description
Option 1  x-auth         string  Authentication token required for API access
Option 2  Authorization  string  Bearer token format: Bearer <API_KEY>

Query Parameters

Parameter  Type    Required  Description
t_id       string  Yes       UUID of the template to use for processing

cURL Example (Binary Upload)

curl --location --globoff --request PUT \
  'https://api.docuprox.com/v1/process?t_id=YOUR_TEMPLATE_ID' \
  --header 'x-auth: YOUR_API_KEY' \
  --header 'Content-Type: application/octet-stream' \
  --data-binary '@/path/to/document.pdf'
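
A rough Python equivalent of the binary upload, using only the standard library, might look like the following. The function name build_binary_upload and the base_url default are assumptions of this sketch; the request is built but not sent.

```python
import urllib.parse
import urllib.request


def build_binary_upload(api_key, template_id, file_bytes,
                        base_url="https://api.docuprox.com"):
    """Build (but do not send) a PUT request carrying the raw file bytes."""
    # The template ID travels in the t_id query parameter, not in the body.
    query = urllib.parse.urlencode({"t_id": template_id})
    return urllib.request.Request(
        f"{base_url}/v1/process?{query}",
        data=file_bytes,
        method="PUT",
        headers={
            "x-auth": api_key,
            "Content-Type": "application/octet-stream",
        },
    )


request = build_binary_upload("YOUR_API_KEY", "YOUR_TEMPLATE_ID", b"%PDF-1.7 ...")
```

In real use, file_bytes would come from reading the document in binary mode, and the request would be sent with urllib.request.urlopen(request, timeout=30).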

Process Agent API

The /v1/process-agent endpoint provides advanced document processing with custom prompts and instructions. It allows you to define extraction fields dynamically using a JSON payload.

Endpoint Details

Method

POST

URL

/v1/process-agent

Content Types

  • application/json
  • multipart/form-data

Authentication Headers

Option    Header         Type    Description
Option 1  x-auth         string  Authentication token required for API access
Option 2  Authorization  string  Bearer token format: Bearer <API_KEY>

Request Parameters

Parameter     Type           Description
actual_image  string/file    Base64-encoded image or file upload
template_id   string         UUID of the template to use
payload       string (JSON)  JSON string containing processing configuration

Payload Structure

Field                Type               Description
document_type        string             Type of document being processed (e.g., "invoice", "receipt", "passport", "contract")
custom_instructions  string (optional)  Custom instructions for processing
prompt_json          object             JSON object containing the prompt configuration with field definitions
static_values        string (optional)  JSON string of static key-value pairs to include in processing

Payload Example

{
  "document_type": "passport",
  "custom_instructions": "",
  "prompt_json": {
    "passport": "from given document extract the passport number"
  },
  "static_values": "{ \"company_name\": \"Acme Corp\" }"
}
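
When the payload is sent as a form field, it is a JSON string that itself contains static_values as a JSON string, so two levels of encoding are involved. The sketch below builds that value in Python; the helper name build_agent_payload is hypothetical, and the field names come from the Payload Structure table.

```python
import json


def build_agent_payload(document_type, prompts, custom_instructions="",
                        static_values=None):
    """Assemble the payload object for /v1/process-agent (hypothetical helper)."""
    payload = {
        "document_type": document_type,
        "custom_instructions": custom_instructions,
        # One prompt per field name, as in the payload example.
        "prompt_json": dict(prompts),
    }
    if static_values is not None:
        # Inner level: static_values is itself a JSON string.
        payload["static_values"] = json.dumps(static_values)
    return payload


payload = build_agent_payload(
    "passport",
    {"passport": "from given document extract the passport number"},
    static_values={"company_name": "Acme Corp"},
)
# Outer level: the whole payload becomes one JSON string for the form field.
form_value = json.dumps(payload)
```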

Request Format Examples

Option 1: JSON Request Example

{
  "actual_image": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD...",
  "template_id": "123e4567-e89b-12d3-a456-426614174000",
  "payload": {
    "document_type": "passport",
    "custom_instructions": "",
    "prompt_json": {
      "passport": "from given document extract the passport number"
    },
    "static_values": "{ \"company_name\": \"Acme Corp\" }"
  }
}

Option 2: Form Data Request

  • actual_image: Image file upload
  • template_id: Template UUID as form field
  • payload: JSON string containing processing configuration

cURL Example

curl --location 'https://api.docuprox.com/v1/process-agent' \
  --header 'x-auth: YOUR_API_KEY' \
  --form 'actual_image=@"/path/to/document.jpg"' \
  --form 'template_id="154bb430-c9d9-4154-9377-a3b0466fd836"' \
  --form 'payload="{
    \"document_type\": \"passport\",
    \"custom_instructions\": \"\",
    \"prompt_json\": {
      \"passport\": \"from given document extract the passport number\"
    },
    \"static_values\": \"{ \\\"company_name\\\": \\\"Acme Corp\\\" }\"
  }"'

Job Status API

The /v1/job-status endpoint allows you to check the processing status of an asynchronous job created via the Process Job API.

Endpoint Details

Method

GET

URL

/v1/job-status

Content Types

  • application/json
  • multipart/form-data

Authentication Headers

Option    Header         Type    Description
Option 1  x-auth         string  Authentication token required for API access
Option 2  Authorization  string  Bearer token format: Bearer <API_KEY>

Query Parameters

Parameter  Type    Description
job_id     string  UUID of the job to check status for

cURL Example

curl --location \
  'https://api.docuprox.com/v1/job-status?job_id=JOB_ID' \
  --header 'x-auth: YOUR_API_KEY'

Status Values

The job status response will contain one of the following status values:

Status                 Description
NEW                    Job has been created and is queued for processing
UNZIP FILE             Extracting files from uploaded ZIP archive
UNZIP FILE SUCCESS     ZIP archive extraction completed successfully
UNZIP FILE FAILED      ZIP archive extraction failed
PROCESS IMAGE          Processing document images for data extraction
PROCESS IMAGE SUCCESS  Image processing completed successfully
PROCESS IMAGE FAILED   Image processing failed
SUCCESS                Job completed successfully, results are ready
FAILED                 Job processing failed
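
A client typically polls this endpoint until the job reaches a final state. The sketch below assumes that only SUCCESS and FAILED are final and that the intermediate "... FAILED" values may still be followed by a final status; this documentation does not confirm either point. The fetch_status callable is supplied by the caller because the name of the status field in the response is not specified here.

```python
import time

# Assumption: only these two values end the job; everything else keeps polling.
TERMINAL_STATUSES = {"SUCCESS", "FAILED"}


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(job_id) until a terminal status or the timeout."""
    deadline = clock() + timeout
    while True:
        status = fetch_status(job_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still in state {status!r}")
        sleep(interval)


# Usage with a stand-in fetcher that replays a fixed sequence of statuses.
replay = iter(["NEW", "PROCESS IMAGE", "PROCESS IMAGE SUCCESS", "SUCCESS"])
final = wait_for_job(lambda job_id: next(replay), "a1b2c3d4", sleep=lambda s: None)
```

In a real client, fetch_status would call GET /v1/job-status?job_id=... and return the status string from the response.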

Job Results API

The /v1/job-results endpoint retrieves the processed results for a completed job. Results can be returned in JSON or CSV format.

⏱️ 24-Hour Retention: Job results are stored for 24 hours after processing completion. Make sure to retrieve your results within this window. After 24 hours, results are automatically deleted and cannot be recovered.
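
A client that records when it first saw a job complete can work out how long the results remain retrievable. This is a sketch under the assumption that the client tracks the completion time itself; nothing here indicates that the API returns it.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=24)  # retention window stated above


def results_expire_at(completed_at):
    """Return the moment after which results are no longer retrievable."""
    return completed_at + RETENTION


def results_available(completed_at, now=None):
    """True while the current time is still inside the retention window."""
    now = now or datetime.now(timezone.utc)
    return now < results_expire_at(completed_at)


completed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(results_expire_at(completed).isoformat())  # 2024-01-02T12:00:00+00:00
```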

Endpoint Details

Method

POST

URL

/v1/job-results

Content Types

  • application/json
  • multipart/form-data

Authentication Headers

Option    Header         Type    Description
Option 1  x-auth         string  Authentication token required for API access
Option 2  Authorization  string  Bearer token format: Bearer <API_KEY>

Request Body

Parameter      Type               Description
job_id         string             UUID of the completed job
result_format  string (optional)  Output format: "json" (default) or "csv"

Request Format Examples

Option 1: JSON Request Example

{
  "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "result_format": "json"
}

Option 2: Form Data Request

  • job_id: Job UUID as form field
  • result_format: Output format as form field (optional)

Response Formats

JSON Format

Returns a JSON array containing the extracted data fields based on your template configuration.

CSV Format

Returns a downloadable CSV file with the extracted data in tabular format.
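
Both formats can be normalized into a list of records on the client side. The sketch below assumes the CSV response has a header row, which is not stated in this documentation; the function name parse_results is hypothetical.

```python
import csv
import io
import json


def parse_results(body_text, result_format="json"):
    """Decode a /v1/job-results response body into Python objects."""
    if result_format == "json":
        return json.loads(body_text)
    if result_format == "csv":
        # Assumption: the first CSV row holds the column names.
        return list(csv.DictReader(io.StringIO(body_text)))
    raise ValueError(f"unsupported result_format: {result_format!r}")


# Illustrative field names only; real columns depend on the template.
rows = parse_results("invoice_no,total\r\nINV-1,42.00\r\n", "csv")
```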

cURL Examples

Get JSON Results

curl --location 'https://api.docuprox.com/v1/job-results' \
  --header 'x-auth: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "result_format": "json"
  }'

Get CSV Results

curl --location 'https://api.docuprox.com/v1/job-results' \
  --header 'x-auth: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "result_format": "csv"
  }'

Error Responses

400  Missing or invalid job_id, or unsupported result_format
401  Invalid or missing API key
404  Job not found
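
The documented status codes can be turned into a single exception type so that callers handle failures in one place. The class name DocuProxError and the function check_status are hypothetical; the messages are taken from the list above, and any other non-2xx code is reported generically.

```python
class DocuProxError(Exception):
    """Raised for non-success responses from /v1/job-results (hypothetical)."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


ERROR_MESSAGES = {
    400: "Missing or invalid job_id, or unsupported result_format",
    401: "Invalid or missing API key",
    404: "Job not found",
}


def check_status(status_code):
    """Return silently for 2xx codes; raise DocuProxError otherwise."""
    if 200 <= status_code < 300:
        return
    message = ERROR_MESSAGES.get(status_code, "Unexpected response")
    raise DocuProxError(status_code, message)
```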

Code Examples

Complete code examples for integrating the Process API in popular programming languages.

JavaScript / Node.js

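// Node.js 18+ (built-in fetch); run as an ES module for top-level await
const response = await fetch('https://api.docuprox.com/v1/process', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-auth': 'dp-123-your-api-key'
  },
  body: JSON.stringify({
    actual_image: 'data:image/jpeg;base64,...',
    template_id: '123e4567-e89b-12d3-a456-426614174000',
    // static_values is sent as a JSON string, not a nested object
    static_values: JSON.stringify({ company_name: 'Acme Corp' })
  })
});
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}
const result = await response.json();
console.log(result);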

Python

import base64
import json

import requests

# Read the image and encode it as base64
with open('document.jpg', 'rb') as f:
    image_data = base64.b64encode(f.read()).decode()

response = requests.post(
    'https://api.docuprox.com/v1/process',
    headers={
        'Content-Type': 'application/json',
        'x-auth': 'dp-123-your-api-key',
    },
    json={
        'actual_image': f'data:image/jpeg;base64,{image_data}',
        'template_id': '123e4567-e89b-12d3-a456-426614174000',
        # static_values is sent as a JSON string, not a nested object
        'static_values': json.dumps({'company_name': 'Acme Corp'}),
    },
    timeout=30,
)
response.raise_for_status()  # raise on 4xx/5xx responses
result = response.json()
print(result)

cURL Examples

JSON Request

curl -X POST https://api.docuprox.com/v1/process \
  -H "Content-Type: application/json" \
  -H "x-auth: dp-123-your-api-key" \
  -d '{
    "actual_image": "data:image/jpeg;base64,/9j/4AAQ...",
    "template_id": "123e4567-e89b-12d3-a456-426614174000",
    "static_values": "{ \"company_name\": \"Acme Corp\" }"
  }'

Form Data Request

curl -X POST https://api.docuprox.com/v1/process \
  -H "x-auth: dp-123-your-api-key" \
  -F "actual_image=@/path/to/document.jpg" \
  -F "template_id=123e4567-e89b-12d3-a456-426614174000" \
  -F "static_values={ \"company_name\": \"Acme Corp\" }"

Binary File Upload (PUT Request)

curl --location --globoff --request PUT \
  'https://api.docuprox.com/v1/process?t_id=YOUR_TEMPLATE_ID' \
  --header 'x-auth: dp-123-your-api-key' \
  --header 'Content-Type: application/octet-stream' \
  --data-binary '@/path/to/document.pdf'

Best Practices

  • Error Handling: Always check response status
  • Retry Logic: Implement exponential backoff
  • Rate Limiting: Respect API limits
  • Image Optimization: Compress images appropriately
  • Template Testing: Test thoroughly before production
  • Monitoring: Track API usage and performance
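
The retry advice above can be sketched as exponential backoff with jitter. The set of retryable status codes (429 and common 5xx codes) is an assumption, since this documentation does not list which responses are safe to retry; call is any caller-supplied function returning a (status_code, body) pair.

```python
import random
import time


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(429, 500, 502, 503, 504), sleep=time.sleep):
    """Invoke call() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = call()
        if status not in retry_on or attempt == max_attempts - 1:
            return status, body
        # Double the delay on each attempt, capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Add up to 50% jitter so parallel clients do not retry in lockstep.
        sleep(delay + random.uniform(0, delay / 2))


# Usage with a stand-in call that is rate limited twice, then succeeds.
responses = iter([(429, ""), (429, ""), (200, "ok")])
waits = []
result = call_with_backoff(lambda: next(responses), sleep=waits.append)
```

Injecting sleep keeps the helper testable; production code can rely on the default time.sleep.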