DocuProx Core Features
Master the essential features of DocuProx: create powerful templates for document processing and integrate seamlessly with your existing business systems and workflows.
Templates
A DocuProx Template is a user-defined blueprint that teaches the DocuProx AI how to recognize, locate, and extract specific pieces of information from a particular type of document.
Key aspects of a DocuProx Template:
Folders
Organize and categorize your templates for better management. Create folders to group templates by document type, department, or any custom classification system that suits your workflow.
Reference Image Mode
Choose between two extraction approaches based on your document characteristics:
With Reference Image
Use this mode when you have a standard document format and want to guide the AI with a reference image. In this mode, you can define:
Without Reference Image
Perfect for variable document layouts and multi-page PDFs. Processes documents directly without requiring a reference image, offering flexible extraction capabilities.
Agent Mode
Designed for AI agents that process documents autonomously. Instead of relying on predefined fields, an agent supplies custom prompts and defines the extraction fields dynamically via the /v1/process-agent endpoint.
Custom Instructions (Optional)
For specialized documents that the AI hasn't been extensively trained on, use this field to describe each aspect of your document in natural language. This helps the AI better understand the document structure and content.
JSON Structure Builder
A dynamic JSON builder that allows you to define the exact structure of your desired response format. You can:
- Add elements and nodes (objects) to create complex JSON structures
- Define target keys where extracted values will be populated at runtime when your system calls the backend API
- Configure VALUE, RECTANGLE, PROMPT, and STATIC-based extraction methods within the JSON structure for precise data mapping
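To make the idea of target keys more concrete, here is a minimal sketch of a nested response skeleton and of how empty targets could be filled at runtime. Everything in it is an assumption made for illustration: the key names, the dotted-path convention, and the `fill_targets` helper are hypothetical and do not reflect DocuProx's actual template schema or API.

```python
from typing import Any, Dict

# Hypothetical response skeleton; key names are illustrative only.
# None marks a target key that extraction would populate at runtime.
SKELETON: Dict[str, Any] = {
    "invoice": {
        "number": None,
        "total": None,
        "vendor": {"name": None},
    },
    "source": "docuprox",  # fixed value, analogous to a STATIC entry
}


def fill_targets(node: Any, values: Dict[str, Any], prefix: str = "") -> Any:
    """Return a copy of `node` with None leaves replaced by values keyed by dotted path."""
    if isinstance(node, dict):
        return {
            key: fill_targets(child, values, f"{prefix}.{key}" if prefix else key)
            for key, child in node.items()
        }
    # Leaves: fill empty targets, keep fixed values untouched.
    return values.get(prefix) if node is None else node


# Stand-in for values produced by extraction (hypothetical data).
extracted = {"invoice.number": "INV-1001", "invoice.vendor.name": "Acme"}
result = fill_targets(SKELETON, extracted)
print(result)
```

Targets with no extracted value stay `None`, which makes missing fields easy to spot when testing a template.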
Test
After configuring your template, use the test feature to validate that your template returns the desired data in the correct format. You'll need to generate an API key to use this testing functionality, then upload sample documents to verify extraction accuracy before deploying to production.
Supported Formats
Images
- JPEG
- PNG
- GIF, BMP, TIFF
Documents
- PDF (single page)
- PDF (multi-page)
- ZIP (batch uploads)
Input Methods
- Base64 encoded
- File uploads
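As a rough illustration of preparing input, the sketch below packs several documents into a ZIP archive for a batch upload and base64 encodes a single document, using only the Python standard library. The file names and contents are placeholders, and any requirements DocuProx places on archive layout or on the field that carries the base64 string are not covered here.

```python
import base64
import io
import zipfile
from typing import Dict


def build_batch_zip(documents: Dict[str, bytes]) -> bytes:
    """Pack named documents into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in documents.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def to_base64(data: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string."""
    return base64.b64encode(data).decode("ascii")


# Placeholder contents standing in for real image and PDF bytes.
docs = {"invoice-1.png": b"fake image bytes", "invoice-2.pdf": b"fake pdf bytes"}
archive_bytes = build_batch_zip(docs)
encoded = to_base64(docs["invoice-1.png"])
print(len(archive_bytes), encoded)
```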
Integrations
DocuProx integrates seamlessly with your existing business systems and workflows through our robust API-first architecture.
Common Integration Scenarios
CRM Systems
Automatically process customer documents and populate CRM records (Salesforce, HubSpot).
Accounting Software
Extract invoice data and feed directly into QuickBooks, SAP, or ERP systems.
Document Management
Integrate with SharePoint, Google Drive, or custom document repositories.
Integration Methods
RESTful API
Direct API calls for real-time document processing and data extraction.
Webhooks
Real-time notifications and event-driven processing for automated workflows.
SDKs & Libraries
Pre-built libraries for popular programming languages (Python, JavaScript, Java).
Example Integration Workflow
Document Upload
User uploads document to your system
API Call
Your system calls DocuProx API with document
Data Extraction
DocuProx processes and extracts data
Data Storage
Structured data stored in your database
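The four steps above can be sketched as a single handler. This is a minimal outline under stated assumptions: the `extract` callable is a stand-in for whatever call your system makes to DocuProx, `fake_extract` returns invented data, and the `extractions` table is a hypothetical schema in an in-memory SQLite database.

```python
import json
import sqlite3
from typing import Any, Callable, Dict

Extractor = Callable[[bytes, str], Dict[str, Any]]


def handle_upload(
    document: bytes, template_id: str, extract: Extractor, db: sqlite3.Connection
) -> int:
    """Send an uploaded document for extraction and store the structured result."""
    data = extract(document, template_id)  # steps 2 and 3: API call and extraction
    cursor = db.execute(                   # step 4: data storage
        "INSERT INTO extractions (template_id, payload) VALUES (?, ?)",
        (template_id, json.dumps(data)),
    )
    db.commit()
    return cursor.lastrowid


def fake_extract(document: bytes, template_id: str) -> Dict[str, Any]:
    """Stand-in for the real DocuProx call; returns invented data."""
    return {"bytes_received": len(document)}


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE extractions (id INTEGER PRIMARY KEY, template_id TEXT, payload TEXT)"
)
row_id = handle_upload(b"%PDF-1.4 placeholder", "your-template-uuid", fake_extract, db)
stored = db.execute("SELECT payload FROM extractions WHERE id = ?", (row_id,)).fetchone()
print(row_id, stored[0])
```

Passing the extraction call in as a parameter keeps the storage logic testable without network access.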
Webhook Integration
Receive real-time notifications when your document processing is complete. Configure webhooks to automatically receive extracted data in your application without polling.
Key Benefits
- ✓ Real-time result delivery via webhook
- ✓ No need for API polling - results are pushed to you
- ✓ Automated integration with your system
- ✓ Fast, non-blocking job submission
- ✓ Supports images, multi-page PDFs, and ZIP batch uploads
How It Works
Configure Your Webhook
Set up a webhook URL from the dashboard for your template. You can also configure custom headers for authentication.
Submit a Processing Job
Use the POST /process-job API with your template_id and document (image, PDF, or ZIP file).
Background Processing
Documents are securely stored and processed asynchronously. Credits are deducted based on document type.
Webhook Result Delivery
Once complete, a POST request is sent to your webhook URL with extracted data in JSON format.
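On the receiving side, a webhook endpoint only needs to accept a POST request, check any custom authentication header, and parse the JSON body. The sketch below uses Python's standard `http.server` module. The header name `X-Webhook-Secret`, the secret value, and the assumption that the payload is a JSON object are all hypothetical; the actual payload fields are described in the webhook documentation, not here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, Mapping, Optional, Tuple

SECRET_HEADER = "X-Webhook-Secret"  # hypothetical custom header name
EXPECTED_SECRET = "change-me"       # hypothetical shared secret


def handle_delivery(
    headers: Mapping[str, str], body: bytes
) -> Tuple[int, Optional[Dict[str, Any]]]:
    """Validate one webhook delivery and return (HTTP status, parsed payload)."""
    if headers.get(SECRET_HEADER) != EXPECTED_SECRET:
        return 401, None
    try:
        payload = json.loads(body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return 400, None
    if not isinstance(payload, dict):
        return 400, None
    return 200, payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_delivery(self.headers, self.rfile.read(length))
        if payload is not None:
            print("received extraction result:", payload)
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the receiver locally. Because failed deliveries are retried, a production handler would also want to tolerate receiving the same result more than once.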
Job Status Values
| Status | Description |
|---|---|
| NEW | Job has been created and is queued for processing |
| UNZIPPING FILE | Extracting files from uploaded ZIP archive |
| UNZIP FILE SUCCESS | ZIP archive extraction completed successfully |
| UNZIP FILE FAILED | ZIP archive extraction failed |
| PROCESSING IMAGE | Processing document images for data extraction |
| PROCESS IMAGE SUCCESS | Image processing completed successfully |
| PROCESS IMAGE FAILED | Image processing failed |
| SUCCESS | Job completed successfully, results are ready |
| FAILED | Job processing failed |
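Client code usually only needs to know whether a job is still running, finished, or failed. The helper below groups the status values from the table accordingly. The grouping is an assumption: it treats the step-level failures as failures of the whole job and treats only `SUCCESS` as meaning results are ready, which the table suggests but does not state outright.

```python
# Status strings copied from the table above; the grouping is an assumption.
IN_PROGRESS = {
    "NEW",
    "UNZIPPING FILE",
    "UNZIP FILE SUCCESS",
    "PROCESSING IMAGE",
    "PROCESS IMAGE SUCCESS",
}
FAILURES = {"UNZIP FILE FAILED", "PROCESS IMAGE FAILED", "FAILED"}


def classify_status(status: str) -> str:
    """Map a job status to 'done', 'failed', or 'in_progress'."""
    if status == "SUCCESS":
        return "done"
    if status in FAILURES:
        return "failed"
    if status in IN_PROGRESS:
        return "in_progress"
    raise ValueError(f"unknown job status: {status!r}")


for value in ("NEW", "PROCESS IMAGE SUCCESS", "SUCCESS", "UNZIP FILE FAILED"):
    print(value, "->", classify_status(value))
```

Raising on unknown values, rather than guessing, surfaces any status that is added later.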
Security & Reliability
- 🔒 API authentication required for job submission
- 🔒 Webhook URLs and headers stored securely
- 🔒 Automatic retry mechanism for failed webhooks
- 🔒 Files automatically deleted after processing
Job Results API
As an alternative to webhooks, you can retrieve processed results using the POST /v1/job-results endpoint.
This is useful when you prefer to poll for results or need to retrieve results at a later time.
Output Formats
- json: Structured JSON response
- csv: CSV format for spreadsheets
Request Parameters
- job_id: UUID of the job
- result_format: json or csv
⏱️ 24-Hour Retention: Job results are stored for 24 hours after processing completion. Make sure to retrieve your results within this window. After 24 hours, results are automatically deleted and cannot be recovered.
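A request to this endpoint could be assembled as in the sketch below, which builds the request without sending it. Several details are assumptions: that the two parameters travel in a JSON body, that authentication uses a bearer token in the `Authorization` header, and the placeholder base URL. Only the path, the HTTP method, the parameter names, and the 24-hour window come from the text above.

```python
import json
import uuid
from datetime import datetime, timedelta, timezone
from urllib.request import Request

RETENTION = timedelta(hours=24)  # retention window stated above


def build_job_results_request(
    base_url: str, api_key: str, job_id: str, result_format: str = "json"
) -> Request:
    """Build (but do not send) a POST /v1/job-results request."""
    if result_format not in ("json", "csv"):
        raise ValueError("result_format must be 'json' or 'csv'")
    uuid.UUID(job_id)  # raises ValueError if job_id is not a valid UUID
    body = json.dumps({"job_id": job_id, "result_format": result_format})
    return Request(
        url=base_url.rstrip("/") + "/v1/job-results",
        data=body.encode("utf-8"),  # assumption: parameters sent as a JSON body
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumption: header scheme
        },
        method="POST",
    )


def results_available(completed_at: datetime, now: datetime) -> bool:
    """Return True while a finished job is still inside the retention window."""
    return now - completed_at < RETENTION


request = build_job_results_request(
    "https://api.example.invalid",  # placeholder base URL
    "your_api_key",
    "123e4567-e89b-12d3-a456-426614174000",
    "csv",
)
finished = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(request.get_method(), request.full_url)
print(results_available(finished, finished + timedelta(hours=23)))
```

The request can be sent with `urllib.request.urlopen(request)` once the real base URL and authentication scheme are known.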
Learn More: For detailed API reference, payload examples, and implementation guides, visit the Webhook Integration Documentation.
SDKs
Official SDKs for Python and Node.js. Build powerful document processing applications with our easy-to-use SDK packages that enable automated document extraction and processing.
Python SDK
Official Python package for DocuProx. Simple API for processing files and base64 data using document templates.
pip install docuprox
Node.js SDK
Official Node.js package for DocuProx. Process files or base64 data with a promise-based API.
npm install docuprox
Quick Start Examples
Python
```python
from docuprox import Docuprox

# Initialize client
client = Docuprox(api_key="your_api_key")

# Process a file
result = client.processfile(
    "invoice.pdf",
    "your-template-uuid"
)
print(result)
```
Node.js
```javascript
const Docuprox = require("docuprox");

// Initialize client
const client = new Docuprox();

// Process a file
(async () => {
  const result = await client.processFile(
    "invoice.pdf",
    "your-template-uuid"
  );
  console.log(result);
})();
```
SDK Features
Easy Integration
Simple APIs to quickly integrate document processing into any application.
Multiple Input Types
Process local files directly or send base64 encoded data to the API.
Async Support
Node.js SDK provides promise-based async operations for non-blocking workflows.
Environment Config
Configure API URL and API keys easily using environment variables.
Batch Processing
Process multiple documents efficiently with batch processing support.
AI Agent Processing
Use AI agents to extract custom data with dynamic prompts.
Learn More: For detailed installation guides, API reference, and advanced usage, visit the SDK Documentation.