DocuProx Python SDK
Official Python package to interact with the DocuProx API for processing documents using templates. Easily integrate AI-powered document extraction into your Python applications.
Installation
Install the DocuProx Python package using pip:
pip install docuprox
Or add it to your requirements.txt:
docuprox>=1.0.0
Configuration
Create a .env file in your project root with your API credentials:
DOCUPROX_API_URL=https://api.docuprox.com/v1
DOCUPROX_API_KEY=your-api-key-here
Or set environment variables directly:
export DOCUPROX_API_URL=https://api.docuprox.com/v1
export DOCUPROX_API_KEY=your-api-key-here
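The lookup described above can be sketched as a small helper. This is only an illustration and not part of the SDK: `load_docuprox_config` is a hypothetical name, and the fallback URL is simply the default shown in this document.

```python
import os

DEFAULT_API_URL = "https://api.docuprox.com/v1"  # default URL stated in this document


def load_docuprox_config(env=None):
    """Return (api_url, api_key) read from a mapping of environment variables.

    Hypothetical helper, not part of the docuprox package.
    """
    env = os.environ if env is None else env
    api_url = env.get("DOCUPROX_API_URL", DEFAULT_API_URL)
    api_key = env.get("DOCUPROX_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of sending an unauthenticated request
        raise RuntimeError("DOCUPROX_API_KEY is not set")
    return api_url, api_key
```

The returned pair can then be passed on as `Docuprox(api_url=api_url, api_key=api_key)`.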
Quick Start
Get started quickly with the DocuProx client:
from docuprox import Docuprox
# Initialize the client (API key required, can be set via DOCUPROX_API_KEY env var)
client = Docuprox(api_key="your-api-key-here") # Uses default URL: https://api.docuprox.com/v1
# Or set custom URL and API key
client = Docuprox(api_url="https://your-custom-api.com/v1", api_key="your-api-key-here")
# Or use environment variables (recommended for production)
# Set DOCUPROX_API_URL and DOCUPROX_API_KEY environment variables
client = Docuprox() # Will use env vars or defaults
# Process a file with a template (sends as multipart/form-data)
template_id = "your-template-uuid-here"
result = client.processfile("path/to/your/file.pdf", template_id)
print(result)
# Process base64 data with a template (sends as JSON)
base64_string = "your_base64_encoded_data_here"
result = client.processbase64(base64_string, template_id)
print(result)
# Process with static values (optional key-value pairs)
static_values = {
    "company_name": "Acme Corp",
    "invoice_number": "INV-2024-001"
}
result = client.processfile("path/to/your/file.pdf", template_id, static_values=static_values)
print(result)
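The Quick Start uses a placeholder for the base64 string. One way to produce it from a local file is sketched below. `file_to_base64` is a hypothetical helper, and it assumes the API expects plain standard base64 without a data-URI prefix, which this document does not state.

```python
import base64
from pathlib import Path


def file_to_base64(path):
    """Read a file and return its contents as a base64-encoded ASCII string.

    Hypothetical helper; assumes plain standard base64 is what the API expects.
    """
    raw = Path(path).read_bytes()
    return base64.b64encode(raw).decode("ascii")
```

The result could then be passed as the first argument of `client.processbase64(...)`.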
Batch Processing with Zip Files
You can process multiple documents at once by uploading a zip file. All images and PDFs must be placed in the root of the zip file.
✅ Correct Structure
documents.zip
├── invoice1.pdf
├── invoice2.pdf
├── receipt1.jpg
├── receipt2.png
└── document3.pdf
❌ Incorrect Structure
documents.zip
└── invoices/
    ├── invoice1.pdf
    └── invoice2.pdf
Usage:
# Process a zip file containing multiple documents
result = client.processfile("documents.zip", template_id)
# Or with static values
result = client.processfile(
    "documents.zip",
    template_id,
    static_values={"batch_id": "BATCH-001"}
)
Important: Files in subdirectories will be ignored. Supported formats include PDF, JPG, JPEG, PNG, TIFF, etc.
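Because files in subdirectories are ignored, it can help to build the archive programmatically so that every entry lands at the root. The sketch below uses only the standard library; `build_flat_zip` is a hypothetical helper, not part of the SDK, and rejecting duplicate file names is a design choice of this example.

```python
import zipfile
from pathlib import Path


def build_flat_zip(file_paths, zip_path):
    """Write the given files into zip_path with every entry at the archive root."""
    paths = [Path(p) for p in file_paths]
    names = [p.name for p in paths]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        # Two files with the same name would collide once directories are dropped
        raise ValueError(f"duplicate file names in batch: {duplicates}")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path, name in zip(paths, names):
            # arcname=name strips directory components from the stored entry
            archive.write(path, arcname=name)
    return Path(zip_path)
```

The resulting path can be passed to `client.processfile(...)` in place of `"documents.zip"`.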
Static Values
Static values allow you to pass predefined key-value pairs to the processing API. These values will be included in the returned response.
Default
If no static_values are provided, the response will include values set in the template UI.
Override
If you provide static_values, these will override the UI defaults.
Example:
# Without static_values - uses UI default
result = client.processfile("file.pdf", template_id)
# Response will include: company_name = "Docuprox" (from UI)
# With static_values - overrides UI default
static_values = {
    "company_name": "Acme Corp",  # Overrides "Docuprox" from UI
    "invoice_number": "INV-2024-001",
    "date": "2024-01-15"
}
result = client.processfile("file.pdf", template_id, static_values=static_values)
# Response will include: company_name = "Acme Corp" (from static_values)
# Works with all processing methods
result = client.processbase64(base64_string, template_id, static_values=static_values)
result = client.processjobfile("file.pdf", template_id, static_values=static_values)
result = client.processjobbase64(base64_string, template_id, static_values=static_values)
Async Job Processing
For long-running or batch processing tasks, use the async job workflow:
from docuprox import Docuprox
client = Docuprox(api_key="your-api-key-here")
template_id = "your-template-uuid-here"
# Step 1: Submit a job (returns immediately with job_id)
job_response = client.processjobfile("path/to/your/file.pdf", template_id)
job_id = job_response['job_id']
print(f"Job submitted: {job_id}")
# Or submit with base64 data
job_response = client.processjobbase64(base64_string, template_id)
job_id = job_response['job_id']
# Step 2: Check job status
status = client.getjobstatus(job_id)
print(f"Job status: {status}")
# Step 3: Retrieve results when job is complete (default: JSON format)
results = client.getjobresults(job_id)
print(results)
# Or get results in CSV format
results_csv = client.getjobresults(job_id, result_format="csv")
print(results_csv)
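Steps 2 and 3 are usually combined into a polling loop. The sketch below is one way to do that. `wait_for_job` is a hypothetical helper, and it assumes the status response is a dict with a `"status"` key holding one of the states named in the API Reference (pending, processing, completed, failed). The exact response shape is not specified in this document, so adjust the lookup to match what the API returns.

```python
import time


def wait_for_job(client, job_id, poll_interval=2.0, timeout=300.0):
    """Poll getjobstatus until the job completes, then return getjobresults.

    Assumption: the status response is a dict with a "status" key whose value
    is "pending", "processing", "completed", or "failed".
    """
    deadline = time.monotonic() + timeout
    while True:
        response = client.getjobstatus(job_id)
        state = response.get("status")
        if state == "completed":
            return client.getjobresults(job_id)
        if state == "failed":
            raise RuntimeError(f"Job {job_id} failed: {response}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} not finished after {timeout} seconds")
        time.sleep(poll_interval)  # wait before asking again
```

Called as `wait_for_job(client, job_id)`, it blocks until results are available, the job fails, or the timeout elapses.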
AI Agent Processing
Use the AI agent endpoint for intelligent document processing with custom prompts:
from docuprox import Docuprox
client = Docuprox(api_key="your-api-key-here")
# Define your prompt configuration (field: instruction mapping)
prompt_json = {
    "invoice_number": "Extract the invoice number",
    "date": "Extract the invoice date",
    "total_amount": "Extract the total amount",
    "vendor_name": "Extract the vendor name"
}
# Process a file with AI agent
result = client.processagentfile(
    file_path="path/to/invoice.pdf",
    prompt_json=prompt_json,
    document_type="invoice",
    custom_instructions="Focus on itemized line items",  # Optional
    static_values={"company_name": "Acme Corp"}  # Optional
)
print(result)
# Or process base64 data with AI agent
result = client.processagentbase64(
    base64_data=base64_string,
    prompt_json=prompt_json,
    document_type="invoice",
    custom_instructions="Extract vendor details",  # Optional
    static_values={"invoice_prefix": "INV-2024"}  # Optional
)
print(result)
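When many fields follow the same "Extract the ..." pattern, the field-to-instruction mapping can be generated rather than typed by hand. The sketch below is purely illustrative: `build_prompt_json` is a hypothetical helper, and the default wording is just the pattern used in the example above, not something the API requires.

```python
def build_prompt_json(fields, overrides=None):
    """Map each field name to an extraction instruction.

    Hypothetical helper: instructions default to "Extract the <field name>",
    with underscores replaced by spaces; overrides supplies custom wording.
    """
    overrides = overrides or {}
    prompt = {}
    for field in fields:
        if not field or not field.strip():
            raise ValueError("field names must be non-empty")
        default = f"Extract the {field.replace('_', ' ')}"
        prompt[field] = overrides.get(field, default)
    return prompt
```

For instance, `build_prompt_json(["invoice_number", "date"], {"date": "Extract the invoice date"})` yields a dict that could be passed as `prompt_json`.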
API Reference
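Docuprox(api_url, api_key)
Creates a client for the DocuProx API. Both arguments are optional.
- api_url: The base URL of the DocuProx API. If omitted, the DOCUPROX_API_URL environment variable is used, or the default https://api.docuprox.com/v1.
- api_key: Your DocuProx API key. If omitted, the DOCUPROX_API_KEY environment variable is used.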
Single File Processing
processfile(file_path, template_id, static_values=None)
Processes a file by reading it, encoding to base64, and sending to the /process endpoint with the specified template.
- file_path: Path to the file to process.
- template_id: UUID string of the template to use for processing.
- static_values: Optional dictionary of static key-value pairs to include in processing.
- Returns: JSON response from the API containing document data.
- Raises: ValueError if file not found or API error.
processbase64(base64_data, template_id, static_values=None)
Processes a base64 encoded string by sending it to the /process endpoint with the specified template.
- base64_data: Base64 encoded string of the image/document.
- template_id: UUID string of the template to use for processing.
- static_values: Optional dictionary of static key-value pairs to include in processing.
- Returns: JSON response from the API containing document data.
- Raises: ValueError if API error.
Async Job Processing
processjobfile(file_path, template_id, static_values=None)
Submits an async processing job by sending a file to the /process-job endpoint. Returns immediately with a job_id.
- file_path: Path to the file to process.
- template_id: UUID string of the template to use for processing.
- static_values: Optional dictionary of static key-value pairs to include in processing.
- Returns: JSON response from the API containing job_id and status.
- Raises: ValueError if file not found or API error.
processjobbase64(base64_data, template_id, static_values=None)
Submits an async processing job with base64 encoded data to the /process-job endpoint. Returns immediately with a job_id.
- base64_data: Base64 encoded string of the image/document.
- template_id: UUID string of the template to use for processing.
- static_values: Optional dictionary of static key-value pairs to include in processing.
- Returns: JSON response from the API containing job_id and status.
- Raises: ValueError if API error.
getjobstatus(job_id)
Checks the status of a processing job.
- job_id: UUID string of the job to check.
- Returns: JSON response from the API with job status information (e.g., pending, processing, completed, failed).
- Raises: ValueError if job_id is invalid or API error.
getjobresults(job_id, result_format="json")
Retrieves the results of a completed processing job.
- job_id: UUID string of the job to retrieve results for.
- result_format: Format of results - 'json' or 'csv' (default: 'json').
- Returns: JSON response from the API with job results in the specified format.
- Raises: ValueError if job_id is invalid, format is invalid, or API error.
AI Agent Processing
processagentfile(file_path, prompt_json, document_type, custom_instructions=None, static_values=None)
Processes a file using AI agent for intelligent document extraction.
- file_path: Path to the file to process.
- prompt_json: JSON object/dict containing the prompt configuration with fields and instructions.
- document_type: Type of document being processed (e.g., "invoice", "receipt", "contract").
- custom_instructions: Optional custom instructions for processing.
- static_values: Optional dictionary of static key-value pairs to include in processing.
- Returns: JSON response from the API with AI-extracted data.
- Raises: ValueError if file not found or API error.
processagentbase64(base64_data, prompt_json, document_type, custom_instructions=None, static_values=None)
Processes base64 encoded data using AI agent for intelligent document extraction.
- base64_data: Base64 encoded string of the image/document.
- prompt_json: JSON object/dict containing the prompt configuration with fields and instructions.
- document_type: Type of document being processed (e.g., "invoice", "receipt", "contract").
- custom_instructions: Optional custom instructions for processing.
- static_values: Optional dictionary of static key-value pairs to include in processing.
- Returns: JSON response from the API with AI-extracted data.
- Raises: ValueError if API error.
Error Handling
All methods raise ValueError when an operation fails. Always wrap calls in try/except blocks:
try:
    result = client.processfile("./file.pdf", "template-uuid")
    print(result)
except ValueError as error:
    print(f"Error: {error}")
    # Handle error appropriately
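For transient API failures, the try/except pattern above can be extended into a retry loop. The sketch below is one possible approach, not SDK behavior: `call_with_retry` is a hypothetical helper. Because the SDK raises ValueError for a missing file as well as for API errors, retrying only makes sense after confirming that the input file exists.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=1.0):
    """Call operation(); on ValueError, retry with exponential backoff.

    Hypothetical helper. Re-raises the last ValueError once attempts run out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ValueError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

A call could then look like `call_with_retry(lambda: client.processfile("./file.pdf", "template-uuid"))`.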
Key Features
Simplicity First
Intuitive Pythonic API for fast document automation.
Batch Optimized
Effortlessly handle thousands of documents via ZIP batching.
AI-Powered
Leverage advanced AI agents for unstructured data extraction.
Enterprise Ready
Robust error handling and flexible configuration options.