Process documents with the Nutrient DWS API. Use this skill when the user wants to convert documents (PDF, DOCX, XLSX, PPTX, HTML, images), extract text or tables from PDFs, OCR scanned documents, redact sensitive information (PII, SSN, emails, credit cards), add watermarks, digitally sign PDFs, fill PDF forms, or check API credit usage. Activates on keywords: PDF, document, convert, extract, OCR, redact, watermark, sign, merge, compress, form fill, document processing.
Install
Install via the SkillsCat registry:
npx skillscat add pspdfkit-labs/nutrient-agent-skill/nutrient-document-processing
Nutrient Document Processing
Process, convert, extract, redact, sign, and manipulate documents using the Nutrient DWS Processor API.
Setup
You need a Nutrient DWS API key. Get one free at https://dashboard.nutrient.io/sign_up/?product=processor.
Option 1: MCP Server (Recommended)
If your agent supports MCP (Model Context Protocol), use the Nutrient DWS MCP Server. It provides all operations as native tools.
Configure your MCP client (e.g., claude_desktop_config.json or .mcp.json):
{
  "mcpServers": {
    "nutrient-dws": {
      "command": "npx",
      "args": ["-y", "@nutrient-sdk/dws-mcp-server"],
      "env": {
        "NUTRIENT_DWS_API_KEY": "YOUR_API_KEY",
        "SANDBOX_PATH": "/path/to/working/directory"
      }
    }
  }
}
Then use the MCP tools directly (e.g., convert_to_pdf, extract_text, redact).
Option 2: Direct API (curl)
For agents without MCP support, call the API directly:
export NUTRIENT_API_KEY="your_api_key_here"
All requests go to https://api.nutrient.io/build as multipart POST with an instructions JSON field.
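The request shape described above (one uploaded file plus an instructions JSON field) can be wrapped in a small shell helper. This is a minimal sketch, not part of the Nutrient tooling: the function names `single_file_instructions` and `nutrient_build` are hypothetical, and it assumes file names contain no double quotes or backslashes. It uses curl's `--form-string` so the JSON is sent literally.

```shell
# Hypothetical helper: print the simplest instructions document,
# a single part that refers to an uploaded file by its field name.
single_file_instructions() {
  printf '{"parts":[{"file":"%s"}]}\n' "$1"
}

# Hypothetical helper: POST one file plus an instructions JSON to /build.
# Usage: nutrient_build <input-file> <instructions-json> <output-file>
# Runs in a subshell (parentheses) so its variables do not leak.
nutrient_build() (
  input="$1"
  instructions="$2"
  output="$3"
  name=$(basename "$input")
  # --fail makes curl return a non-zero exit code on HTTP errors.
  curl -sS --fail -X POST https://api.nutrient.io/build \
    -H "Authorization: Bearer $NUTRIENT_API_KEY" \
    -F "$name=@$input" \
    --form-string "instructions=$instructions" \
    -o "$output"
)

# Example call (needs a valid key and network access):
#   nutrient_build document.docx "$(single_file_instructions document.docx)" output.pdf
```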
Operations
1. Convert Documents
Convert between PDF, DOCX, XLSX, PPTX, HTML, and image formats.
HTML to PDF:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "index.html=@index.html" \
-F 'instructions={"parts":[{"html":"index.html"}]}' \
-o output.pdf
DOCX to PDF:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.docx=@document.docx" \
-F 'instructions={"parts":[{"file":"document.docx"}]}' \
-o output.pdf
PDF to DOCX/XLSX/PPTX:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"docx"}}' \
-o output.docx
Image to PDF:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "image.jpg=@image.jpg" \
-F 'instructions={"parts":[{"file":"image.jpg"}]}' \
-o output.pdf
2. Extract Text and Data
Extract plain text:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"text"}}' \
-o output.txt
Extract tables (as JSON, CSV, or Excel):
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"xlsx"}}' \
-o tables.xlsx
Extract key-value pairs:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"extraction","strategy":"key-values"}]}' \
-o result.json
3. OCR Scanned Documents
Apply OCR to scanned PDFs or images, producing searchable PDFs with selectable text.
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "scanned.pdf=@scanned.pdf" \
-F 'instructions={"parts":[{"file":"scanned.pdf"}],"actions":[{"type":"ocr","language":"english"}]}' \
-o searchable.pdf
Supported languages: english, german, french, spanish, italian, portuguese, dutch, swedish, danish, norwegian, finnish, polish, czech, turkish, japanese, korean, chinese-simplified, chinese-traditional, arabic, hebrew, thai, hindi, russian, and more.
4. Redact Sensitive Information
Pattern-based redaction (preset patterns):
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"preset","preset":"social-security-number"}]}' \
-o redacted.pdf
Available presets: social-security-number, credit-card-number, email-address, north-american-phone-number, international-phone-number, date, url, ipv4, ipv6, mac-address, us-zip-code, vin, time.
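Because a mistyped preset name is only discovered after an API call, it can help to check the name locally first. The sketch below is an illustration only: `KNOWN_PRESETS` and `preset_redaction_json` are hypothetical names, the list simply mirrors the presets named above, and the JSON follows the same shape as the preset example.

```shell
# Preset names copied from the list above (space-separated).
KNOWN_PRESETS="social-security-number credit-card-number email-address north-american-phone-number international-phone-number date url ipv4 ipv6 mac-address us-zip-code vin time"

# Hypothetical helper: print instructions for a preset redaction,
# or fail with exit status 1 if the preset is not in KNOWN_PRESETS.
# Usage: preset_redaction_json <file-name> <preset>
preset_redaction_json() (
  file="$1"
  preset="$2"
  case " $KNOWN_PRESETS " in
    *" $preset "*) ;;  # found as a whole word in the list
    *) echo "unknown preset: $preset" >&2; exit 1 ;;
  esac
  printf '{"parts":[{"file":"%s"}],"actions":[{"type":"redaction","strategy":"preset","preset":"%s"}]}\n' "$file" "$preset"
)

# Example: pass the result as the instructions field of a /build request.
#   preset_redaction_json document.pdf email-address
```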
Regex-based redaction:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"regex","regex":"\\b[A-Z]{2}\\d{6}\\b"}]}' \
-o redacted.pdf
AI-powered PII redaction:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"ai_redaction","criteria":"All personally identifiable information"}]}' \
-o redacted.pdf
The criteria field accepts natural language (e.g., "Names and phone numbers", "Protected health information", "Financial account numbers").
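Free-text criteria may contain double quotes or backslashes, which would break a hand-written JSON string. The following sketch escapes those two characters before embedding the text. `json_escape` and `ai_redaction_json` are hypothetical helper names, and the escaping is deliberately minimal: it does not handle newlines or other control characters.

```shell
# Hypothetical helper: escape backslashes and double quotes for use
# inside a JSON string (single-line input only).
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Hypothetical helper: print instructions for an AI redaction with
# natural-language criteria, in the same shape as the example above.
# Usage: ai_redaction_json <file-name> <criteria>
ai_redaction_json() (
  file=$(json_escape "$1")
  criteria=$(json_escape "$2")
  printf '{"parts":[{"file":"%s"}],"actions":[{"type":"ai_redaction","criteria":"%s"}]}\n' "$file" "$criteria"
)

# Example:
#   ai_redaction_json document.pdf 'Names and phone numbers'
```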
5. Add Watermarks
Text watermark:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"watermark","text":"CONFIDENTIAL","fontSize":48,"fontColor":"#FF0000","opacity":0.5,"rotation":45,"width":"50%","height":"50%"}]}' \
-o watermarked.pdf
Image watermark:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F "logo.png=@logo.png" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"watermark","imagePath":"logo.png","width":"30%","height":"30%","opacity":0.3}]}' \
-o watermarked.pdf
6. Digital Signatures
Sign a PDF with a CMS signature:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"sign","signatureType":"cms","signerName":"John Doe","reason":"Approval","location":"New York"}]}' \
-o signed.pdf
Sign with CAdES-B-LT (long-term validation):
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"sign","signatureType":"cades","cadesLevel":"b-lt","signerName":"Jane Smith"}]}' \
-o signed.pdf
7. Form Filling (Instant JSON)
Fill PDF form fields using Instant JSON format:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "form.pdf=@form.pdf" \
-F 'instructions={"parts":[{"file":"form.pdf"}],"actions":[{"type":"fillForm","fields":[{"name":"firstName","value":"John"},{"name":"lastName","value":"Doe"},{"name":"email","value":"john@example.com"}]}]}' \
-o filled.pdf
8. Merge and Split PDFs
Merge multiple PDFs:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "doc1.pdf=@doc1.pdf" \
-F "doc2.pdf=@doc2.pdf" \
-F 'instructions={"parts":[{"file":"doc1.pdf"},{"file":"doc2.pdf"}]}' \
-o merged.pdf
Extract specific pages:
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf","pages":{"start":0,"end":4}}]}' \
-o pages1-5.pdf
9. Render PDF Pages as Images
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf","pages":{"start":0,"end":0}}],"output":{"type":"png","dpi":300}}' \
-o page1.png
10. Check Credits
curl -X GET https://api.nutrient.io/credits \
-H "Authorization: Bearer $NUTRIENT_API_KEY"
Best Practices
- Use the MCP server when your agent supports it — it handles file I/O, error handling, and sandboxing automatically.
- Set SANDBOX_PATH to restrict file access to a specific directory.
- Check credit balance before batch operations to avoid interruptions.
- Use AI redaction for complex PII detection; use preset/regex redaction for known patterns (faster, cheaper).
- Chain operations — the API supports multiple actions in a single call (e.g., OCR then redact).
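As an illustration of chaining, the sketch below joins several action objects into a single instructions document, here OCR followed by a preset redaction. The helper name `chained_instructions` is hypothetical, and it assumes that actions are applied in the order they appear in the array.

```shell
# Hypothetical helper: combine any number of action objects into one
# instructions document for a single /build call.
# Usage: chained_instructions <file-name> <action-json>...
chained_instructions() (
  file="$1"
  shift
  actions=""
  for action in "$@"; do
    # Prefix a comma only when something has already been added.
    actions="${actions:+$actions,}$action"
  done
  printf '{"parts":[{"file":"%s"}],"actions":[%s]}\n' "$file" "$actions"
)

# Example: OCR first, then redact email addresses, in one request.
#   chained_instructions scanned.pdf \
#     '{"type":"ocr","language":"english"}' \
#     '{"type":"redaction","strategy":"preset","preset":"email-address"}'
```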
Troubleshooting
| Issue | Solution |
|---|---|
| 401 Unauthorized | Check your API key is valid and has credits |
| 413 Payload Too Large | Files must be under 100 MB |
| Slow AI redaction | AI analysis takes 60–120 seconds; this is normal |
| OCR quality poor | Try a different language parameter or improve scan quality |
| Missing text in extraction | Run OCR first on scanned documents |
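In a script, the two HTTP status rows of the table can be turned into readable hints. This is a rough sketch: `explain_status` is a hypothetical helper, the messages restate the table, and any other status is reported as unexpected. The status code itself can be captured with curl's `-w '%{http_code}'` option.

```shell
# Hypothetical helper: map an HTTP status code to a troubleshooting hint.
explain_status() {
  case "$1" in
    2??) echo "ok" ;;
    401) echo "check that the API key is valid and has credits" ;;
    413) echo "file too large: keep uploads under 100 MB" ;;
    *)   echo "unexpected status $1: see the API reference" ;;
  esac
}

# Example: capture the status code of a request, then explain it.
#   status=$(curl -sS -o output.pdf -w '%{http_code}' -X POST https://api.nutrient.io/build \
#     -H "Authorization: Bearer $NUTRIENT_API_KEY" \
#     -F "document.docx=@document.docx" \
#     -F 'instructions={"parts":[{"file":"document.docx"}]}')
#   explain_status "$status"
```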
More Information
- Full API reference — Detailed endpoints, parameters, and error codes
- API Playground — Interactive API testing
- API Documentation — Official guides
- MCP Server repo — Source code and issues