-
95%+
Straight-through processing (STP) within 60 days
-
70%
Faster invoice cycle time
-
<1.5%
Extraction error rate (post-validation)
-
30-50%
Reduction in AP processing cost per invoice
Project Overview
A mid-market omnichannel distributor (~35K invoices/month) struggled with growing AP volume, late-payment penalties, and a month-end close crunch. Invoices arrived via email, supplier portals, and EDI/PDFs with inconsistent layouts. Manual data entry and exception resolution created delays and errors.
The small AP team spent hours re-keying fields, chasing buyers for PO confirmations, and reconciling partial receipts, so early-pay discounts were often missed while late fees piled up. Month-end spikes pushed invoice cycle time to 3-5 days and routinely slipped the close into the second week, frustrating FP&A and vendors alike.
Key Challenges
-
Layout variability
Invoices arrive in every shape and size: scanned PDFs, system-generated PDFs, EDI/XML, even photos. Tables run across pages, totals live on the last page, and stamps/notes break the layout. A single extraction approach won’t work across thousands of supplier formats.
-
Matching complexity (2/3-way)
The system must reconcile invoice lines to POs and goods-received notes (GRNs), while handling partial receipts, backorders, and price variances. Freight, discounts, and tax treatments differ by vendor and item, so tolerances and allocation rules are essential.
-
Data quality
Low-quality scans, skewed pages, or faded prints can trip OCR and misread key fields. Line-item parsing is brittle when columns shift. Even after extraction, values need normalization (currency, units of measure, and tax codes) before they’re safe to post.
-
Controls & compliance
Finance needs speed without losing separation of duties or audit trails. Every auto-post must be explainable and traceable, with approvals where required. On top of that, vendor fraud risks (duplicate invoices, bank detail changes, look-alike domains) must be flagged early.
-
Scale & latency
Volume spikes at month-end and during seasonal peaks demand steady performance. The pipeline should parse and validate most invoices in under 10 seconds, support concurrent processing, and handle bulk backfills without slowing down day-to-day work.
Solution Overview
TriState Technology delivered a Document-AI powered AP pipeline that ingests invoices from email, S3, and portals; extracts structured fields with a layout-aware model; validates and normalizes entities; performs 2-/3-way matching; and auto-posts clean invoices to the ERP. Exceptions route to a human-in-the-loop queue with smart suggestions and supplier feedback loops.
Architecture: What We Built
-
Ingestion
Invoices flow in from AP inboxes and supplier portals to an email listener and an S3 watcher that accept PDF and image files. Each file is virus-scanned, signature-checked, and de-duplicated by content hash before the immutable original is stored.
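The hash-based de-duplication step can be sketched roughly as follows. This is a minimal illustration, not the production code: SHA-256 is an assumed choice of hash, and `DedupStore` is a hypothetical in-memory stand-in for whatever persistent store the pipeline uses.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


class DedupStore:
    """Hypothetical in-memory stand-in for a persistent store of seen hashes."""

    def __init__(self) -> None:
        # Maps digest -> the first source that delivered the file.
        self._seen: dict[str, str] = {}

    def register(self, data: bytes, source: str) -> tuple[bool, str]:
        """Return (is_new, digest); a duplicate keeps the first source on record."""
        digest = content_hash(data)
        if digest in self._seen:
            return False, digest
        self._seen[digest] = source
        return True, digest


store = DedupStore()
first = store.register(b"%PDF-1.7 invoice 1001", "email")
second = store.register(b"%PDF-1.7 invoice 1001", "portal")
print(first[0], second[0])  # True False
```

A byte-level hash only catches exact copies of the same file. A re-scanned or re-exported copy of the same invoice produces different bytes, so a business-key check (for example vendor plus invoice number) would still be needed further down the pipeline.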
-
Parsing & Extraction
A layout-aware vision+LLM pipeline reads headers, totals, and multi-page tables with header/footer detection. It uses vendor-adaptive few-shot hints with a fallback OCR path and outputs strict, schema-validated JSON with per-field confidence.
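To illustrate what "strict, schema-validated JSON with per-field confidence" might look like in practice, here is a small sketch using only the Python standard library. The field names, the `{"value": ..., "confidence": ...}` shape, and the `validate_extraction` helper are assumptions made for the example; the actual schema is not described in this case study.

```python
import json

# Hypothetical schema: field name -> accepted Python types for its value.
SCHEMA = {
    "invoice_number": (str,),
    "vendor_name": (str,),
    "currency": (str,),
    "total": (int, float),
}


def validate_extraction(raw: str) -> tuple[dict, list[str]]:
    """Parse model output; return (fields, errors). No errors means valid."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {}, [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return {}, ["top-level value must be an object"]

    errors: list[str] = []
    # Strict mode: fields outside the schema are rejected, not ignored.
    for name in sorted(payload.keys() - SCHEMA.keys()):
        errors.append(f"unexpected field: {name}")
    for name, types in SCHEMA.items():
        field = payload.get(name)
        if not isinstance(field, dict):
            errors.append(f"missing field: {name}")
            continue
        value, conf = field.get("value"), field.get("confidence")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, types):
            errors.append(f"{name}: wrong type {type(value).__name__}")
        if (isinstance(conf, bool) or not isinstance(conf, (int, float))
                or not 0.0 <= conf <= 1.0):
            errors.append(f"{name}: confidence must be in [0, 1]")
    return (payload if not errors else {}), errors


good = json.dumps({
    "invoice_number": {"value": "INV-1001", "confidence": 0.99},
    "vendor_name": {"value": "Acme Supply", "confidence": 0.98},
    "currency": {"value": "USD", "confidence": 0.97},
    "total": {"value": 1250.0, "confidence": 0.93},
})
bad = json.dumps({"invoice_number": {"value": "INV-1001", "confidence": 1.4}})

print(validate_extraction(good)[1])  # []
print(validate_extraction(bad)[1])
```

Rejecting malformed output at this boundary means later stages can rely on every field being present and typed, and the per-field confidence values stay available for the routing decision.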
-
Normalization & Validation
The system enriches records from the vendor master (currency, terms, bank details) and normalizes taxes and units of measure, suggesting GL codes where appropriate. Confidence scoring and configurable thresholds determine which invoices auto-approve versus route to review.
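The threshold-based routing can be pictured with a short sketch. The threshold values, the field names, and the rule that the weakest field decides the outcome are all assumptions for illustration; the case study only states that thresholds are configurable.

```python
from typing import Optional

# Hypothetical per-field overrides; stricter limits for high-risk fields.
FIELD_THRESHOLDS = {"total": 0.98, "bank_account": 0.99}


def route(confidences: dict[str, float],
          overrides: Optional[dict[str, float]] = None,
          default: float = 0.95) -> tuple[str, list[str]]:
    """Return (decision, weak_fields).

    An invoice auto-approves only if every field meets its threshold;
    one weak field is enough to send it to review.
    """
    overrides = overrides or {}
    weak = sorted(
        name for name, score in confidences.items()
        if score < overrides.get(name, default)
    )
    if confidences and not weak:
        return "auto_approve", []
    return "review", weak


print(route({"invoice_number": 0.99, "total": 0.99}, FIELD_THRESHOLDS))
# ('auto_approve', [])
print(route({"invoice_number": 0.99, "total": 0.96}, FIELD_THRESHOLDS))
# ('review', ['total'])
```

Deciding on the weakest field instead of an average keeps one badly read total from being masked by several confident header fields, and the returned list tells the reviewer where to look first.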
-
Matching & Business Rules
A tolerant 2-/3-way engine reconciles invoice ↔ PO ↔ GRN, handling partial receipts, line-level taxes/discounts, and freight allocation. Fraud checks run in-line (payee/IBAN drift, look-alike domains) and every mismatch is tagged with a clear reason code.
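A tolerant three-way check on a single line can be sketched as below. The 2% price tolerance, the reason-code names, and the `Line` structure are hypothetical; the real engine also covers taxes, discounts, and freight allocation, which are left out here.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Line:
    sku: str
    qty: Decimal
    unit_price: Decimal


def match_line(inv: Line, po: Optional[Line], received_qty: Decimal,
               price_tol_pct: Decimal = Decimal("2")) -> list[str]:
    """Return reason codes for one invoice line; an empty list is a clean match."""
    if po is None:
        return ["NO_PO_LINE"]
    reasons: list[str] = []
    # Price: allow a relative deviation of up to price_tol_pct percent.
    allowed = po.unit_price * price_tol_pct / Decimal("100")
    if abs(inv.unit_price - po.unit_price) > allowed:
        reasons.append("PRICE_VARIANCE")
    # Quantity: never bill more than was ordered or more than was received.
    if inv.qty > po.qty:
        reasons.append("QTY_OVER_PO")
    if inv.qty > received_qty:
        reasons.append("QTY_OVER_RECEIPT")
    return reasons


po = Line("SKU-1", Decimal("100"), Decimal("10.00"))
received = Decimal("60")  # partial receipt recorded on the GRN

partial = Line("SKU-1", Decimal("60"), Decimal("10.10"))
overbilled = Line("SKU-1", Decimal("100"), Decimal("10.50"))

print(match_line(partial, po, received))     # []
print(match_line(overbilled, po, received))  # ['PRICE_VARIANCE', 'QTY_OVER_RECEIPT']
```

`Decimal` is used so that tolerance comparisons on money are exact. This sketch checks one invoice in isolation; handling several invoices against the same partially received PO would additionally require tracking the cumulative quantity already invoiced.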
-
Human-in-the-loop (HiTL)
Exceptions land in a focused inbox that shows the PDF side-by-side with extracted fields, field-level highlights, and suggested fixes. Analysts can trigger one-click supplier queries and close the loop without leaving the console.
-
Posting & Integrations
Clean invoices post to the ERP (NetSuite, Zoho, QuickBooks) via hardened n8n connectors with idempotent retries and an outbox/DLQ pattern. Webhooks emit events for downstream BI and notify teams in Slack/Teams.
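The idea behind idempotent retries with an outbox and dead-letter queue (DLQ) can be shown in plain Python. The deployed connectors run in n8n, so this is a conceptual sketch only: `FakeErp`, `idempotency_key`, and `drain_outbox` are hypothetical names, and deriving the key from vendor ID plus invoice number is an assumption.

```python
import hashlib
from typing import Callable


def idempotency_key(vendor_id: str, invoice_number: str) -> str:
    """Derive a stable key so a retried post cannot create a second record."""
    return hashlib.sha256(f"{vendor_id}|{invoice_number}".encode()).hexdigest()


class FakeErp:
    """Stand-in for an ERP API: fails the first `failures` calls, then accepts."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.posted: dict[str, dict] = {}

    def post(self, key: str, invoice: dict) -> None:
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("transient ERP error")
        # Keyed write: posting the same key twice leaves a single record.
        self.posted.setdefault(key, invoice)


def drain_outbox(outbox: list[dict], post: Callable[[str, dict], None],
                 max_attempts: int = 3) -> list[dict]:
    """Try each entry up to max_attempts; return the entries bound for the DLQ."""
    dead_letters = []
    for invoice in outbox:
        key = idempotency_key(invoice["vendor_id"], invoice["invoice_number"])
        for attempt in range(1, max_attempts + 1):
            try:
                post(key, invoice)
                break
            except ConnectionError:
                if attempt == max_attempts:
                    dead_letters.append(invoice)
    return dead_letters


invoice = {"vendor_id": "V-42", "invoice_number": "INV-1001", "total": "1250.00"}

erp = FakeErp(failures=2)
dlq = drain_outbox([invoice, invoice], erp.post)  # same invoice queued twice
print(len(erp.posted), len(dlq))  # 1 0

down = FakeErp(failures=5)
print(len(drain_outbox([invoice], down.post)))  # 1
```

A real worker would also wait between attempts, typically with exponential backoff, and would persist the outbox and DLQ so that nothing is lost on restart; both are omitted to keep the sketch short.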
Reference Tech Stack
-
Frontend & Ops
Next.js, Tailwind
-
AI/Extraction
Vision-OCR, LLM, Python workers
-
Core Data
PostgreSQL, pgvector
-
Orchestration
n8n
-
Queues & Cache
Redis, SQS
-
Analytics
Metabase
Conclusion
-
STP Achievement
95-97%
STP across top 200 suppliers after 60 days; long-tail at ~90% with continuous learning.
-
Cycle Time
<24 hours
Reduced from 3-5 days to under 24 hours for compliant invoices.
-
Cost Reduction
40% ↓
Late fees and missed early-pay discounts down by 40%.
-
AP Staff Time
Reallocated
Staff time reallocated from data entry to vendor management and analytics.