Case Study

DATEV EXTF Automation

Law Firm · Germany · 2025

A German tax firm with about fifteen employees spent several person-days per month preparing receipts for DATEV (the German tax software standard). We built an OCR and classification pipeline that processes incoming and outgoing receipts automatically, pre-assigns VAT, and produces a clean DATEV EXTF export. The bookkeeper now reviews only the exceptions.

Starting situation

The firm serves mid-size clients, including several e-commerce companies with high receipt volumes: hundreds to thousands of receipts per client per month. Preparation followed the classic routine: receipts arrived by email or via the client portal and were reviewed visually, entered manually or semi-automatically, assigned an account and offset account by hand, and classified for VAT by hand before the DATEV export. Per client, this took one to two work days per month of pure data entry. Across 20 comparable clients, the effort added up significantly.

Solution

We built a pipeline that accepts receipts by email or upload, runs OCR to extract structured fields (date, receipt number, issuer, gross, net, VAT rate, country), classifies the transaction, and converts it into the DATEV EXTF format. We calibrated the classification logic with the bookkeeper over multiple iterations: which account and offset-account combinations apply to which receipt types, when BU key 240 applies for intra-EU deliveries, and how to handle non-EU transactions. OCR uses Claude Vision, classification runs in Langflow, and data is held in PostgreSQL on our Hetzner infrastructure in Frankfurt. An on-premise variant is available as an option, because some of the firm's clients have particularly strict data requirements.
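The path from extracted fields to an export row can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: the `Receipt` fields mirror the OCR fields listed above, while the rule table, all account numbers, the `EU_COUNTRIES` subset, the helper names (`region_of`, `classify`, `to_export_row`, `render`), and the reduced column set are hypothetical. A real EXTF file has a fixed header and many more columns, as defined in DATEV's own format documentation.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Receipt:
    """Structured fields extracted by OCR (field names are illustrative)."""
    receipt_date: date
    receipt_number: str
    issuer: str
    gross: Decimal
    net: Decimal
    vat_rate: Decimal   # e.g. Decimal("19") for 19 %
    country: str        # two-letter country code of the counterparty
    receipt_type: str   # e.g. "outgoing_sale", "incoming_goods"


# Hypothetical rule table: (receipt type, region) -> (account, offset account, BU key).
# All values are placeholders, not the firm's documented SKR03 logic.
RULES = {
    ("outgoing_sale", "DE"): ("8400", "10000", ""),
    ("outgoing_sale", "EU"): ("8320", "10000", "240"),
    ("incoming_goods", "DE"): ("3400", "70000", ""),
}

EU_COUNTRIES = {"AT", "FR", "IT", "NL", "ES", "PL"}  # truncated for the sketch


def region_of(country: str) -> str:
    """Map a country code to the coarse region used by the rule table."""
    if country == "DE":
        return "DE"
    return "EU" if country in EU_COUNTRIES else "NON_EU"


def classify(receipt: Receipt) -> Optional[tuple[str, str, str]]:
    """Return (account, offset account, BU key), or None if no rule matches."""
    return RULES.get((receipt.receipt_type, region_of(receipt.country)))


def to_export_row(receipt: Receipt, booking: tuple[str, str, str]) -> list[str]:
    """Build one simplified export row (assumed columns, not the full EXTF layout)."""
    account, offset_account, bu_key = booking
    amount = f"{receipt.gross:.2f}".replace(".", ",")  # decimal comma
    return [
        amount,
        account,
        offset_account,
        bu_key,
        receipt.receipt_date.strftime("%d%m"),  # day and month, assumed format
        receipt.receipt_number,
        receipt.issuer,
    ]


def render(rows: list[list[str]]) -> str:
    """Serialize rows as semicolon-separated text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=";", lineterminator="\r\n").writerows(rows)
    return buffer.getvalue()
```

A receipt for which `classify` returns `None` has no matching rule; in this sketch it would go to human review instead of the export.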

Result

Most receipts now pass through the pipeline automatically. The bookkeeper reviews only flagged exceptions: new receipt types, unusual amounts, and missing required fields. The firm internally reports roughly 70% time savings in the data entry step; this figure refers to direct entry work, not the entire monthly close. The setup is GDPR-compliant through German hosting, with the on-premise variant available for clients with elevated requirements.
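The three exception categories named above could be checked along these lines. This is a sketch under assumptions: the `ExtractedReceipt` type, the `REQUIRED_FIELDS` tuple, the `review_reasons` helper, and the fixed amount threshold are hypothetical, and the real pipeline may decide what counts as "unusual" quite differently.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# Assumed set of fields that must be present before a receipt can be exported.
REQUIRED_FIELDS = ("receipt_date", "receipt_number", "issuer", "gross", "vat_rate")


@dataclass
class ExtractedReceipt:
    """OCR output in which any field may be missing (None)."""
    receipt_type: Optional[str] = None
    receipt_date: Optional[str] = None
    receipt_number: Optional[str] = None
    issuer: Optional[str] = None
    gross: Optional[Decimal] = None
    vat_rate: Optional[Decimal] = None


def review_reasons(
    receipt: ExtractedReceipt,
    known_types: set[str],
    max_gross: Decimal,
) -> list[str]:
    """Return the reasons a receipt needs human review; an empty list means none."""
    reasons = []
    missing = [name for name in REQUIRED_FIELDS if getattr(receipt, name) is None]
    if missing:
        reasons.append("missing fields: " + ", ".join(missing))
    if receipt.receipt_type not in known_types:
        reasons.append("new receipt type")
    if receipt.gross is not None and receipt.gross > max_gross:
        reasons.append("unusual amount")
    return reasons
```

Returning the reasons rather than a bare flag lets the review queue show the bookkeeper why a receipt was held back.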

What we learned

Three takeaways from this project. First: DATEV EXTF is more mechanical than expected once the firm's account logic is documented cleanly. We spent almost the entire first week talking through the firm's SKR03 logic with the bookkeeper and mapping it out, and those hours paid off a hundred times over later. Second: OCR is never 100% accurate. The pipeline must be built from the start on the assumption that exceptions will occur, and the workflow must route them into human review smoothly, without creating frustration. Third: the firm's clients react very differently to "AI in bookkeeping". We learned that an on-premise option is not just a technical feature; often it is what a firm needs to keep its more conservative clients.

Similar problem?

Send us a short note and we will reply within one business day.