u/CosmicJenga

r/ollama

Use Case: Invoice processing with local LLM - Which LLM and hardware requirements?

I want to automate the following invoice workflow locally with an LLM and would like recommendations for:

  • Which local LLM to use
  • Which hardware is realistically required
  • Whether this is better solved with pure OCR + scripting instead of an LLM
  • Which stack/tools you would use overall

Volume is relatively low: around 5–20 invoices per week. But the manual processing takes the coworker 10–15 minutes per invoice (don't ask me how...), to the point that they are actively complaining about the uncomfortable workflow. This is just for context, because I already tried "just hurry up" and "just chill, it's a small task anyway". Neither worked.

Current (manual) workflow done by the coworkers:

  1. Assign (internal) invoice number
    • Read the next sequential invoice number from an invoice table/database (Invoice Excel-Sheet in Sharepoint)
    • Add this number visibly onto the invoice document (currently manually in green text)
  2. Extract and store invoice data. Required fields:
    • Invoice number
    • Invoice date
    • Customer
    • Invoice description/content
    • Amount
  3. Scan / digitize invoice
  4. Upload scanned invoice to SharePoint
    • Folder: “Creditors”
  5. Store SharePoint link
    • Copy generated SharePoint URL
    • Insert URL into the invoice table/database entry (Invoice Excel-Sheet in Sharepoint)
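To make the question concrete, here is roughly how I picture the deterministic part of steps 1–2 in Python. Everything here is a sketch under my own assumptions: the `INV-` prefix, the field names, and the helper name `next_invoice_number` are made up, and the real number source is the Excel sheet in SharePoint, not a list.

```python
# Sketch: assign the next sequential invoice number and start a record
# whose remaining fields would be filled by OCR/LLM extraction later.
# (Prefix format and field names are assumptions, not our real scheme.)

def next_invoice_number(existing: list[str], prefix: str = "INV-") -> str:
    """Return the next sequential number, e.g. INV-0042 -> INV-0043."""
    nums = [int(n.removeprefix(prefix)) for n in existing if n.startswith(prefix)]
    nxt = (max(nums) + 1) if nums else 1
    return f"{prefix}{nxt:04d}"

record = {
    "invoice_number": next_invoice_number(["INV-0041", "INV-0042"]),
    "invoice_date": None,   # filled by extraction later
    "customer": None,
    "description": None,
    "amount": None,
}
print(record["invoice_number"])  # INV-0043
```

The stamping of the number onto the PDF (the green text) and the SharePoint upload are the parts I have no idea how to do locally, hence this post.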

What I am trying to achieve:

  • Fully local processing if possible
  • Automatic OCR + field extraction from PDFs/scans
  • Reliable structured output (data written into the corresponding fields of the Invoice Excel-Sheet in SharePoint)
  • Possibly automatic validation against templates/rules
  • Ideally low maintenance and deterministic behavior
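By "validation against rules" and "deterministic behavior" I mean something like the check below: whatever the LLM extracts gets validated by plain code before anything touches the sheet. A minimal sketch, assuming my own field names and an ISO date format (both assumptions):

```python
# Sketch: rule-based validation of extracted fields before writing them
# anywhere. Field names and the ISO date requirement are my assumptions.
from datetime import datetime

REQUIRED = ("invoice_number", "invoice_date", "customer", "description", "amount")

def validate_invoice(rec: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    errors = [f"missing: {k}" for k in REQUIRED if not rec.get(k)]
    if rec.get("invoice_date"):
        try:
            datetime.strptime(rec["invoice_date"], "%Y-%m-%d")
        except ValueError:
            errors.append("invoice_date not ISO formatted")
    if rec.get("amount") is not None:
        try:
            if float(rec["amount"]) <= 0:
                errors.append("amount must be positive")
        except (TypeError, ValueError):
            errors.append("amount not numeric")
    return errors
```

Records that fail would go to a human instead of the sheet, which is how I'd keep the pipeline low-maintenance even with a small model doing the extraction.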

Questions:

  • Which local model would you use for this?
  • Would a small model like Gemma/Qwen/Llama be sufficient?
  • Is GPU acceleration even worth it at this scale?
  • Would an M4 Mac Mini / RTX 4060 / small server already be overkill?
  • Would you combine the LLM with traditional OCR tools (Tesseract, PaddleOCR, OCRmyPDF etc.)?
  • Any practical experiences with invoice extraction pipelines running fully locally?
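To clarify what I mean by "combine the LLM with traditional OCR": OCR (Tesseract/OCRmyPDF, not shown) would produce plain text, and a small local model would then be asked for strict JSON. A sketch of that handoff, where the Ollama endpoint and the model name are assumptions and only the prompt/parse helpers run without a server:

```python
# Sketch of the OCR -> LLM handoff. The Ollama /api/generate endpoint is
# real, but the model name is a placeholder and ask_ollama is untested here.
import json
import urllib.request

FIELDS = ["invoice_number", "invoice_date", "customer", "description", "amount"]

def build_prompt(ocr_text: str) -> str:
    return (
        "Extract these fields from the invoice text and answer with JSON only, "
        f"keys {FIELDS}, using null for anything missing.\n\n" + ocr_text
    )

def parse_reply(reply: str) -> dict:
    """Tolerate models that wrap the JSON in extra text: take the first {...} span."""
    start, end = reply.find("{"), reply.rfind("}")
    return json.loads(reply[start : end + 1])

def ask_ollama(prompt: str, model: str = "qwen2.5:7b") -> str:
    # Requires a running local Ollama server on the default port.
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

If something this simple is roughly right, my hardware question boils down to: what does it take to run the model behind `ask_ollama` reliably?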

The other posts in here don't address all of my questions, and I completely fail at matching my actual business requirements to the required hardware. I have used Claude for a while now, but never a local LLM.

My preferred solution would be to simply rent a hosted server from some provider and run everything there.

u/CosmicJenga — 2 days ago