u/CosmicJenga

r/ollama

Use Case: Invoice processing with local LLM - Which LLM and hardware requirements?

I want to automate the following invoice workflow locally with an LLM and would like recommendations for:

  • Which local LLM to use
  • Which hardware is realistically required
  • Whether this is better solved with pure OCR + scripting instead of an LLM
  • Which stack/tools you would use overall

Volume is relatively low: around 5–20 invoices per week. But the manual processing takes the coworker 10–15 minutes per invoice (don't ask me how...), to the point that they are actively complaining about the uncomfortable workflow. This is just for context, because I already tried "just hurry up" and "just chill, it's a small task anyway". Neither worked.

Current (manual) workflow done by the coworkers:

  1. Assign (internal) invoice number
    • Read the next sequential invoice number from an invoice table/database (Invoice Excel-Sheet in Sharepoint)
    • Add this number visibly onto the invoice document (currently manually in green text)
  2. Extract and store invoice data. Required fields:
    • Invoice number
    • Invoice date
    • Customer
    • Invoice description/content
    • Amount
  3. Scan / digitize invoice
  4. Upload scanned invoice to SharePoint
    • Folder: “Creditors”
  5. Store SharePoint link
    • Copy generated SharePoint URL
    • Insert URL into the invoice table/database entry (Invoice Excel-Sheet in Sharepoint)
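To make the question concrete, here is roughly how I picture the deterministic part of steps 1–2 in Python. Everything here is a sketch under my own assumptions: the `INV-` prefix, the field names, and the helper name `next_invoice_number` are made up, and the real number source is the Excel sheet in SharePoint, not a list.

```python
# Sketch: assign the next sequential invoice number and start a record
# whose remaining fields would be filled by OCR/LLM extraction later.
# (Prefix format and field names are assumptions, not our real scheme.)

def next_invoice_number(existing: list[str], prefix: str = "INV-") -> str:
    """Return the next sequential number, e.g. INV-0042 -> INV-0043."""
    nums = [int(n.removeprefix(prefix)) for n in existing if n.startswith(prefix)]
    nxt = (max(nums) + 1) if nums else 1
    return f"{prefix}{nxt:04d}"

record = {
    "invoice_number": next_invoice_number(["INV-0041", "INV-0042"]),
    "invoice_date": None,   # filled by extraction later
    "customer": None,
    "description": None,
    "amount": None,
}
print(record["invoice_number"])  # INV-0043
```

The stamping of the number onto the PDF (the green text) and the SharePoint upload are the parts I have no idea how to do locally, hence this post.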

What I am trying to achieve:

  • Fully local processing if possible
  • Automatic OCR + field extraction from PDFs/scans
  • Reliable structured output (data written into the corresponding fields of the Invoice Excel-Sheet in SharePoint)
  • Possibly automatic validation against templates/rules
  • Ideally low maintenance and deterministic behavior
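By "validation against rules" and "deterministic behavior" I mean something like the check below: whatever the LLM extracts gets validated by plain code before anything touches the sheet. A minimal sketch, assuming my own field names and an ISO date format (both assumptions):

```python
# Sketch: rule-based validation of extracted fields before writing them
# anywhere. Field names and the ISO date requirement are my assumptions.
from datetime import datetime

REQUIRED = ("invoice_number", "invoice_date", "customer", "description", "amount")

def validate_invoice(rec: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    errors = [f"missing: {k}" for k in REQUIRED if not rec.get(k)]
    if rec.get("invoice_date"):
        try:
            datetime.strptime(rec["invoice_date"], "%Y-%m-%d")
        except ValueError:
            errors.append("invoice_date not ISO formatted")
    if rec.get("amount") is not None:
        try:
            if float(rec["amount"]) <= 0:
                errors.append("amount must be positive")
        except (TypeError, ValueError):
            errors.append("amount not numeric")
    return errors
```

Records that fail would go to a human instead of the sheet, which is how I'd keep the pipeline low-maintenance even with a small model doing the extraction.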

Questions:

  • Which local model would you use for this?
  • Would a small model like Gemma/Qwen/Llama be sufficient?
  • Is GPU acceleration even worth it at this scale?
  • Would an M4 Mac Mini / RTX 4060 / small server already be overkill?
  • Would you combine the LLM with traditional OCR tools (Tesseract, PaddleOCR, OCRmyPDF etc.)?
  • Any practical experiences with invoice extraction pipelines running fully locally?
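To clarify what I mean by "combine the LLM with traditional OCR": OCR (Tesseract/OCRmyPDF, not shown) would produce plain text, and a small local model would then be asked for strict JSON. A sketch of that handoff, where the Ollama endpoint and the model name are assumptions and only the prompt/parse helpers run without a server:

```python
# Sketch of the OCR -> LLM handoff. The Ollama /api/generate endpoint is
# real, but the model name is a placeholder and ask_ollama is untested here.
import json
import urllib.request

FIELDS = ["invoice_number", "invoice_date", "customer", "description", "amount"]

def build_prompt(ocr_text: str) -> str:
    return (
        "Extract these fields from the invoice text and answer with JSON only, "
        f"keys {FIELDS}, using null for anything missing.\n\n" + ocr_text
    )

def parse_reply(reply: str) -> dict:
    """Tolerate models that wrap the JSON in extra text: take the first {...} span."""
    start, end = reply.find("{"), reply.rfind("}")
    return json.loads(reply[start : end + 1])

def ask_ollama(prompt: str, model: str = "qwen2.5:7b") -> str:
    # Requires a running local Ollama server on the default port.
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

If something this simple is roughly right, my hardware question boils down to: what does it take to run the model behind `ask_ollama` reliably?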

The other posts in here don't address all of my questions, and I completely fail at matching my actual business requirements to the required hardware. I have used Claude for a while now, but never a local LLM.

My preferred solution would be to simply rent a hosted server from some provider and run everything there.

u/CosmicJenga — 2 days ago