XML is a game changer for small models (~4B) compared to JSON
I’ve been experimenting for a few months with small models (around 4B parameters) using llama.cpp to automate data extraction from images (flyers, IDs, etc.).
I wanted to share a "non-scientific" observation: if you are struggling with low-resource hardware, stop using JSON and start using XML.
In my experience, small models make a lot of syntax errors when outputting JSON. However, when I use XML for both the instructions and the desired output, the difference in reliability is night and day.
It’s much easier for a 4B model to follow the tag structure.
Example prompt:
<context>
You are a bot that analyzes Spanish party posters and flyers.
</context>
<instructions>
Extract the important information about the event from the provided text/image.
</instructions>
<output_format>
Generate data in XML format following this structure:
<dates>
  <date>
    <year>YYYY</year>
    <month>MM</month>
    <day>DD</day>
    <full_date>YYYY-MM-DD</full_date>
  </date>
</dates>
<location></location>
<summary></summary>
</output_format>
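On the consuming side, here is a rough sketch of how the reply could be parsed with Python's standard library. It is not part of my actual pipeline: the function name `parse_event_xml`, the synthetic `<event>` wrapper, and the sample reply are all made up for illustration. The wrapper is needed because the format above has several top-level elements (`<dates>`, `<location>`, `<summary>`) and an XML parser expects a single root.

```python
import xml.etree.ElementTree as ET


def parse_event_xml(raw: str) -> dict:
    """Parse a reply that follows the <dates>/<location>/<summary> layout."""
    # Drop any chatter before the first tag and after the last one.
    start, end = raw.find("<"), raw.rfind(">")
    if start == -1 or end < start:
        raise ValueError("no XML found in model output")
    body = raw[start : end + 1]

    # Hypothetical wrapper element: gives the parser the single root it needs.
    root = ET.fromstring(f"<event>{body}</event>")

    dates = [
        {
            "year": d.findtext("year", default="").strip(),
            "month": d.findtext("month", default="").strip(),
            "day": d.findtext("day", default="").strip(),
            "full_date": d.findtext("full_date", default="").strip(),
        }
        for d in root.iterfind("dates/date")
    ]
    return {
        "dates": dates,
        "location": root.findtext("location", default="").strip(),
        "summary": root.findtext("summary", default="").strip(),
    }


# Invented sample reply, only to show the call.
sample = """Sure, here is the data:
<dates><date><year>2024</year><month>08</month><day>15</day>
<full_date>2024-08-15</full_date></date></dates>
<location>Plaza Mayor, Madrid</location>
<summary>Summer fiesta with live music</summary>"""

result = parse_event_xml(sample)
```

If the reply is not well-formed (an unclosed tag, or a bare `&` in the text), `ET.fromstring` raises `ET.ParseError`, which is a convenient signal to retry the generation.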