
From debugging on prod to actually fixing errors
I was never a fan of debugging production errors on Axiom, Sentry, or pretty much any other existing tool, simple to set up or not. Once I start receiving errors, hunting for them across a multitude of dashboards, learning proprietary languages, and writing SQL-ish queries to find logs just seems like too much. I want the error right there, visible, yelling at me that something is wrong and I have to fix it.
I started building this tool, Loguro, a few months back and put a lot of features into it. The principle was simple, at least at the beginning: be fast, be reliable, no context switching, and human queries.
Ingestion: Rust + Hono on top of Bun as the web server, with multiple layers of compression. I made sure no log escapes and there are no hiccups, tested and tested again and again. It was fast: 45k+ req/s for single logs and 500k+ logs/s in batch, with instant visibility in the dashboard.
Dashboard: it also had to be fast, like really fast. I started pushing logs, million after million: 50M logs queried in under 500ms.
Filter bar: human syntax, very human. level:error message:"oops something broke" from:"yesterday at 10:05" to:"yesterday at 12" context.user_id:1009 and this can go on, there are a lot, a lot of queries available. But... I couldn't stop there. I like human syntax, but I also like commands. I like my filter bar to do things, not only request data from the backend. So, plugins appeared.
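To give an idea of how a query like the one above can be split into filters, here is a minimal Python sketch. It is not Loguro's actual parser: the function name parse_filter and the choice to treat tokens without a key as free text are assumptions made for illustration, and turning a phrase like "yesterday at 10:05" into a real timestamp is left out.

```python
import shlex


def parse_filter(query: str) -> tuple[dict[str, str], list[str]]:
    """Split a filter-bar query into key:value filters and free-text terms.

    Hypothetical helper for illustration only, not Loguro's real parser.
    """
    filters: dict[str, str] = {}
    free_text: list[str] = []
    # shlex keeps quoted phrases together and strips the quotes.
    # Note: an unbalanced quote makes shlex.split raise ValueError.
    for token in shlex.split(query):
        # Split on the first colon only, so "10:05" stays inside the value.
        key, sep, value = token.partition(":")
        if sep and key and value:
            filters[key] = value
        else:
            # Assumption: anything without a key is a plain search term.
            free_text.append(token)
    return filters, free_text


query = 'level:error message:"oops something broke" from:"yesterday at 10:05" context.user_id:1009'
filters, free_text = parse_filter(query)
print(filters)
```

The date values stay as raw strings here; a real filter bar would pass them to a natural-language date parser before querying.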
Plugins are basically a way for me to extend all the main inputs I have in the app: the command bar, the filter input, and the filter on the log page. On each input we can run commands to create integrations with Jira, GitHub, Linear, etc., create a ticket in Jira without switching context, or send the ticket and log details to Slack directly. There are lots of plugins like that, but two of them made my life waaaay easier. I have tested them for 2 months already on a client project where I was having some issues.
--investigate and --share:md are two plugins that work in this exact sequence: find an error, open the log, type --investigate and hit cmd+enter. If the GitHub integration is configured and access to the GitHub codebase has been accepted, the AI will analyze the log, fetch relevant data from GitHub, try multiple times until it finds something, and spit out the issue it found plus a solution and, where needed, suggestions to improve the logging so it understands more next time. Once the investigation is done and you know what happened, --share:md will create a shareable link with a markdown view and one with an HTML view (in case some human needs it). Personally, I take the markdown URL, go to Claude and give it the link directly, let it fix, review and test, then push.
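The command shape used by these two plugins (a name after two dashes, with an optional variant after a colon) can be recognized with a few lines of Python. This is only a sketch of the idea, not how Loguro does it: the regex, the parse_command name, and the rule that anything else falls through to the normal filter are all assumptions.

```python
import re
from typing import Optional

# Assumed command shape: --name or --name:variant, e.g. --investigate, --share:md
COMMAND_RE = re.compile(r"^--([a-z][a-z0-9-]*)(?::([a-z0-9-]+))?$")


def parse_command(text: str) -> Optional[tuple[str, Optional[str]]]:
    """Return (name, variant) for a plugin command, or None for normal input."""
    match = COMMAND_RE.match(text.strip())
    if match is None:
        # Not a command: the caller would treat the input as a regular filter.
        return None
    return match.group(1), match.group(2)


print(parse_command("--investigate"))
print(parse_command("--share:md"))
print(parse_command("level:error"))
```

An input handler could check parse_command first and only send the text to the backend as a query when it returns None.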
Now what? Hope that it will never happen again? Nope. Pin that log. I want to watch it permanently for a few days, so I add a note to the pin in case I forget what it was about and leave it there. If the issue comes back I will see it in the dashboard (I can create alerts, but I don't like my heart rate spiking).
If I created a Jira task for that issue from the app and the issue appears again, I have a badge on it showing me that this happened again. Clicking on it gives me the full details about what happened WITHOUT LEAVING FOR JIRA. I am debugging; if I switch tabs I might get lost. Every logging system has a retention period: want more, pay more. Mine does not work like that. You get max 120 days on the scale plan, but all plans benefit from memory. What does this mean? If I create a task from a log and time goes by, 1, 2, 10 months, my noise logs are pretty much gone, but that very log that triggered the task will still be there. FOREVER.
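The memory rule boils down to one extra check before a log is expired. Here is a minimal Python sketch of that rule, assuming a simplified log record: the LogEntry class, the linked_task field, the should_keep function, and the PROJ-42 task key are all hypothetical and only illustrate the behavior described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class LogEntry:
    timestamp: datetime
    message: str
    # Hypothetical field: the key of a task created from this log, if any.
    linked_task: Optional[str] = None


def should_keep(log: LogEntry, now: datetime, retention_days: int = 120) -> bool:
    """Keep a log if it triggered a task, or if it is still inside retention."""
    if log.linked_task is not None:
        # "Memory": a log that triggered a task never expires.
        return True
    return now - log.timestamp <= timedelta(days=retention_days)


now = datetime(2025, 1, 1)
ten_months_ago = now - timedelta(days=300)
noise = LogEntry(ten_months_ago, "cache miss")
remembered = LogEntry(ten_months_ago, "oops something broke", linked_task="PROJ-42")
print(should_keep(noise, now), should_keep(remembered, now))
```

The 120-day default matches the scale plan mentioned above; other plans would simply pass a smaller retention_days.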
All my servers are monitored through Loguro. It accepts JSON and OTLP, so all the servers I have and all the client servers I manage send logs to it: failed SSH attempts, fail2ban bans, successful attempts, memory/CPU/network spikes. Again, EVERYTHING goes to Loguro, full visibility.
That's Loguro, a logging system I built to suit my needs as a developer in the new era of AI development.
I am the only user for now. If anyone is interested in seeing it, here is a link.