I Built: Holmes - A macOS AI Agent That Acts on Your Screen Without Being Asked
waitlist: https://www.try-holmes.com/
I'm one of the co-founders of Holmes, a macOS AI agent that watches your screen and acts on it.
Most AI tools are glorified search bars: you open them, describe what you need, and wait. Holmes just runs. no prompt, no hotkey.
Under the hood, it's doing continuous screen understanding, not just OCR. It parses what's happening across your active windows, builds context around your current task, and figures out what to do about it. Drafting something, pulling up a resource, catching a follow-up you forgot. It doesn't ask.
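To make that loop concrete, here's a toy sketch of an observe-contextualize-act cycle. Every function name and the heuristic inside are my illustration, not Holmes's actual code; the real system parses windows via OCR and the accessibility tree rather than taking strings.

```python
# Hypothetical observe -> contextualize -> act loop.
# All names and logic here are illustrative stand-ins, not Holmes's API.
from dataclasses import dataclass, field

@dataclass
class Context:
    window_texts: list = field(default_factory=list)

def parse_active_windows(raw_frames):
    # Stand-in for real screen parsing (OCR + accessibility tree).
    return [f.strip() for f in raw_frames if f.strip()]

def build_context(ctx, texts):
    # Accumulate recent window text into the running task context.
    ctx.window_texts.extend(texts)
    return ctx

def decide_action(ctx):
    # Toy heuristic: notice a forgotten follow-up and act on it unprompted.
    for text in ctx.window_texts:
        if "follow up" in text.lower():
            return "draft-follow-up"
    return None

ctx = build_context(Context(), parse_active_windows(
    ["Re: Q3 report, please follow up Friday", ""]))
print(decide_action(ctx))  # → draft-follow-up
```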
native macOS, zero setup.
runs entirely on-device: Apple's Accessibility API and Vision framework for screen parsing, Ollama as the local LLM backend. nothing leaves your machine.
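For a sense of the local-only LLM hop, here's a minimal sketch of handing parsed screen text to Ollama over its standard HTTP API (`POST /api/generate` on `localhost:11434`). The model name and prompt framing are my assumptions, not Holmes's actual setup; the point is that the request never leaves localhost.

```python
# Sketch of querying a local Ollama instance with parsed screen text.
# Model name and prompt framing are assumptions, not Holmes's real config.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(screen_text, model="llama3.2"):
    # Frame the parsed screen content as context for the local model.
    return {
        "model": model,
        "prompt": f"Screen context:\n{screen_text}\n\nWhat should the agent do next?",
        "stream": False,
    }

def ask_local_llm(screen_text):
    # The request only ever hits localhost; nothing leaves the machine.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(screen_text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With stream=False, Ollama returns one JSON object with a
        # "response" field containing the full completion.
        return json.loads(resp.read())["response"]
```

Requires a running `ollama serve` with the model pulled; swap in whatever local model you like.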
Still early. The video above is a UI walkthrough; a full demo is coming once we're further along.
happy to answer any technical questions about how it works.