
r/opensource
PixelClaw: an LLM agent for image manipulation
I'm developing an open-source LLM agent specialized for working with images. PixelClaw combines:
- an LLM for conversation, planning, and tool use (supports a variety of LLMs)
- image generation/AI-based editing via gpt-image
- background removal via rembg (several specialized models available)
- pixelization using pyxelate
- posterization and defringing using custom algorithms
- speech-to-text (Whisper) and text-to-speech (Kokoro plus HALO)
- a nice UI based on Raylib, including file drag-and-drop
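
For anyone unfamiliar with posterization, here is a minimal Python sketch of the general idea: snapping each 8-bit color channel to a small number of evenly spaced levels. This is only an illustration and is not PixelClaw's own algorithm (the post says only that it uses custom ones); the names `posterize_channel` and `posterize` are hypothetical.

```python
def posterize_channel(value: int, levels: int) -> int:
    """Snap an 8-bit channel value (0-255) to the nearest of `levels` evenly spaced values."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)  # distance between adjacent output levels
    return round(round(value / step) * step)


def posterize(pixels, levels=4):
    """Posterize a list of (r, g, b) tuples; returns a new list of tuples."""
    return [tuple(posterize_channel(c, levels) for c in px) for px in pixels]


if __name__ == "__main__":
    # Hypothetical sample pixels, just to show the quantization.
    print(posterize([(0, 100, 255), (130, 128, 10)], levels=4))
```

With 4 levels, every channel ends up as 0, 85, 170, or 255, which gives the flat, banded look posterization is known for.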
You can find the project, including a couple of demo videos, at: https://github.com/JoeStrout/PixelClaw
If you find it interesting, I'd really appreciate it if you'd click the star at the top of the page; that helps me gauge interest. Feedback is very welcome!
u/JoeStrout — 2 days ago