u/JoeStrout

PixelClaw: an LLM agent for image manipulation

I'm developing an open-source LLM agent specialized for working with images. PixelClaw combines:

  • an LLM for conversation, planning, and tool use (supports a variety of LLMs)
  • image generation/AI-based editing via gpt-image
  • background removal via rembg (several specialized models available)
  • pixelization using pyxelate
  • posterization and defringing using custom algorithms
  • speech-to-text (Whisper) and text-to-speech (Kokoro plus HALO)
  • a nice UI based on Raylib, including file drag-and-drop
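
For readers unfamiliar with posterization, here is a minimal sketch of the general idea, assuming plain per-channel quantization on a list of RGB tuples. This is not PixelClaw's custom algorithm, which isn't described here; the function name `posterize` and its `levels` parameter are made up for illustration.

```python
def posterize(pixels, levels=4):
    """Snap each 0-255 channel value to the nearest of `levels` evenly spaced values.

    Illustrative only: a generic uniform quantizer, not PixelClaw's method.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)  # spacing between allowed channel values

    def snap(value):
        # Round to the nearest allowed level, then back to an integer channel value.
        return int(round(round(value / step) * step))

    return [tuple(snap(channel) for channel in pixel) for pixel in pixels]


print(posterize([(10, 100, 250)], levels=4))  # → [(0, 85, 255)]
```

Fewer levels give a flatter, more poster-like look; a real tool would likely choose its palette from the image rather than use evenly spaced values.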

You can find the project, including a couple of demo videos, at: https://github.com/JoeStrout/PixelClaw

If you find it interesting, I'd really appreciate it if you'd click the star at the top of the page; that helps me gauge interest. Feedback is very welcome!

u/JoeStrout — 2 days ago
▲ 6 r/raylib+1 crossposts

PixelClaw: an LLM agent for image manipulation

I'm making an LLM agent specialized for image processing. It combines:

  • an LLM for conversation, planning, and tool use (supports a variety of LLMs)
  • image generation/AI-based editing via gpt-image
  • background removal via rembg (several specialized models available)
  • pixelization using pyxelate
  • posterization and defringing using custom algorithms
  • speech-to-text (Whisper) and text-to-speech (Kokoro plus HALO)
  • a nice UI based on Raylib, including file drag-and-drop

PixelClaw is free and open-source at https://github.com/JoeStrout/PixelClaw/ . You can find more demo videos there too. While you're there, if you find it interesting, please click the star ⭐️ at the top of the page; that helps me gauge interest.

u/JoeStrout — 2 days ago