
r/Oobabooga

TextGen (formerly text-generation-webui) is now a native desktop app: an open-source alternative to LM Studio.
Hi all,
I have been making a lot of updates to my project, and I wanted to share them here.
TextGen (previously text-generation-webui, also known by my username oobabooga, or ooba) has been in development since December 2022, before LLaMA and llama.cpp existed.
In the last two months, the project has evolved from a web UI into a no-install desktop app for Windows, Linux, and macOS with a polished UI, built on a very minimal and elegant Electron integration I wrote for it. (Did you know LM Studio is also a web UI running over Electron? Not sure many people know that.)
It works like this:
- You download a portable build from the releases page
- Unzip it
- Double-click textgen
- A window appears
There is no installation, and no files are ever created outside the extracted folder. It's fully self-contained. All your chat histories and settings are stored in a user_data folder shipped with the build.
There are builds for CUDA, Vulkan, CPU-only, Mac (Apple Silicon and Intel), and ROCm.
Some differentiating features:
- Full privacy. Unlike LM Studio, it doesn't phone home on every launch with your OS, CPU architecture, app version, and inference backend choices. Zero outbound requests.
- ik_llama.cpp builds (LM Studio and Ollama only ship vanilla llama.cpp). ik_llama.cpp has new quant types like IQ4_KS and IQ5_KS with SOTA quantization accuracy.
- Built-in web search via the `ddgs` Python library, either through tool calling with the built-in `web_search` tool (works flawlessly with Qwen 3.6 and Gemma 4), or through an "Activate web search" checkbox that fetches search results as text attachments.
- Tool-calling support through 3 options: single-file .py tools (very easy to create your own custom functions), HTTP MCP servers, and stdio MCP servers. You can enable confirmations so that each tool call shows up with approve/reject buttons before it executes. I have written a guide here.
- The ability to create custom characters for casual chats, in addition to regular instruction-following conversations.
- OpenAI- and Anthropic-compatible API with very strict spec compliance. It works with Claude Code: you can load a model, run `ANTHROPIC_BASE_URL=http://127.0.0.1:5000 claude`, and it will work.
- Accurate PDF text extraction using the `PyMuPDF` Python library.
- `trafilatura` for web page fetching, which strips navigation and boilerplate from pages, saving a lot of tokens in agentic tool loops.
- Chat templates are rendered through Python's Jinja2 library, which works for templates where llama.cpp's C++ reimplementation of Jinja sometimes crashes.
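As a quick illustration of the OpenAI-compatible API, here is a minimal sketch of a chat-completions request using only the Python standard library. The base URL matches the default the server prints on startup (`http://127.0.0.1:5000/v1`); the `"local"` model name is a placeholder I chose for the example, not something the project documents.

```python
import json
import urllib.request

def build_chat_request(base_url: str, user_message: str) -> urllib.request.Request:
    """Build a POST request for the OpenAI-style /chat/completions endpoint."""
    payload = {
        # Placeholder model name; a single-model local server typically
        # answers with whatever model is currently loaded.
        "model": "local",
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("http://127.0.0.1:5000/v1", "Hello!")
print(req.full_url)
```

Sending it is then just `urllib.request.urlopen(req)` (or point any OpenAI client library at the same base URL) once a model is loaded.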
I develop this as a passion project/hobby. It's free and open source (AGPLv3) as always.
Hi, I developed/vibe-coded an experimental extension for text-generation-webui:
https://github.com/Mufty7/Project_Zora
Project Zora is an experimental and local AI companion architecture focused on memory, continuity, reflection, identity state, and more person-like interaction patterns.
It is not a claim of machine consciousness, and it is definitely not production-ready. It's more of a research/prototype extension: an attempt to build something closer to a persistent AI companion rather than a stateless chatbot. The architecture is mostly there, but reliability may vary depending on your Text Generation WebUI setup.
I’m sharing it because even if the whole system is not polished yet, I think there may be useful ideas here for people working on:
- local AI companions
- memory layers
- LLM continuity
- persona persistence
- reflection loops
- long-term assistant behavior
Status: v0.1.0-alpha
Perhaps there is gold in there, perhaps not; try it yourself.
Personal note: perhaps somebody who knows what they are doing can develop it further.
For a more specific problem description, on 4.7.3, I'm running the server with the same arguments as I always have, and it runs without server-side errors. But, after logging in, it's stuck on the Gradio loading animation... forever. The browser console log shows 404s for all the resources, like the JS and fonts, despite having extracted the whole tar.gz file just like before, and even in a new directory - so I'd expect it to run without issue. Permissions are 770 for my user, recursive across the whole tree.
I haven't changed anything except which executable I'm targeting after the update. So, instead of running start_linux.sh, I'm using the all-in-one executable with the same exact arguments as before:
$ ./textgen --listen --listen-port 7860 --gradio-auth-path /etc/textgen/users.conf
14:06:43-788319 INFO Starting TextGen
14:06:43-798646 INFO Loading settings from
"/opt/textgen/textgen-v4.7.3/user_data/settings.yaml"
14:06:44-422231 INFO OpenAI/Anthropic-compatible API URL:
http://0.0.0.0:5000/v1
Running on local URL: http://0.0.0.0:7860
Maybe it has to do with the directory having a different structure now? Really not sure. All I know is that I get browser console logs like this for the fonts, JS, and CSS, and only after logging in (no logs/errors before that):
GET <my-server>/file/css/NotoSans/NotoSans-Medium.woff2
...
404 Not Found
...
(<my-server> replaces the server's scheme and hostname in this example)
Not very helpful when the files are there:
$ ls app/css/NotoSans/
... ...
NotoSans-BlackItalic.woff2 NotoSans-Medium.woff
NotoSans-Bold.woff NotoSans-Medium.woff2
... ...
$ tree app/js
app/js
├── dark_theme.js
├── global_scope_js.js
├── highlightjs
│ ├── highlightjs-copy.min.js
│ └── highlight.min.js
├── katex
│ ├── auto-render.js
│ └── katex.min.js
├── main.js
├── morphdom
│ └── morphdom-umd.min.js
├── save_files.js
├── show_controls.js
├── switch_tabs.js
└── update_big_picture.js
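To spell out the mismatch: every 404ing URL under `/file/` corresponds exactly to a file under the extracted `app/` directory (this mapping is my reading of the paths above, not something documented). A quick sanity check of that mapping:

```shell
# Map the failing request path onto the extracted tree (layout per ls/tree above)
url_path="/file/css/NotoSans/NotoSans-Medium.woff2"
disk_path="app${url_path#/file}"
echo "$disk_path"   # app/css/NotoSans/NotoSans-Medium.woff2
```

The file at that path exists and is readable, yet the server still returns 404 for the URL.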
Changing the hostname does nothing - localhost, 127.0.0.1, my internal IPv4 address, etc. Ports 5000 and 7860 are open in my firewall.
What does work is running 4.6 first: I can log in, stop the server, then run 4.7.3 again, and everything works until the cache clears (Ctrl+F5), at which point I get the same hang. So there's nothing wrong with my network or firewall, since it all works on the old version. All that's changed are the portable TextGen files, and those seem to be doing something differently, or to require some change that isn't documented (the release notes only say to use the new executable).
What also works is just running ./textgen, no parameters at all, and letting the standalone Electron app run. That doesn't work for my use case, though.
Anyone getting this issue, and does anyone have a fix? Thanks!