u/Interesting-Area6418

▲ 88 r/mcp

used to think MCP was just tool calling. now i get it.

Like OpenAI already had tools, Anthropic had tools, Gemini had tools. Didn’t really get why another spec was needed.

Then I hit this at work while wiring the same internal tools across different models and apps. Slack, GitHub, SQL, internal search, Notion, etc. all had different wrappers and formats depending on where they were being used. At some point I realized half the work was just making everything look consistent.

That’s when MCP finally clicked for me. The value isn’t really “tool calling.” It’s convenience and standardization.
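To make the "different wrappers" pain concrete, here is a rough Python sketch. It assumes the tool is described once in the MCP-style shape (name, description, inputSchema) and then converted into the function-tool shapes that OpenAI and Anthropic accept, as I understand those formats. The tool name `internal_search` and both helper functions are made up for illustration, and provider formats can change, so check the current docs.

```python
from typing import Any

# One tool, described once. The field names follow the MCP tool listing
# shape; "internal_search" is a hypothetical tool used only as an example.
SEARCH_TOOL: dict[str, Any] = {
    "name": "internal_search",
    "description": "Search internal docs by keyword.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}


def to_openai(tool: dict[str, Any]) -> dict[str, Any]:
    """Wrap the tool in the nested "function" shape (assumed OpenAI format)."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["inputSchema"],
        },
    }


def to_anthropic(tool: dict[str, Any]) -> dict[str, Any]:
    """Flatten the tool with an "input_schema" key (assumed Anthropic format)."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["inputSchema"],
    }
```

The JSON Schema is identical in every case. Only the envelope changes, and that is the glue code a shared spec lets you stop writing per provider.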

Now I’m seeing the same thing happen one layer higher in infra too. Bifrost, LiteLLM, Kong AI Gateway and similar stuff all seem to be solving the same underlying problem: too many providers, too many SDKs, too many integrations, too many moving parts.

None of this stuff is technically impossible to build in-house. But after a point you realize unified interfaces are just easier to live with.

reddit.com
▲ 8 r/mcp

We hit a point recently where managing the AI stack started becoming more work than building the actual product.

Not even because anything broke badly, but because the ecosystem changed so fast. Earlier it was mostly just one provider and a simple API call. Now every model is good at something different, so suddenly there’s OpenAI for one workflow, Claude for another, Gemini somewhere else, DeepSeek for cost reasons, local models for specific tasks, and then routing between all of them starts becoming its own thing.

Then MCP servers, tool calling, agents, retries, fallbacks, tracing, observability and remote execution slowly enter the picture and the stack becomes way larger than expected.
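For a sense of how quickly "retries and fallbacks" turns into its own code, here is a minimal Python sketch. It assumes every provider is hidden behind a plain `prompt -> text` callable. All names here (`complete_with_fallback`, `AllProvidersFailed`) are hypothetical and this is not how Bifrost, LiteLLM or Kong actually implement it.

```python
import time
from typing import Callable, Sequence

# A provider is any callable that takes a prompt and returns text.
# Real SDK calls would be wrapped to fit this signature (assumption).
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider has exhausted its retries."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Provider]],
    retries_per_provider: int = 2,
    backoff_seconds: float = 0.0,
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, response_text)."""
    errors: list[str] = []
    for name, call in providers:
        for attempt in range(1, retries_per_provider + 1):
            try:
                return name, call(prompt)
            except Exception as exc:  # broad on purpose: any failure falls through
                errors.append(f"{name} attempt {attempt}: {exc}")
                if backoff_seconds > 0:
                    # Linear backoff before the next attempt.
                    time.sleep(backoff_seconds * attempt)
    raise AllProvidersFailed("; ".join(errors))
```

You would call it with an ordered list like `[("primary", call_primary), ("cheap", call_cheap)]`. Tracing, cost-based routing and streaming would all have to be layered on top of this, which is roughly where a gateway starts to look attractive.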

Feels like that’s why open source tools like Bifrost, LiteLLM, Kong AI Gateway, etc. are becoming important now. Not because people want more tooling, but because modern AI systems suddenly have way more moving parts underneath them.

u/Interesting-Area6418 — 2 days ago

Lately been noticing a pretty interesting shift around open source

Few years back it mostly felt like companies open sourced stuff for community goodwill or just dev marketing, but now it genuinely feels like one of the strongest ways to build trust and distribution.

You can kinda see it everywhere now. Hugging Face with open models and AI infra stuff, Bifrost in the LLM gateway space, Supabase with backend infra, Unsloth around LLM fine tuning, and a lot of newer AI/devtool startups too.

Feels like companies are realizing that once developers trust your product, other tech companies slowly start trusting it too. And open source speeds that up a lot because people can actually inspect things themselves instead of just reading polished landing pages.

Especially in AI where people are getting more skeptical of black box products, this feels even more relevant now.

Honestly feels like open source is slowly turning into an actual competitive advantage instead of just a philosophy/community thing. Part of why I recently started building my own open source org too.

u/Interesting-Area6418 — 3 days ago

diff-forge now supports better configurability for Captioning + Resizing normalization

I posted about diff-forge here a few days back and got a lot of feedback + DMs from people training WAN/LTX models.

A common problem people mentioned was captioning and resizing image/video datasets so they actually fit the training requirements. Which is fair, because preprocessing sometimes turns into a bigger headache than the actual training. So I built this tool.

I’ve now made several of its features more configurable.

For captioning, you can now choose between:

  • first-frame captioning
  • all-frame captioning
    • configurable number of extracted frames
    • configurable grid/row layouts
    • auto sizing for extracted frames

The all-frame workflow is especially useful when working with motion-heavy clips where single-frame captions miss too much context.

Also added configurable normalization/cropping for the dataset items.

A lot of raw video datasets are messy and inconsistent, so this makes it much easier to get clips into training-ready format without manually patching everything in ffmpeg (At least I used to do that :p).

Been building these tools mainly because we needed them internally, but putting them out publicly has been fun too. Let me know what things I can improve on further.

Repo:

https://github.com/Oqura-ai/diff-forge

Discord:

https://discord.gg/Q586EsTxjh

u/Interesting-Area6418 — 11 days ago
▲ 13 r/OpenSourceeAI+1 crossposts

Building an open source research organization

A few months back we started building internal tools for ourselves while working with LLMs, research workflows, synthetic datasets, RAG pipelines, diffusion training and all that stuff.

Most of it started because we were tired of doing repetitive manual work again and again.

At some point we thought instead of keeping these tools private, why not just open source them and build publicly.

That’s how Oqura started.

One of the projects, deepdoc, unexpectedly crossed 270⭐ on GitHub. It’s basically a deep research agent for local files and folders, so you can generate reports and run research directly on your own docs, PDFs, notes, datasets and codebases instead of only relying on internet search.

Since then we’ve been building more tools around:

- synthetic dataset generation

- deep research based dataset workflows

- diffusion dataset preprocessing

- RAG optimization

- documentation navigation

We’re still students, so honestly a lot of this is just us learning in public while building things we wish already existed.

The best part so far has been random developers and researchers actually using these tools, opening issues, suggesting features and contributing ideas.

We’re probably going to keep building more open source research tools like this. Do share what you’d like to see, or any improvements you need from these tools.

GitHub org: https://github.com/Oqura-ai

u/Interesting-Area6418 — 11 days ago
▲ 53 r/comfyui+1 crossposts

I’ve been working on my startup and had to train diffusion models for animations.

Realized the worst part is not training, it’s the dataset prep.

Especially with stuff like LTX models where things have to follow specific rules like frame counts (8n+1) and resolution constraints. You take random clips and almost nothing fits directly, so you end up trimming, resizing, fixing frames, adding captions… just a lot of repetitive work.
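The 8n+1 rule is easy to express in code. Here is a minimal Python sketch that trims a clip down to the nearest valid length. Trimming (rather than padding) is an assumption, the function names are hypothetical, and the resolution multiple is just a parameter because the exact value depends on the model.

```python
def is_valid_frame_count(frame_count: int) -> bool:
    """True if frame_count has the form 8n + 1 (1, 9, 17, 25, ...)."""
    return frame_count >= 1 and (frame_count - 1) % 8 == 0


def nearest_valid_frame_count(frame_count: int) -> int:
    """Largest 8n + 1 value that does not exceed frame_count."""
    if frame_count < 1:
        raise ValueError("frame_count must be at least 1")
    return ((frame_count - 1) // 8) * 8 + 1


def snap_down(value: int, multiple: int) -> int:
    """Round a width or height down to a multiple.

    The multiple is model-specific; pass whatever your trainer requires.
    """
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return (value // multiple) * multiple
```

So a 100-frame clip gets trimmed to 97 frames (8 * 12 + 1), and with a placeholder multiple of 32, a 1080-pixel side snaps down to 1056.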

So I built a tool for myself over the weekend to deal with it.

It’s fully open source. Runs local-first with a simple UI + FastAPI backend, uses FFmpeg underneath.

You basically drop your raw videos and it just handles all that stuff. Checks what’s wrong, fixes it, lets you tweak things if needed, and gives you a clean dataset ready for training. Also gives you a good level of control across the whole pipeline, so you’re not locked into rigid preprocessing.

It also has a bulk captioning feature that works across the whole dataset.

Currently it supports LTX and WAN, and I’ll be adding support for more models soon.

Been using it myself and it made things way smoother, so putting it out.

Also I keep building similar small open source tools like this and putting them out. You’ll find a few more in my GitHub org, so I was thinking of starting a small Discord where people working on similar stuff can share ideas, suggest features, or just discuss what to build next. Feel free to join if that sounds useful.

Repo: https://github.com/Oqura-ai/diff-forge

Discord: https://discord.gg/Q586EsTxjh

u/Interesting-Area6418 — 16 days ago