u/Emperoraltros

Hybrid cloud + local LLM stack for a real-time game coaching app, what I learned

Lead dev at a small indie studio. We just shipped fine-tuned personas for a CS2 coaching tool on a hybrid architecture, and the design tradeoffs were interesting enough that I wanted to share.

Stack:

  • Primary inference: Groq cloud, Llama 3.3 70B for the text coach, Llama 4 Scout 17B for vision, with 8B fallback on rate limits
  • Local fallback: Llama 3.1 8B base with 4 LoRA adapters fine-tuned per persona (harsh, analytical, patient, pattern-observer), served via Ollama + llama.cpp
  • Routing: cloud first if tokens available, local fallback if cloud unavailable or user is on free tier

The reason for the hybrid: cloud gives you the quality ceiling, local gives you the privacy/cost floor. Free-tier users and offline play hit Ollama. Paid users hit Groq for the better reasoning. Same persona prompts across both paths, just different backends.
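
The routing rule above can be sketched in a few lines. This is a hypothetical reconstruction, not the shipped code: the backend names and the `User` shape are illustrative assumptions.

```python
from dataclasses import dataclass

# Illustrative sketch of the cloud-first / local-fallback routing rule.
# Backend identifiers and the User fields are assumptions, not real code.

@dataclass
class User:
    paid: bool  # free-tier users are routed to the local model by design

def pick_backend(user: User, cloud_up: bool, tokens_left: int) -> str:
    """Cloud if the user is paid and Groq is reachable with quota left;
    otherwise fall back to the local Ollama model."""
    if user.paid and cloud_up and tokens_left > 0:
        return "groq/llama-3.3-70b"
    return "ollama/llama-3.1-8b-lora"
```

Keeping the persona prompts identical across both return values is what makes the fallback feel like a quality downgrade rather than a different product.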

What I learned on the local fine-tuning side (the part most people in this sub care about):

What worked:

  • Hand-authored training data beat synthetic at small scale. 200 hand-written examples per persona outperformed 2,000 generated ones. Synthetic sounded right but was structurally wrong: too verbose, too hedge-y.
  • Voice spec documents before training data. 2-3 page spec per persona (what words they use, pacing, failure modes), then training data written against the spec. Without the spec, training data drifts.
  • Personas with focused scenario coverage beat personas trying to be good at everything.

What failed:

  • LoRA dropout above 0.05 with rank 8 on a 500-example dataset overfit hard. Loss dropped to 0.05 in 2 epochs and the model memorized training data verbatim, including meta-instructions like "respond in the voice of...". Retrained with dropout=0 and loss landed at 1.2, which was usable.
  • Pattern-recognition persona was the hardest by far. Multi-round implicit-state reasoning is genuinely hard at 8B. Closed-form math (round equity, buy decisions) was trivial in comparison.
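
The overfit at rank 8 is less surprising once you count trainable parameters. Here's a back-of-envelope, assuming attention-only LoRA targets (q/k/v/o) and the public Llama 3.1 8B dims; the exact target modules in my setup may differ, so treat the numbers as ballpark:

```python
# Back-of-envelope: trainable parameters for rank-8 LoRA on Llama 3.1 8B,
# attention projections only (q/k/v/o). Dims from the public config:
# hidden 4096, 32 heads, 8 KV heads (GQA -> k/v out dim 1024), 32 layers.
# LoRA adds r * (d_in + d_out) params per targeted linear layer.
r = 8
layers = 32
projs = [(4096, 4096), (4096, 1024), (4096, 1024), (4096, 4096)]  # q, k, v, o

per_layer = sum(r * (d_in + d_out) for d_in, d_out in projs)
total = per_layer * layers
print(total)        # roughly 6.8M trainable params
print(total / 8e9)  # under 0.1% of the base model, but still millions
```

Millions of trainable parameters against 500 examples is a memorization machine; that ratio, more than any single hyperparameter, is why the loss cratered.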

Infrastructure stuff:

  • GGUF export is fragile. Version mismatches between llama.cpp and conversion tooling cost me 2 days. Lock the conversion env, version-pin everything.
  • Eval harness is its own problem. Loss numbers don't tell you if a persona feels right. I run the same scenario through all 4 personas and read outputs side by side. That subjective check caught more issues than any automated metric.
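
The side-by-side eval can be a very small script. This sketch assumes each LoRA adapter is exposed as its own Ollama model (the persona model names here are made up) and uses Ollama's `/api/generate` endpoint with `stream: false`:

```python
import json
import urllib.request

# Hypothetical persona model names; assumes an Ollama server on the
# default port with one model per LoRA adapter.
PERSONAS = ["coach-harsh", "coach-analytical", "coach-patient", "coach-savant"]

def ask(model: str, prompt: str, host: str = "http://localhost:11434") -> str:
    """One non-streaming completion from a local Ollama model."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(f"{host}/api/generate", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

def side_by_side(prompt: str, ask_fn=ask) -> str:
    """Same scenario through every persona, formatted for human reading."""
    blocks = [f"=== {p} ===\n{ask_fn(p, prompt)}" for p in PERSONAS]
    return "\n\n".join(blocks)

if __name__ == "__main__":
    print(side_by_side("Round 7, up 5-1, enemy on a $2000 eco. Call it."))
```

The output is just text you read with your eyes; the whole point is that the judgment stays human.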

What I'm still figuring out:

  • Hybrid routing observability. When cloud falls through to local, the user experience differs subtly. Capturing where the handoff happened and how output quality compares is something I haven't solved cleanly.
  • Post-deployment feedback loop. User thumbs up/down becomes the next training set, but quality-gating is hard. Novice flagging an expert call as wrong is anti-signal. Working on a skill-weighted feedback system but it's not done.
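
The skill-weighted idea reduces to a weighted vote. A minimal sketch, where the weighting scheme is my illustration and not the unfinished system itself:

```python
# Sketch of skill-weighted feedback: weight each thumbs up/down by an
# estimate of the rater's skill, so a novice down-voting an expert-level
# call doesn't drag the label down. The curve is an assumption.

def weighted_label(votes: list[tuple[int, float]]) -> float:
    """votes: (vote, skill) pairs, vote in {+1, -1}, skill in [0, 1].
    Returns a score in [-1, 1]; promote the example to the next training
    set only when the score clears a confidence threshold."""
    total_w = sum(skill for _, skill in votes)
    if total_w == 0:
        return 0.0
    return sum(vote * skill for vote, skill in votes) / total_w
```

With this shape, one high-skill downvote can outweigh several low-skill upvotes, which is exactly the anti-signal problem stated above.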

Happy to answer questions on hyperparameters, hybrid routing decisions, GGUF wrangling, persona design, eval harness, whatever. The hybrid architecture stuff in particular doesn't get talked about much in this space, mostly because everyone's either pure cloud or pure local. There's a real middle ground.

Discord if you want to follow along: https://discord.gg/tTE5aFeq

Steam page: https://store.steampowered.com/app/4659510/Game_Demon

reddit.com
u/Emperoraltros — 3 days ago
▲ 1 r/csgo

Genuine question for ranked players about in-round coaching

Working on a CS2 coaching tool that gives feedback during freezetime — round equity math, what the opponent's economy implies, that kind of thing. Solo dev project.

The thing I keep going back and forth on: would in-round coaching actually help during ranked, or would it just add noise during the intense part of the round?

For people who've actively tried to climb: do you want the "opponent has $4400, 60% likely to save" call between rounds? Or would you find it distracting?

Genuinely trying to figure out if I'm building something useful or solving a fake problem.

Happy to share the Steam link in the comments if anyone wants to see what it looks like.

u/Emperoraltros — 5 days ago
▲ 5 r/IndieGameWishlist+2 crossposts

I trained 4 AI coaching personas on consumer hardware for a CS2 coach. Here's what worked and what completely failed.

Solo dev, 14 months in. Shipping an AI coach for Counter-Strike 2 that talks to you between rounds. Four personas, each with a distinct voice (harsh, analytical, warm, pattern-observer). Hand-tuned QLoRA fine-tunes on top of a 7B base. Local inference via llama.cpp, runs on 8GB VRAM.

Wanted to share what I learned because I burned a lot of time on stuff that didn't work.

What worked

Hand-authored training data beats synthetic almost every time. I tried generating training data with a frontier model and it produced output that sounded right but was structurally wrong. Too verbose, too quick to hedge, too willing to repeat itself. 200 hand-written examples per persona outperformed 2,000 generated ones.

Voice-spec documents before training data. I wrote a 2-3 page voice spec for each persona before writing a single training example. What words they use, what they don't, what their pacing is, what their failure modes are. Then I wrote training data against the spec. Without the spec, training data drifts.

Closed-form economy reasoning is the easy part. Round equity given economy state, save thresholds given individual money, force-buy probabilities. These are math problems. The model gets them right consistently because the rules are deterministic.
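
To make "closed-form" concrete, here's a toy version of the buy-decision logic. The thresholds are illustrative round numbers, not the model's actual training targets:

```python
# Toy buy-decision classifier for CS2 economy coaching.
# Dollar thresholds are illustrative assumptions, not real training data.
FULL_BUY = 4700   # rifle + armor + utility, roughly
FORCE_MIN = 2500  # below this, forcing buys you almost nothing

def buy_call(money: int, loss_streak: int) -> str:
    """Classify the buy from current money and loss streak."""
    if money >= FULL_BUY:
        return "full buy"
    # deep loss streak -> high loss bonus next round, so a failed force
    # is cheaper to recover from
    if money >= FORCE_MIN and loss_streak >= 2:
        return "force"
    return "eco"
```

This is the kind of function a small model can learn reliably, because the mapping from inputs to the right call never changes.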

What completely failed

Dropout above 0.05 with rank=8 LoRA on a small dataset (500 examples). Hit the overfit wall hard. Loss dropped to 0.05 in 2 epochs and the model memorized the training data verbatim, including meta-instructions like "respond in the voice of...". Retrained with dropout=0 and got loss to 1.2. Usable.

Trying to make all four personas equally good at all four scenarios. Personas have natural strengths. Demon (harsh) is amazing at confrontational moments and bad at long pep-talks. Veteran (mentor) is the inverse. Forcing each persona to cover all scenarios diluted them. Once I let each persona be specifically good at 2-3 scenario types, all four got sharper.

Pattern-observer persona (Savant) is the hardest by far. Pattern recognition in CS2 demos requires the model to track sequences across multiple rounds and reason about opponent tendencies. The closed-form economy stuff is trivial in comparison. Still iterating.

The infrastructure stuff nobody talks about

GGUF export is fragile. Random version mismatches between llama.cpp and the conversion tooling cost me 2 days. Fix: lock the conversion environment, version-pin everything.

Training data hygiene matters more than training data volume. A 30-minute contamination check (looking for repeated phrases that would cause echo-chamber outputs) saved me a 6-hour retrain.
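
The contamination check is basically an n-gram frequency pass. A minimal sketch of that idea, where the n-gram size and threshold are guesses rather than my actual numbers:

```python
from collections import Counter

# Flag word n-grams that recur across a suspicious fraction of training
# examples; those are the phrases the model will echo back verbatim.
# n and max_frac are illustrative defaults, not tuned values.

def flag_repeats(examples: list[str], n: int = 5, max_frac: float = 0.05):
    counts = Counter()
    for text in examples:
        words = text.lower().split()
        seen = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
        counts.update(seen)  # count each n-gram once per example
    limit = max(2, int(max_frac * len(examples)))
    return [" ".join(g) for g, c in counts.items() if c >= limit]
```

Anything it flags gets rewritten or dropped before training, which is much cheaper than discovering the echo after a full fine-tune.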

Persona evaluation is its own problem. Loss numbers don't tell you if the persona feels right. I built a tiny eval harness where I run the same scenario through all 4 personas and read the outputs side-by-side. That subjective check caught more issues than any automated metric.

What I'm still figuring out

The post-launch feedback loop. Real users will give thumbs up/down on coaching calls, and that becomes my next training set. But quality-gating user feedback is hard. A Silver-tier player flagging a correct Global-Elite-level read as "wrong" is anti-signal. Working on a rank-weighted feedback system but it's not done.

Happy to answer specific questions on any of this: LoRA hyperparameters, GGUF weirdness, persona design, evaluation harness, whatever. The actual reason I'm posting: the dev-process stuff above is what I wish I'd read 14 months ago.

If you want to see what the result looks like, the Steam page is here: https://store.steampowered.com/app/4659510/Game_Demon. Launching this month.

u/Emperoraltros — 2 days ago

For anyone who saw my earlier post asking for testers — Game Demon is on Steam now (page just got approved, in technical review for ~5-7 days). Wanted to give an update since some of you helped shape it.

Quick refresher on what it is:

AI coach that watches your match in two ways at once. Reads Valve's Game State Integration (the official API — knows your money, round count, team economy, halftime swap, bomb state) AND samples your screen periodically through a visual coach layer. So it has both ground truth from Valve's API and the actual visual context of what you're seeing. Most other tools do one or the other. Tactical callouts during the round, demo review after the match.
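
For anyone curious what the GSI side looks like: CS2 POSTs JSON game state to whatever HTTP endpoint you list in a `gamestate_integration_*.cfg` file. A minimal receiver sketch; the field paths (`player.state.money`, `round.phase`) match the GSI payloads I've seen, but treat them as assumptions and guard for absence:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_state(payload: dict) -> dict:
    """Pull out the two fields the coach cares about most.
    Field paths are assumptions based on observed GSI payloads."""
    return {
        "money": payload.get("player", {}).get("state", {}).get("money"),
        "phase": payload.get("round", {}).get("phase"),
    }

class GSIHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        raw = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        state = parse_state(json.loads(raw or b"{}"))
        if state["phase"] == "freezetime":
            pass  # this is where the coach would speak
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    # Port must match the "uri" in your gamestate_integration_*.cfg
    HTTPServer(("127.0.0.1", 3000), GSIHandler).serve_forever()
```

Nothing here touches the game process; the game pushes data out, which is why the anti-cheat surface is so small.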

What I want to flag specifically for this community:

The base coach ("Demon" — fine-tuned Llama 3.1 8B) runs LOCALLY on your machine. No subscription, no cloud, your data doesn't leave your PC. If you have an 8GB+ GPU you can use the local model indefinitely with the one-time app purchase. The cloud personas (different coaching styles) are paid but optional — long-term plan is for every persona to run both locally and cloud, you pick what fits your hardware.

On the "is this safe" question: GSI is Valve's official integration. The visual coach uses standard screen capture APIs — no DLL injection, no memory reading, no game modification. Same surface streamers use for HUD overlays. Nothing writes back into the game.

If anyone wants beta access before launch, Discord: https://discord.gg/c8tjQRjgMR

Steam page: https://store.steampowered.com/app/4659510/Game_Demon/

Happy to answer real questions about the GSI integration, the visual coach pipeline, the persona system, why I went local-first, or what the local model can and can't do at 8B params.

https://www.altrosstudios.games/gamedemon/

u/Emperoraltros — 8 days ago
▲ 0 r/INAT

Hey everyone,

I'm Elijah, Founder and Lead Developer at Altros Studios LLC. We're building Celestium — a 10v10 third-person MOBA in Unreal Engine 5.6 set on Solace, a living alien planet where flight-capable heroes fight across a massive three-tier vertical map.

This isn't a pitch deck dream. Engineering is far along — 38,000+ lines of C++ across 200+ files, 4 playable heroes with full 8-ability kits, flight mechanics, 31 gameplay systems, SpacetimeDB multiplayer, RL-trained AI bots, and a complete game flow from boot to post-match. We have a concept artist delivering hero concepts and a world designer greyboxing the map in UE5. We need a 3D character modeler to start turning concepts into game-ready meshes.

What you'd be doing:

Taking hero concept art (full turnarounds with material notes, expression sheets, and silhouette tests from our concept artist) and modeling them into game-ready 3D characters

Sculpting, retopologizing, and UV-mapping hero meshes for real-time use in UE5

Creating PBR materials and textures (Substance Painter workflow)

Rigging and skinning characters for UE5's animation system (or collaborating with a rigger if we bring one on)

Working from detailed design documentation — every hero has lore, faction identity, visual unifiers, and gameplay role specs already written

Starting with our MVP heroes: Solarion (Aetherian, astral-material body, solar flares, no metal armor), Nyx (Voidborn, pure shadow with starlight, void daggers), Erla (Celestial Nymph, bioluminescent vines, organic), and Gwen (human tech scavenger, mismatched pirate gear, drone companion)

Our art direction targets hyper-realistic quality — think Baldur's Gate 3, not Fortnite. Cosmic-fantasy aesthetic with bioluminescent alien flora, celestial energy, and faction-specific visual identities.

What we're looking for:

Portfolio showing character modeling for games (realistic or semi-realistic, organic and hard-surface)

Proficiency in Maya, ZBrush, or Blender for sculpting and modeling

Experience with Substance Painter for PBR texturing

Understanding of real-time topology constraints (polycount budgets, LODs, UE5 Nanite awareness)

Familiarity with rigging basics (full rigging expertise is a plus but not required)

Ability to work from concept art turnarounds and written design briefs

Comfortable working async on Discord

What we offer:

A small, tight-knit team where you're not just another cog — your work directly shapes what gets built

Thorough design documentation to work from — no guesswork, real briefs with clear direction

Concept art already being delivered by our concept artist — you're not designing from scratch, you're translating strong concepts into 3D

Equity with a standard vesting schedule — real ownership in what we're building, not just a line on a resume

Structured workflow with ClickUp, Discord, and clear sprint goals

Daily check-ins (non-negotiable — a few bullet points, takes 5 minutes) and a weekly standup

NDA and IP Assignment required before onboarding

About the team:

We're currently a core of 4 — myself on engineering, a concept artist delivering hero concepts, a creative producer/world designer building out the map greybox in UE5, and a sound designer. We also have a strategic advisor with 20 years of industry experience who runs his own startup. We use Perforce for version control and Discord for comms. The culture is collaborative, casual, and direct. No corporate speak, no ego games. We're here because we love building things and we think the MOBA genre is ready for something new.

A note on AI: Altros Studios builds around AI-assisted workflows — we expect our team to use AI tools where they're effective. Modelers are encouraged to integrate tools like Hunyuan3D for base mesh generation, ComfyUI for texture reference and concept iteration, and AI-assisted retopology tools into their pipeline. Your sculpt, your topology, your textures — AI helps you get there faster. That said, we do not accept fully AI-generated deliverables. Your portfolio should be your own hand, and your work for us will be too. The human drives the vision. AI accelerates the process.

If this sounds like your kind of project, drop your portfolio below, DM me, or reach out at elijah@altrosstudios.games. I'd love to chat.

— Elijah, Altros Studios
altrosstudios.games

u/Emperoraltros — 17 days ago