Rolling out Claude Code to 15 devs — Vertex + LiteLLM instead of direct API. Good idea or overkill?
Hey, we're in the process of rolling out Claude Code to our 15-dev team and figuring out the right architecture before we commit.
Instead of going direct API, we're leaning toward routing through LiteLLM + Google Vertex AI — mainly for per-dev token visibility, model flexibility without touching everyone's config, and audit logs for compliance. Anyone running Claude Code through a proxy layer like this? How's the latency in practice, and is the observability actually worth it day to day?
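For context, here's roughly the shape we're considering — a LiteLLM proxy config fronting Vertex, with Claude Code pointed at the proxy via env vars. Project ID, region, and model IDs below are placeholders, so double-check them against the current LiteLLM and Vertex docs before copying:

```yaml
# litellm config.yaml (sketch) — run with: litellm --config config.yaml
model_list:
  - model_name: claude-sonnet          # alias devs see; swap the backing model here
    litellm_params:
      model: vertex_ai/claude-sonnet-4 # placeholder model ID
      vertex_project: our-gcp-project  # placeholder GCP project
      vertex_location: us-east5        # placeholder region

general_settings:
  master_key: sk-team-master           # placeholder; issue per-dev virtual keys off this
```

Then each dev would set something like `ANTHROPIC_BASE_URL=https://litellm.internal:4000` and `ANTHROPIC_AUTH_TOKEN=<their virtual key>` so Claude Code talks to the proxy instead of the API directly, and per-key usage shows up in LiteLLM's spend tracking. That's the theory, anyway — hence the latency question.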
---
Second thing: to standardize how the team uses Claude Code, we're
putting together an internal plugin that bundles our own skills, hooks,
and workflows so everyone installs the same thing from our repo instead
of each dev reinventing their setup. Think code review workflows, testing patterns, commit hooks — stuff that should be consistent across the team.
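The rough shape we have in mind (directory names based on our reading of the Claude Code plugin docs — verify against the current spec, this is a sketch, not a tested layout):

```
our-claude-toolkit/
├── .claude-plugin/
│   └── plugin.json      # name, version, description
├── commands/            # shared slash commands (markdown)
├── skills/              # team skills, one SKILL.md per dir
└── hooks/
    └── hooks.json       # e.g. pre-commit / formatting hooks
```

The idea is devs install one thing from our internal repo and updates ship centrally, rather than everyone curating their own `~/.claude` setup.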
Has anyone maintained something like this long-term? Curious whether it actually sticks or becomes a ghost repo nobody touches after month 2.