u/ChallengeKooky581

Best local coding models for RTX 4070 Ti 12GB + 32GB DDR5 RAM?

Hi everyone, I’m trying to build a good local AI coding setup and I’d like some advice from people who already run coding models locally.

My current PC has an RTX 4070 Ti with 12GB VRAM and 32GB RAM. My idea is to use a stronger cloud model for architecture, planning, and breaking projects into steps, while the local model handles the actual coding and implementation work. Right now I’m mostly interested in finding the best local coding models I can realistically run on this hardware without the experience becoming too slow or unstable. I keep seeing people recommend Qwen Coder, DeepSeek Coder, and Codestral, but I’m not sure which ones are actually worth using on a 4070 Ti.

I’d also appreciate advice about quantization, context length, and what runtime/tools work best for coding workflows.
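For context, here's the back-of-envelope math I've been using to guess what fits in 12GB (quantized weights plus a fp16 KV cache). The layer/head numbers in the example are rough assumptions for a ~14B GQA model like Qwen2.5-Coder-14B, not verified specs, so please correct me if they're off:

```python
def vram_estimate_gb(params_b, bits_per_weight, n_layers, n_kv_heads,
                     head_dim, context_len, kv_bytes=2):
    """Rough VRAM estimate in GB: quantized weights + KV cache.

    Ignores activation buffers and runtime overhead (budget ~1 GB extra).
    kv_bytes=2 assumes an fp16 KV cache; q8 cache would halve that term.
    """
    # Weight memory: params * bits per weight, converted to GB
    weights = params_b * 1e9 * bits_per_weight / 8 / 1e9
    # KV cache: 2 (K and V) * layers * kv_heads * head_dim * bytes * tokens
    kv = 2 * n_layers * n_kv_heads * head_dim * kv_bytes * context_len / 1e9
    return weights + kv

# Assumed figures: 14B params at ~Q4_K_M (~4.5 bits/weight),
# 48 layers, 8 KV heads, head_dim 128, 16k context
print(round(vram_estimate_gb(14, 4.5, 48, 8, 128, 16384), 1))  # ≈ 11.1 GB
```

If that math is roughly right, a 14B model at Q4 with 16k context is already tight on 12GB, which is why I'm wondering whether people drop to a lower quant, a smaller model, or offload some layers to system RAM.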

My priority is coding quality and reliability more than raw speed. If anyone has a similar setup, I’d really appreciate hearing what models and configurations worked best for you.
