u/Material_Tone_6855

Just got a new baby for my AI local journey - Need some Suggestions

I just got a new baby for my AI journey. I'm coming from a 4060 8GB (capable of properly running Qwen 3.6 35B A3B), but I need more VRAM and compute, so I went looking for the GPU with the best price/performance on the market.

So I got this 3090 with 24GB of memory (three times the memory of the 4060).

I still don't know if I'm going to keep the 4060 to run small models and use the 3090 for dense models with MTP (multi-token prediction).

Any suggestions?

P.S. power supply upgrade on the way.

P.P.S.

My current setup:
- CPU: AMD Ryzen 9 7900X (12 cores / 24 threads)
- RAM: 64GB DDR5 5600MHz
- MoBo: Gigabyte B650 GAMING X AX V2
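As a rough sanity check for what fits on the 3090, here's a back-of-the-envelope weight-size estimate (a sketch in Python; the ~4.85 bits/weight figure for Q4_K_M is an approximation, and real usage adds KV cache and runtime overhead on top):

```python
def quant_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size in GB of the model weights alone at a given quant."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# ~4.85 bits/weight is a common rough figure for Q4_K_M (assumption)
print(round(quant_size_gb(35, 4.85), 1))  # 35B model at Q4_K_M
print(round(quant_size_gb(35, 8.5), 1))   # same model around Q8_0
```

By this estimate the Q4_K_M weights of a 35B model squeeze into the 3090's 24GB (with room needed for context), while a Q8-class quant would not.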

u/Material_Tone_6855 — 3 days ago

Hi guys, I'm playing with the fork of llama-server that introduces support for MTP, and before downloading hundreds of GB of "dumb" models I'm here to ask for your help.

What's the best 35B A3B quant for agentic stuff?

I've tried the official Q4_K_M with KILO as the coding agent, and even though it's pretty fast on my 8GB 4060, it's not able to properly close tool tags while generating streamed responses.

I've also tried the suggested sampling params (temp, top_p and so on), but that's still the only response I get.
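A quick way to check whether a streamed response actually closed its tool tags is to scan the accumulated text for matched `<tool_call>` wrappers (the XML-style wrapper Qwen-family chat templates typically emit); a minimal sketch, with illustrative tool names:

```python
import json
import re

def extract_tool_calls(text: str):
    """Return parsed JSON bodies of properly closed <tool_call> blocks,
    plus a flag indicating any opening tag was left unclosed."""
    closed = re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", text, re.S)
    dangling = text.count("<tool_call>") > len(closed)
    calls = []
    for body in closed:
        try:
            calls.append(json.loads(body))
        except json.JSONDecodeError:
            pass  # closed tag but malformed JSON inside
    return calls, dangling

# A well-formed call parses; a truncated one is flagged as dangling.
ok, _ = extract_tool_calls(
    '<tool_call>{"name": "read_file", "arguments": {"path": "a.py"}}</tool_call>'
)
_, truncated = extract_tool_calls('<tool_call>{"name": "read_file"')
```

Running a check like this on the raw stream makes it easy to tell whether the quant is emitting malformed tags or the agent is mis-parsing well-formed ones.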

Before downloading a different quant, I want to know which model you are using and what results you're getting.

P.S. Yesterday I built the forked llama-server version with MTP support from source, so I'm ready for models that support it.

u/Material_Tone_6855 — 7 days ago

Hi everyone!

I’m the creator of d1-prisma, a bridge CLI designed to make managing Cloudflare D1 migrations with Prisma 7 a breeze. I just released version 1.1.1, which brings some highly requested features to improve the DX, especially for complex projects.

What is d1-prisma? Cloudflare D1 is great, but bridging it with Prisma's migration workflow can be tricky. d1-prisma automates the generation of SQL migrations via prisma migrate diff and applies them to your local or remote D1 instances, keeping your Prisma state in sync.

What’s new in v1.1.1:

  • 📦 Native Monorepo Support: Dealing with .wrangler state files in a monorepo (Turborepo/Nx) used to be a pain. Now you can define a wranglerDataDir in your config. This automatically points to your root .wrangler state and passes the --persist-to flag to Wrangler commands.
  • 📄 New JSON Configuration: I’ve simplified the config to a single d1-prisma.config.json.
  • 🔍 JSON Schema & Autocomplete: The new config comes with a built-in $schema. Point your editor to it and get instant autocomplete and validation for all parameters (database, migrationsDir, wranglerDataDir, etc.).
  • 🚀 Prisma 7 Ready: Fully compatible with the new Prisma 7 configuration style (prisma.config.ts).
  • 🛠️ Better Debugging: A new --log debug flag that traces exactly how paths are resolved, which environment variables are being used, and which shell commands are being fired.
  • ⚡ Lighter & Faster: Switched to a zero-transpiler approach for configuration loading, making the CLI faster and the bundle significantly smaller.

Quick Monorepo Example: In your packages/database/d1-prisma.config.json:

{
  "$schema": "./node_modules/d1-prisma/schema.json",
  "database": "my-db",
  "wranglerConfig": "../../wrangler.toml",
  "wranglerDataDir": "../../.wrangler/state"
}
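For intuition, the monorepo behavior described above amounts to resolving `wranglerDataDir` relative to the config file and forwarding it to Wrangler as `--persist-to`. A rough Python sketch of that resolution (illustrative only, not the actual d1-prisma code):

```python
from pathlib import Path

def persist_to_flag(config_path: str, wrangler_data_dir: str) -> str:
    """Resolve the shared .wrangler state dir relative to the config file
    and format it as Wrangler's --persist-to flag."""
    base = Path(config_path).parent
    state = (base / wrangler_data_dir).resolve()
    return f"--persist-to={state}"

# With the example config above, both values point at the repo root's state:
flag = persist_to_flag(
    "packages/database/d1-prisma.config.json", "../../.wrangler/state"
)
```

This is why every package in the monorepo ends up sharing one local D1 state instead of each spawning its own `.wrangler` directory.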

Check it out on GitHub: https://github.com/Hiutaky/d1-prisma

Install it:

npm install -g d1-prisma
# or
bun add -g d1-prisma

I’d love to hear your feedback or answer any questions about using Prisma with D1. If you find it useful, feel free to drop a star on GitHub!

u/Material_Tone_6855 — 8 days ago

I'm playing with the latest MoE Qwen model, running it locally using LM Studio and a 4060 8GB. The setup was super easy, and after some tests I was able to find the sweet spot to "saturate" my VRAM and get nice performance during both prompt processing and generation.

The real problem is that the model is not able to properly compose the "tool_calls" array. I can tweak the system prompt as much as I want, but the array still comes back empty while the model tries to use the tool during its reasoning process.

( attaching my current setup )

Btw, I'm testing classical HTTP requests ( no SSE or stream ).
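For reference, this is roughly the shape of a non-streaming request with a `tools` array in the OpenAI-compatible format that LM Studio's local server speaks; a minimal sketch that only builds the payload (the model name, tool name, and endpoint in the comment are illustrative):

```python
import json

def build_chat_payload(model: str, prompt: str) -> dict:
    """Non-streaming chat completion request declaring one callable tool."""
    return {
        "model": model,
        "stream": False,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool name
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

payload = build_chat_payload("qwen-moe", "What's the weather in Rome?")
body = json.dumps(payload)
# POST body to e.g. http://localhost:1234/v1/chat/completions (default LM Studio port)
```

If the request declares tools this way and `tool_calls` still comes back empty while the reasoning text describes a tool invocation, the issue is likely the model/quant's tool-call formatting rather than the request shape.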

https://preview.redd.it/91nv8ghwcbyg1.png?width=1426&format=png&auto=webp&s=bd12478fbd800347b10b1a889a5266b1c2fe7bc3

u/Material_Tone_6855 — 14 days ago