u/Bulky-Priority6824

b9109: groundwork for an MTP + mmproj fix, with the full fix coming soon? It appears so

Summary:

spec : process images through the draft context. This directly addresses the mmproj + MTP crash. Previously, images (mmproj) couldn't be processed through the speculative/draft context at all; this commit adds that capability. That's the actual fix in progress.

server : fix mtmd draft processing — mtmd is the multimodal (mmproj) handler. Explicitly fixing draft processing for multimodal means they know about the crash and are targeting it.

spec : support parallel drafts — this is infrastructure for running multiple draft models simultaneously, which is required for MTP to work properly at scale with parallel slots.

The combination of all three in one build — multimodal draft fix, parallel draft support, and images through draft context — suggests this is a focused push to get MTP + mmproj working together. PR #22673 might not be far behind.

reddit.com
u/Bulky-Priority6824 — 3 days ago

NCCL-free tensor parallelism on dual Blackwell PCIe: llama.cpp b9095 released!

b9095 finally makes -sm tensor work on dual consumer Blackwell PCIe GPUs without NCCL.

If you're on dual Blackwell GPUs, this looks like it could be big.

I'll have my own results for 2x 5060 Ti ASAP.

u/Bulky-Priority6824 — 4 days ago

MSI B550-A Pro AM4 motherboard, purchased Dec 26, 2025; only one CPU ever installed. Just pulled it to swap to an X570, which I bought from here. Includes box and I/O shield in anti-static bag. $85 shipped

ADT-Link UT3G TB4 eGPU dock, 1 year old. Excellent condition; includes a Cable Matters Intel-certified cable. $75 shipped

Logitech G309 Lightspeed wireless mouse, like new in box, hardly ever used. $35 shipped (pending to u/ramenlewdle)

https://imgur.com/a/NltuIJr

u/Bulky-Priority6824 — 6 days ago

https://huggingface.co/unsloth/granite-4.1-3b-GGUF

These look like perfect compact, fast, tool-calling models designed to parse intent and invoke structured functions.

My intended use:

  • Tool/function dispatch — turn natural language into structured API calls
  • Intent routing — classify and direct requests to the right workflow or service
  • Structured data extraction — convert messy input into clean JSON output
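For anyone wondering what the tool dispatch part looks like in practice, here is a minimal sketch. It assumes the model has already emitted a JSON tool call shaped like {"name": ..., "arguments": {...}}. That shape, the two tool functions, and the TOOLS registry are all hypothetical and made up for illustration; none of it comes from the Granite model card.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical tools, invented for this example only.
def get_weather(city: str) -> Dict[str, Any]:
    return {"tool": "get_weather", "city": city}

def create_ticket(title: str, priority: str = "normal") -> Dict[str, Any]:
    return {"tool": "create_ticket", "title": title, "priority": priority}

# Registry mapping tool names to handlers (assumed layout, not a library API).
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "get_weather": get_weather,
    "create_ticket": create_ticket,
}

def dispatch(model_output: str) -> Dict[str, Any]:
    """Parse a JSON tool call emitted by the model and route it to a handler."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError as exc:
        return {"error": f"invalid JSON: {exc.msg}"}
    if not isinstance(call, dict):
        return {"error": "tool call must be a JSON object"}

    name = call.get("name")
    if not isinstance(name, str) or name not in TOOLS:
        return {"error": f"unknown tool: {name}"}

    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return {"error": "arguments must be a JSON object"}

    try:
        # Unpack the model-supplied arguments as keyword arguments.
        return TOOLS[name](**args)
    except TypeError as exc:
        # Raised when arguments are missing or unexpected for the handler.
        return {"error": f"bad arguments for {name}: {exc}"}
```

Calling dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}') routes to get_weather, while malformed JSON or an unknown tool name comes back as an error dict instead of raising, which is probably what you want when a small model occasionally produces bad output.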

https://preview.redd.it/0me0ia0awcyg1.png?width=424&format=png&auto=webp&s=ca421522c716fd7ab2715c425c422cbe7acf21f2

u/Bulky-Priority6824 — 14 days ago

Since about 0.17.1, my UI has been very laggy, especially when switching to the metrics pages, with constant freezing and locking up. Closing the browser usually fixes it.

The problem is consistent on mobile, PC, and Debian, using Chrome and Edge.

u/Bulky-Priority6824 — 16 days ago