
Release v1.6.0 - Joe Obbish is a Good Guy · erikdarlingdata/PerformanceStudio
Performance Studio 1.6.0 is released! Some visual polish, some bug fixes, and a whole lot of smart query plan analysis from Joe Obbish.

Sorry for the basic questions, but I thought it's better to be safe than sorry…
I have a data server that has had a few different versions of SQL Server on it over the years. I would like to tidy it up. We currently use SQL Server Express 2025 and SQL Server 2017 Standard (ver 14.0.21004 (x64)).
From ‘apps’ can I safely uninstall:
Microsoft SQL Server 2008 Management Objects
Microsoft SQL Server 2008 Native Client
Microsoft SQL Server 2012 Native Client
Microsoft SQL Server 2012 Transact-SQL ScriptDom
Microsoft SQL Server 2014 Management Objects
Microsoft SQL Server 2014 Management Objects (x64)
Microsoft SQL Server 2016 Management Objects
Microsoft System CLR Types for SQL 2014
Microsoft System CLR Types for SQL 2014 (there are 2 of these?)
Microsoft System CLR Types for SQL 2016
I think there are others, but these are the ones that stand out.
Obviously I'll leave them if they're needed, but I would like to sort it out.
Thanks.
A special 4.20 release of my SQL Server performance troubleshooting scripts!
Aside from bug fixes and performance improvements, you can now log sp_QuickieStore output to tables, and sp_HumanEventsBlockViewer will show you top blocking queries in the analysis results.
I have an old SQL Server Express installation that I would like to upgrade to the latest version.
Is this a straightforward process? By straightforward, I mean download the update and follow through the steps.
Obviously I have no real experience with SQL!
Good morning all,
For a couple of months now, I've been having this issue. I think it's related to .NET, but I can't be sure.
The error I see in our SSIS Script Tasks (using VB) is:
>The binary code for script is not found.
It advises opening the Script and clicking on "Build" to ensure the script can be compiled successfully, and this is successful, but the moment I close the VSTA editor and click "Ok", it tells me there are compilation errors. As a test to keep things simple, this is the script I'm using:
Public Sub Main()
    MsgBox("Hello world")
    Dts.TaskResult = ScriptResults.Success
End Sub
We are using Visual Studio 2019 Pro. This is for a test project targeting SQL Server 2022, but it happens for Projects targeting SQL Server 2019, SQL Server 2017 and SQL Server 2016. In this test project, the VB version is 2019.
I have tried
Some of the online resources mention this is an issue with incorrect References or migrating from an older Target Server, but this test project started with SQL Server 2022 as the target, and I don't believe the references can be the issue with this test script.
I'm perplexed by this issue, and hoping someone may have experienced something similar and found a solution.
Dear,
I can't authenticate after installing SQL Server 2025 on-site on my SCCM server. Even though the server name is correct, I'm getting the error:
Any insights, please?
Hi
For those of you working in the BFSI domain: what isolation levels have you seen or are you using?
Blog post is here:
So there are no surprises, it's a long blog post. Please read to the end (and not just because I wrote it). There's important info for folks that use GitHub Copilot in SSMS and I don't want it to get lost.
Thanks, and tell your friends 🙃
I’ve been working on Iridium SQL, an open database engine written in Rust.
The goal is to build a SQL Server-compatible engine and server that works well for application-facing use cases, while also supporting different runtime shapes. Right now the project includes:
One thing I’m trying to be careful about is compatibility claims: the target is SQL Server compatibility, but I’m not pretending it has full parity. I’m tracking behavior and compatibility explicitly instead of hand-waving it.
Repo: https://github.com/celsowm/iridium-sql
Crates: https://crates.io/crates/iridium_server
I’d really love feedback from Rust folks on the architecture, project direction, API/design choices, and anything that stands out as a good or bad idea.
I've been doing various experiments at work where our database environment is air-gapped in the sense that it has no internet visibility, so I can't use cloud-based AI providers even if it were permissible. I got my hands on an old engineering laptop with an NVIDIA Ada GPU with 8GB VRAM. I've been working on a few projects using smallish open source LLMs like gemma, mistral, and qwen in the 7b-9b parameter range because those will fit in my VRAM.
So far I've built a knowledge base system that crawls a network share and builds a text and vector store interface to search and navigate thousands of unorganized files. Another project I've tinkered with is Microsoft's new sql-mcp, which I was able to get working well enough to "chat with" a test SQL Server. I haven't tried much more than basic CRUD operations, but for the projects I've tried so far, the open source LLM was up to the task of calling the MCP tool and interpreting the results.
This week I finally got around to configuring Erik Darling's amazing Performance Monitor in my environment and got the included MCP server functional. It runs on a jump server that can see the SQL servers, and I use plink for a persistent port forwarding from my main laptop (which can't see the database servers). For now I'm using LMStudio to host the local LLMs and added the PM MCP server as a connector/tool. The local LLM is running on my main laptop, calling the PM MCP for queries about SQL Server performance.
It's early days for my testing, but I think it has a lot of potential. I have a set of python scripts that execute a suite of queries from very simple ("list monitored servers") to complex with some tricks mixed in to test the model's propensity to hallucinate. One big gap in the write-up below is that I spent a whole day running tests and checking them only to realize I had badly mis-configured the temperature setting for qwen3.5, leading to worse results than it is probably capable of.
The test suite for now loads a specific model and parameters (with or without thinking, varied system prompts, other tweaks that might affect performance) and runs a set of them against a model, recording the results. Since I can hit cloud AI providers from my main machine, I use Anthropic API calls to submit each sanitized result with context and a prompt asking it to "judge" the result. At this stage I'm just trying to figure out which models and configurations call tools correctly and interpret the results best.
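As a sketch of that judge step (the function names and the redaction scheme here are hypothetical illustrations, not the actual harness code): each result gets environment-specific names scrubbed before it leaves the airgapped side, then gets wrapped in a grading prompt for the cloud judge.

```python
def sanitize(transcript: str, secrets: list[str]) -> str:
    """Scrub environment-specific names (server names, share paths)
    from a transcript before it is sent to the cloud judge."""
    for i, secret in enumerate(secrets):
        transcript = transcript.replace(secret, f"SERVER_{i}")
    return transcript

def judge_prompt(test_id: str, question: str, transcript: str) -> str:
    """Wrap a sanitized result in a grading prompt for the judge model.
    The rubric wording is illustrative, matching the scores described
    in the write-up below (tool / param / analysis / hallucination)."""
    return (
        f"Test {test_id}. The model was asked: {question}\n"
        f"Transcript:\n{transcript}\n"
        "Grade tool selection (0-2), parameter use (0-1), analysis "
        "quality (1-3), and flag any fabricated numbers or server names."
    )
```

The sanitize step is what makes it acceptable to send airgapped-environment output through a commercial API: only placeholder server names ever leave the machine.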
Below is a write-up by Claude of the tests so far, mainly what works and what doesn't. I plan to put more time on it next week and fix the obvious problems in my testing setup and hopefully get some more meaningful results. The immediate takeaway is that these models are pretty decent at using the MCP tool correctly and giving useful analysis of what it returns. It's not suitable for interactive chat (thus the python test suite that I can leave running) because most of the calls take about 150-200 seconds to return an answer.
I'm happy to answer any questions or incorporate other tests. I don't claim to be an expert in any of this, other than I've been a DBA for most of my career and have been using ChatGPT and Claude for a couple years now for work and personal development projects. I've wanted to build an MCP server for SQL Server since I first learned about them, but the one from Microsoft seems well-suited for general database tasks and Erik's Performance Monitor with its MCP server seems like a game changer for monitoring.
-------------- Claude-written content below --------------------------
Built an automated test harness that runs 18 test cases against Erik Darling's PerformanceMonitor Lite MCP using local LLMs in LM Studio, with auto-grading plus a Claude-based judge. Config A (Qwen 3.5 9B, thinking ON) is fully graded: tool selection is mostly right, analysis quality is a healthy spread (5/16/13 across scores 1/2/3), and the judge caught every hallucination type the test plan predicted (7 across 5 cases). The big open question is whether TC-4.1's pathological 41-call/504-second loop is a real model weakness or a harness misconfiguration — two plausible confounds are in play: temperature=0.2 is below Qwen's trained range (1.0 recommended for thinking mode), and context_length=16384 is a quarter of the 64k the PM docs recommend. Context starvation is currently the leading hypothesis. Configs B/C/D haven't run yet; we're paused on configs.yaml edits pending a decision on whether to overwrite Config A or keep the 16k/0.2 baseline as a "poorly configured" reference point.
We're evaluating Erik Darling's PerformanceMonitor Lite MCP server (sql-monitor, port 5151) as a way to give local LLMs useful access to roughly nine SQL Servers in an airgapped test environment that replicates our production environment. The specific question is how reliably small-to-mid local models (starting with Qwen 3.5 9B, then Gemma 4 26B A4B) can pick the right tool, pass valid parameters, interpret the results, and avoid fabrication.
To answer that quantitatively, we built a Python test harness at H:\pm_mcp_harness\ that automates the 18 test cases from pm_mcp_test_plan.md across four model configs (A/B = Qwen thinking on/off, C/D = Gemma thinking on/off).
LM Studio runs on the laptop, the MCP server runs on a jump box, and a plink SSH tunnel forwards localhost:5151. An early and important discovery was that the lmstudio Python SDK's .act() method does NOT support MCP tools — the only working path is the REST /api/v1/chat endpoint with integrations: [{type:"plugin", id:"mcp/sql-monitor"}]. The harness uses the lms CLI as a subprocess for model load/unload, REST for inference, and Claude Sonnet 4.6 as a structured-output judge for analysis quality and hallucination scoring. A TCP preflight to port 5151 catches the common "tunnel died" failure before any model work begins.
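A minimal sketch of those two pieces, assuming the field names described above; the exact REST schema belongs to LM Studio, so treat the payload shape as illustrative rather than authoritative:

```python
import socket

def preflight(host: str = "127.0.0.1", port: int = 5151,
              timeout: float = 2.0) -> bool:
    """Return True if the tunneled MCP endpoint accepts a TCP
    connection. Catches the common 'plink tunnel died' failure
    before any model work begins."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def chat_payload(model: str, prompt: str,
                 temperature: float = 0.2) -> dict:
    """Build a body for LM Studio's REST /api/v1/chat endpoint.
    The `integrations` entry is what routes tool calls to the MCP
    plugin; the SDK's .act() path does not support this."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "integrations": [{"type": "plugin", "id": "mcp/sql-monitor"}],
    }
```

The preflight runs first in every harness session; if it fails, the run aborts immediately instead of burning minutes of model inference against a dead tunnel.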
Config A ran at temperature=0.2 with a 16k context window. Headline numbers: tool_score distribution 20/10/4 (of 2/1/0), analysis_score distribution 5/16/13 (of 1/2/3), and 7 hallucinations across 5 test cases. Every one of those hallucinations matched a failure mode we'd predicted in the test plan — the judge caught TC-4.2's fabricated query text (model only had a hash), TC-4.1's invented server name "SQL-B", and TC-1.8's fabricated "auto-shrink" explanation. TC-1.3 surfaced a separate finding: the wait-stats tool requires a server_name parameter that the model didn't supply, and the auto-grader couldn't see that the tool call had actually failed — only the LLM judge caught it.
The loop-detection logic (cap tool_score at 1 when a tool is called ≥4× consecutively or total calls ≥12) fired on four tests, most dramatically TC-4.1: 41 tool calls, 21 of them get_wait_stats in a row, 504 seconds of wall time. This looked like a classic tool-use thrash.
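The loop-detection rule can be expressed as a small pure function; this is a sketch of the rule as described above, not the harness's actual code:

```python
def cap_tool_score(raw_score: int, tool_calls: list[str],
                   max_run: int = 4, max_total: int = 12) -> int:
    """Cap tool_score at 1 when the transcript shows tool-use thrash:
    the same tool called max_run or more times consecutively, or
    max_total or more tool calls overall."""
    if len(tool_calls) >= max_total:
        return min(raw_score, 1)
    run = 1
    for prev, cur in zip(tool_calls, tool_calls[1:]):
        run = run + 1 if cur == prev else 1
        if run >= max_run:
            return min(raw_score, 1)
    return raw_score
```

Against the TC-4.1 transcript (21 consecutive `get_wait_stats` calls out of dozens total) this caps the score on both conditions at once; a clean single-call run like TC-1.1 passes through untouched.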
| Test | Scores (tool / param / analysis / halluc) | What happened |
|---|---|---|
| TC-1.1 — server discovery | 2 / 1 / 3 / N | Called list_servers once, accurately listed all 9 servers with correct names, descriptions, statuses, and collection timestamps. Added a useful one-line summary observation. Textbook clean run. |
| TC-4.3 — job risk assessment | 2 / 1 / 3 / N | Checked all servers, correctly reported no jobs exceeded p95 durations, and flagged a genuine analytical observation: SQL-C's nightly ETL job running 6s vs 64s average (9.4% of average) — an anomaly in the other direction, which is a real insight, not data readback. |
| TC-2.3 — blocking vs deadlocking | 1 / 1 / 3 / N | Conceptually distinguished blocking (live lock contention) from deadlocking (circular wait resolved by victim) correctly, queried both across all 9 servers, cited exact counts (all zero). Tool_score capped at 1 because it used get_blocked_process_reports instead of get_blocking, but the analysis was sound. |
| TC-3.2 — abstention (Python script) | 2 / 1 / 2 / N | Politely declined to write a CSV parser, redirected toward SQL Server help. No tools fired. Clean abstention. |
| Test | Scores (tool / param / analysis / halluc) | What happened |
|---|---|---|
| TC-4.1 — wait bottleneck recommendation | 2 / 1 / 2 / Y | The pathological case. 31 tool calls / 463s / get_wait_stats called 21 times. Fabricated "signal wait = 100%" for SOS_WORK_DISPATCHER (actual signal_wait_pct is 0%), and framed a benign idle-thread wait as CPU pressure. Also invented tool names in its recommendations (get_memory_grants, get_query_store_top). |
| TC-4.2 — query interpretation | 2 / 1 / 2 / Y | Fabricated specific metrics — "97,171 executions", "1,258,308 ms total CPU", "~13ms average" — that don't appear in the tool output. Also picked the wrong top query (switched from SQL-F's real top to SQL-D's app-db query without justification). |
| TC-1.8 — TempDB | 1 (loop) / 1 / 2 / Y | Called get_tempdb_trend 10 times across servers but reported only the last one (SQL-A) as "the overall status", omitting SQL-F's much more concerning ~23.7 GB top consumer. Invented "auto-shrink" as an explanation for a reserved-vs-consumed discrepancy with no basis in the data. |
| TC-3.3 — "should I add an index?" (abstention) | 0 / 1 / 1 / N | Half-passed: it did ask which server contains the Users table. But it also arbitrarily picked SQL-E and fired five unrequested tool calls against it. The "arbitrary investigation" failure mode the rubric specifically penalizes. |
The hallucinations cluster around analysis/interpretation tasks (TC-4.x) and anywhere a tool returned numeric detail the model then had to summarise. Clean abstentions (TC-1.3, TC-1.7, TC-3.2) and simple "read back the server list" cases stayed hallucination-free. This matches the test plan's predictions — TC-4.2 was flagged as the highest-risk case because the model only gets a hash, not query text, yet is asked to interpret what the query "does", and the judge caught that one.
Also worth calling out: TC-3.3 and TC-1.4 are tool-selection failures the LLM judge partially missed. The judge gave TC-1.4 an analysis_score of 3 despite the model using the wrong tools entirely — the response was internally coherent, just based on the wrong data source. That's a known gap where the auto-grader's tool_score=0 is the only signal.
While investigating TC-4.1, two plausible root causes emerged that Config A never controlled for:
Sampling parameters. Qwen's own model card recommends temperature=1.0 (thinking) or 0.7 (non-thinking), with top_p=0.95, top_k=20, and presence_penalty=1.5 specifically to prevent thinking loops. The harness payload only sends temperature, but LM Studio's UI fills in defaults for everything else — a screenshot confirmed LM Studio was actually applying top_p=0.95, top_k=20, and presence_penalty=1.5 already, so those match Qwen's recs. A temperature=0.0 determinism test confirmed the API's temperature value is honored over the UI setting (identical output across two runs). But LM Studio validates temperature ≤ 1.0, so we can't go above that via REST. Net: Config A ran at temp=0.2 — well below the model's trained range — but had the other three Qwen params incidentally correct.
Context window. configs.yaml is set to context_length: 16384, but the PerformanceMonitor docs recommend 64k. TC-4.1's 41-call transcript almost certainly blew past 16k, at which point earlier tool results got evicted and the model may have been genuinely confused rather than simply stuck in a loop. This is now our leading hypothesis for the TC-4.1 behavior — context starvation, not sampling.
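A back-of-the-envelope check supports the context-starvation hypothesis. The per-result token size and prompt overhead below are assumed purely for illustration, not measured from the harness:

```python
# Rough context-budget arithmetic for the two window sizes.
TOKENS_PER_TOOL_RESULT = 500  # assumed average size of one tool result
PROMPT_OVERHEAD = 2000        # assumed system prompt + question tokens

def calls_before_eviction(context_length: int) -> int:
    """Approximate how many tool-call round trips fit in the window
    before the earliest tool results start getting evicted."""
    return (context_length - PROMPT_OVERHEAD) // TOKENS_PER_TOOL_RESULT

small = calls_before_eviction(16384)   # 16k window, as configured
large = calls_before_eviction(65536)   # 64k window, as the PM docs recommend
```

Under those assumptions, roughly 28 round trips fit in a 16k window, so a 41-call transcript would have evicted its earliest results well before finishing, while a 64k window leaves ample headroom.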
Config A (temp=0.2, 16k) is fully graded and the judge is well-calibrated (rubric has been tightened to not flag "invented tools" based on truncated results, and to explicitly penalize arbitrary tool use on abstention cases). Configs B/C/D have not been run.
Pending decisions in configs.yaml: bump context to 65536, decide whether to keep temperature at 0.2 for reproducibility or raise to Qwen-recommended (1.0 thinking / 0.7 non-thinking) for quality, and decide whether to overwrite Config A or preserve the 16k/0.2 baseline as a separate "what happens when under-configured" data point.
SQL licensing is the bane of my existence right now. I suspect I'm not alone. I would appreciate it if someone could clarify what licensing model is needed for the SQL Server 2 in this image. I find Microsoft's multiplexing document to be confusing on this type of setup. SQL Server 1 is per-core licensed so the public facing input form and public facing dashboard viewers are covered. I feel like SQL Server 2 should be good with just a Server + CAL license but I'm second guessing that because of the nightly automated export from the source database on SQL Server 2.
Thank you in advance for your input and help! Cheers!
I have SQL Server running inside VM A. The application is hosted in that same VM; it's used by users for data entry, and basically all CRUD operations happen there.
The issue is that SQL Server takes all the RAM available in the VM; restarting the service releases the memory, and most of the time tempdb is also full. The same application is hosted on another VM, B, which sees larger data sets than VM A, and on VM B I don't have this kind of issue. Overall, the DB structure (tables) is the same; views and procedures change per each client's requirements. I manually checked that almost all the indexes are also the same, as the application holds the logic to create indexes on demand (so that other clients or VMs hosting the same application can have the same indexes).
What could be the reason for such high RAM use, and the CPU being throttled all the time, on VM A and not on VM B?
Hi all,
I’m a recent CS grad in my first DBA role, trying to sanity-check whether my workload is normal or whether I’m already in a hybrid/overloaded position, and seeking advice on how to approach the cognitive/responsibility overload. I have a performance review coming up very soon and want to approach it correctly.
Context
Tech stack
Current responsibilities
Production DBA / Operations (current daily responsibilities)
Daily Tasks
BI / Data Warehouse (current responsibilities + emerging)
ETL Layer (mostly current)
Storage Layer (emerging)
Reporting Layer (emerging)
^ maintain systems
Solutions Engineer Layer (current)
^ improve the systems
Development DBA (emerging responsibility)
Support / Misc tasks (current responsibility)
Documentation tasks
Main concerns
What I’m trying to understand
Goal
I want to do well, learn the systems, and consistently apply my skills in a way that’s sustainable, even though the company and the data we work with are large; I'm not trying to avoid work. The systems are very inefficient, albeit working, and I’m still learning a lot. I see the potential to improve most of what we do, but I'm pulled in so many directions that it’s hard to stay on track: developing new improvements, learning the systems, learning what is and isn't working with no documentation, tools, or DBA team, and still working in a timely and efficient manner, all with no separation from work Monday through Saturday. But I’m trying to figure out if I’m:
Any advice from experienced DBAs would be appreciated.
Thanks in advance.
Hi, sorry to ask a question. Is there any website where I can do a mock interview for a senior-level MSSQL DBA role and get feedback? I know one could say colleagues, but I don't want to involve them. I've searched the net, but most of what I found relates to SQL development rather than DBA administration. So if anybody has an idea or a link, please do share.
Hey friends, happy Friday!
This week's Friday Feedback is for GitHub Copilot users (or potential users), and there are so many questions (5) it required a feedback item 😂
I would love to get your input on Agent Mode and the execution of queries. Please head over to the feedback site and let me know what you think:
https://developercommunity.visualstudio.com/t/Agent-Mode-execution-for-GitHub-Copilot-/11076972
Thanks in advance, have a great weekend!
We have upgraded/moved our server from SQL 2016 to SQL 2022. I set up the new report server and can access all the reports and refresh them manually, but the subscriptions are broken. They exist in SQL (and all the jobs were created on the new SQL Server), but when I log into the SSRS site, it does not show any subscriptions.
I found a post that said that switching the Report Server Service Account from Network Service to Virtual Service could break subscriptions. The old server looks like it was Network Service. But I could not get the new SSRS to work unless I used Virtual Service.
Does anyone know what I might need to do to either get subscriptions working on Virtual Service or what might be my issue with the SSRS site not working on Network Service?
EDIT: I figured it out. Apparently our IT dept bought Standard Edition SQL Server instead of Enterprise, and our subscriptions are all data-driven, which is only available in Enterprise Edition. :( Working on figuring out how to upgrade our license to Enterprise.
Recently I came across some open source projects using AI agents to speed up development on SQL Server. I tried one on my local machine, and it looks very promising as well. As someone who is old-school technical, I can see at a high level that it's cool technology, but I still don't completely understand how it could be entrusted not to go haywire and do things it's not supposed to do, especially when every now and then we see news saying an AI system deleted an entire database or did something really unexpected. I would love to hear what the community thinks, especially if someone is using AI for production workloads.
The 17.0.900.7 RC0 build has expired. I downloaded the new 17.0.1000.7 installation file and chose Edition Upgrade. Even though it shows as successful, the SQL instance is still 17.0.900.7. How can I fix this? It's on an Azure Windows Server 2025 machine. Thanks.