Looking for feedback: lightweight Python library for ML model diagnostics
Hello everyone,
After training models I kept running into the same problem: the metrics looked fine, but I did not really know whether the model was behaving correctly.
So I built a small Python library called diagnost.
What My Project Does
diagnost is a lightweight library for diagnosing trained ML models.
In one call it can evaluate performance, check calibration, detect drift, assess subgroup performance, and flag dataset issues like missing values, correlations, and outliers.
Example:
import diagnost

# one call runs performance, calibration, drift, subgroup, and data-quality checks
report = diagnost.evaluate(model, X_test, y_test, task="classification")
report.summary()
It can also compare models and export results to JSON.
Target Audience
Mainly data scientists and ML practitioners. Right now it is a lightweight tool for notebooks and experimentation rather than something production-ready.
Comparison
Most libraries focus on training models or give you raw metrics.
diagnost focuses on post-training checks and tries to give clear, structured diagnostics in one place, adding things like calibration checks, drift detection, and subgroup analysis without much setup.
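For context, the calibration check people often script by hand is a binned expected calibration error. Here is a plain-Python sketch of that manual version (this is not diagnost's API, and the 10-bin scheme is an arbitrary choice):

```python
# Sketch of a manual calibration check: binned expected calibration error (ECE).
# Compares mean predicted probability to observed positive rate per bin.
# Hand-rolled illustration, independent of diagnost.

def expected_calibration_error(probs, labels, n_bins=10):
    """Bin predicted probabilities and average |confidence - accuracy|,
    weighted by bin size."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        bins[idx].append((p, y))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        conf = sum(p for p, _ in b) / len(b)  # mean predicted probability
        acc = sum(y for _, y in b) / len(b)   # observed positive rate
        ece += (len(b) / len(probs)) * abs(conf - acc)
    return ece

# Toy example: each prediction lands in its own bin, ECE = 0.15
print(expected_calibration_error([0.1, 0.9, 0.8, 0.2], [0, 1, 1, 0]))
```

A tool can fold this kind of check into a single report call instead of a per-project script.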
If you would like to contribute or have ideas, please get in touch.
PyPI: https://pypi.org/project/diagnost/
GitHub: https://github.com/Eklavya20/diagnost
It is still early, so I would really appreciate any feedback, especially which checks you usually run manually.
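As one example of the kind of manual check I mean, a common hand-rolled drift test is the population stability index between a reference sample and new data. A plain-Python sketch (again not diagnost's API; the binning and epsilon are arbitrary choices):

```python
# Sketch of a manual drift check: population stability index (PSI).
# Bins both samples on the reference range and sums (p_a - p_e) * ln(p_a / p_e).
# Hand-rolled illustration, independent of diagnost.
import math

def psi(expected, actual, n_bins=10):
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant reference sample

    def bin_fractions(sample):
        counts = [0] * n_bins
        for x in sample:
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[max(idx, 0)] += 1  # clamp values outside the reference range
        # small epsilon avoids log(0) for empty bins
        return [(c + 1e-6) / (len(sample) + 1e-6 * n_bins) for c in counts]

    pe, pa = bin_fractions(expected), bin_fractions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(pe, pa))

reference = [i / 100 for i in range(100)]
print(psi(reference, reference))                         # identical data: PSI is 0
print(psi(reference, [0.5 + i / 200 for i in range(100)]))  # shifted data: PSI is large
```

In practice people re-copy snippets like this between projects, which is the workflow a one-call diagnostics report is meant to replace.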