u/Na_S04

Every ML project I've worked on had the same boilerplate CI: MLflow wiring, data validation, metric checks, model registration. By the fifth project or so, I could no longer remember which config I had fixed the MLFLOW_RUN_ID passing bug in.

So I built a GitLab CI/CD component that cuts this down to about 10 lines:

```yaml
include:
  - component: gitlab.com/netOpyr/gitlab-mlops-component/full-pipeline@1.0.0
    inputs:
      model_name: wine-classifier
      training_script: scripts/train.py
      data_path: data/train.csv
      framework: sklearn
      metric_name: accuracy
      min_threshold: '0.85'
```

Which gives you a full 4-stage pipeline:

validate → train → evaluate → register

validate: schema, nulls, Evidently drift, Great Expectations
train: MLflow autologging (sklearn/PyTorch/TF/XGBoost/LightGBM), GPU support
evaluate: threshold check + optional comparison vs production model
register: GitLab Model Registry, only runs if eval passed
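
To make the evaluate gate concrete, here is a rough Python sketch of the kind of check that stage describes: a threshold test plus an optional comparison against the production model. This is not the component's actual code. The function name `passes_gate`, its parameters, and the `higher_is_better` flag are all hypothetical, and the exact comparison rule (a candidate that ties with production still passes) is an assumption.

```python
from typing import Optional


def passes_gate(
    metric: float,
    min_threshold: str,
    production_metric: Optional[float] = None,
    higher_is_better: bool = True,
) -> bool:
    """Return True if the candidate model may proceed to registration.

    Hypothetical helper: names and rules are illustrative assumptions,
    not the component's real implementation.
    """
    # CI inputs arrive as strings (e.g. '0.85'), so convert before comparing.
    threshold = float(min_threshold)

    if not higher_is_better:
        # Flip signs so "larger is better" also holds for loss-style metrics.
        metric, threshold = -metric, -threshold
        if production_metric is not None:
            production_metric = -production_metric

    # Absolute gate: the metric must reach the configured threshold.
    if metric < threshold:
        return False

    # Optional relative gate: the candidate must not be worse than production.
    if production_metric is not None and metric < production_metric:
        return False

    return True


if __name__ == "__main__":
    print(passes_gate(0.91, "0.85"))                          # threshold only
    print(passes_gate(0.91, "0.85", production_metric=0.93))  # with comparison
```

In a CI job, a script like this would typically exit non-zero when the gate fails, so that a later stage such as register is skipped.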

Works on GitLab Free. DVC integration and parallel multi-model training are also supported.

Published in the GitLab CI/CD Catalog: https://gitlab.com/netOpyr/gitlab-mlops-component

Happy to answer questions, especially about the evaluate stage: compare_with_production was the trickiest part to get right.

I got tired of copy-pasting ML pipeline YAML across projects, so I built a reusable GitLab CI/CD component