r/databricks
Data Quality on Databricks design
Hey, I am deciding between DQX and Deequ for data quality on Databricks, or possibly using both. I think Deequ is amazing because of AnomalyCheck, which lets us compare metrics batch to batch and keep the data flow consistent over time, something that is very underappreciated, while DQX is amazing at row-level detection. How did you design your data quality on Databricks?
I was thinking of using DQX for in-transit data quality checks that act as hard fails, and Deequ's AnomalyCheck for observation, dashboards, and notifications.
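To make the split concrete, here is a rough sketch in plain Python. It does not use the actual DQX or Deequ APIs; the function names, the sample rows, and the 50% threshold are all made up for illustration. The first function stands in for the row-level hard-fail gate, the second for a batch-to-batch check on a metric such as row count.

```python
def split_valid_invalid(rows, required_columns):
    """Row-level gate (hypothetical helper): rows missing any required
    column are quarantined instead of flowing downstream."""
    valid, invalid = [], []
    for row in rows:
        if all(row.get(col) is not None for col in required_columns):
            valid.append(row)
        else:
            invalid.append(row)
    return valid, invalid


def is_anomalous(previous_value, current_value, max_relative_change=0.5):
    """Batch-to-batch check (hypothetical helper): flag the current batch
    when a metric moved more than max_relative_change versus the last batch."""
    if previous_value == 0:
        # No baseline to divide by; treat any non-zero value as a change.
        return current_value != 0
    relative_change = abs(current_value - previous_value) / abs(previous_value)
    return relative_change > max_relative_change


# Illustrative batch: the second row is missing its id.
batch = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 3, "amount": 7.5},
]

valid_rows, quarantined_rows = split_valid_invalid(batch, ["id", "amount"])

# Compare this batch's row count with an assumed previous batch of 1000 rows.
row_count_alert = is_anomalous(previous_value=1000, current_value=len(valid_rows))
```

The idea is that the first check blocks or quarantines bad rows in the pipeline itself, while the second only raises a signal for a dashboard or notification and never stops the job.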
u/ptab0211 — 16 hours ago