The degree to which data is accurate, complete, consistent, timely, and fit for its intended use. Poor data quality undermines every downstream process from personalization to attribution to AI model training.
Data quality is not a binary state. Data is not “good” or “bad.” It is accurate or inaccurate. Complete or incomplete. Current or stale. Consistent across systems or contradictory. The question is always whether the data is fit for the specific use case at hand. A phone number with a missing area code might be fine for identity matching but useless for an outbound sales call.
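A minimal sketch of the fitness-for-use idea, using a hypothetical record and made-up validation rules (the field names and the 10-digit check are illustrative, not a standard):

```python
import re

# Hypothetical customer record whose phone number is missing its area code.
record = {"email": "ana@example.com", "phone": "555-0142"}

def fit_for_identity_matching(rec: dict) -> bool:
    # For matching, any stable identifier (here, an email) is enough;
    # a partial phone number does not block this use case.
    return bool(rec.get("email"))

def fit_for_outbound_call(rec: dict) -> bool:
    # For an outbound call, the number must be complete:
    # area code plus local number, 10 digits in total.
    digits = re.sub(r"\D", "", rec.get("phone", ""))
    return len(digits) == 10

print(fit_for_identity_matching(record))  # True  - good enough to match on
print(fit_for_outbound_call(record))      # False - not good enough to dial
```

The same record passes one check and fails the other, which is the point: "quality" only has meaning relative to a use case.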
Downstream systems inherit your inputs
Every system downstream inherits the quality of its inputs. A personalization engine fed stale purchase data recommends products the customer already owns. An attribution model built on duplicate records double-counts conversions. An AI model trained on inconsistent labels produces confident but wrong predictions.
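As a toy illustration of the double-counting problem, with made-up IDs: two systems create separate records for the same conversion, and a naive count inflates the result until the events are deduplicated on a stable key.

```python
# Hypothetical conversion events; the same order appears twice because
# two systems each created a record for it.
conversions = [
    {"customer_id": "C-1001", "order_id": "O-1"},
    {"customer_id": "C-1001", "order_id": "O-1"},  # duplicate record
    {"customer_id": "C-2002", "order_id": "O-2"},
]

naive_count = len(conversions)  # 3 - double-counts the duplicated order
deduped_count = len({(c["customer_id"], c["order_id"]) for c in conversions})  # 2

print(naive_count, deduped_count)
```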
The compounding effect is what makes poor data quality dangerous. A single bad record is a rounding error. Thousands of bad records embedded across a stack create systemic failure that shows up in campaign performance, reporting credibility, and customer experience but rarely gets traced back to the root cause.
Quality is an input discipline
The first mistake is treating data quality as a cleanup project. Teams run a deduplication exercise, fix the records, and move on. Six months later, the same problems are back because nothing changed at the point of collection. Quality is a discipline at the input layer, not a periodic fix at the output layer.
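A minimal sketch of what an input-layer check might look like, assuming a hypothetical lead-capture record and made-up required fields (the field names and rules here are illustrative):

```python
# Hypothetical required fields for a lead-capture form. Validating here,
# at the point of collection, keeps the bad record from ever entering
# downstream systems, instead of patching it in a cleanup project later.
REQUIRED_FIELDS = {"email", "country", "consent"}

def validate_at_collection(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is accepted."""
    problems = [f"missing {field}" for field in sorted(REQUIRED_FIELDS - record.keys())]
    if "email" in record and "@" not in record["email"]:
        problems.append("email is malformed")
    return problems

print(validate_at_collection({"email": "ana@example.com", "country": "DE", "consent": True}))
# []
print(validate_at_collection({"email": "ana.example.com"}))
# ['missing consent', 'missing country', 'email is malformed']
```

The specific rules matter less than where they run: at the point of collection, every time, rather than as a periodic sweep over records that have already spread through the stack.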
The second mistake is assuming data quality is IT’s problem. Marketing generates and consumes more customer data than any other function. If marketing does not define what “good” looks like for its use cases, IT will define it based on infrastructure requirements, which are not the same thing.