What is data quality, anyway?
Published date January 27, 2025
Surveys support critical decisions in consulting, private equity, and beyond. But what happens when the data is flawed — and no one even knows about it? Our new docuseries, Data Quality Simplified, reveals everything that’s wrong with the survey industry (hint: it’s not just fraud), and what you can do about it.
Executive summary.
Survey data quality means collecting reliable, authentic responses from people who are honest, attentive, and engaged. In online survey research, bad data is not caused only by fraud; it also comes from poor sample sources, weak respondent incentives, painful survey experiences, and overreliance on data cleaning after fieldwork. This episode of Potloc’s short docuseries explains why consulting and private equity teams should treat data quality as an end-to-end process: start with better respondent sourcing, design surveys people can realistically complete, compensate respondents fairly, optimize for mobile, and use robust quality controls to remove low-quality responses. The takeaway is that data cleaning can reduce risk, but trustworthy survey insights depend on attracting the right respondents and protecting response quality from the start.
The data quality problem in the survey industry
Whether you know it or not, the results of surveys directly influence your daily life: they help decide whether a park or a school is built in your neighborhood, how your favorite juice bottle is designed, and whether your bold business idea sees the light of day.
When done well, surveys help inform and validate decisions in both the public and private sectors.
But what happens when the data is bad?
IBM estimated that poor data quality cost the U.S. $3.1 trillion in 2016, a figure that is likely higher today.
Data quality is what makes or breaks a survey, and if you’ve ever worked with survey providers, you’ve probably heard the term over and over — and over again.
But what does it really mean, and how can we achieve it? Let’s simplify it.
Let’s compare it to something all of us can relate to: food.
Think about the fast food industry: it’s quick, affordable, and nicely packaged, but not the most nutritious.
Most survey providers try to clean the data just before serving it to clients, but by then, it’s often too late. Rinsing and cooking ingredients full of antibiotics or pesticides can reduce the risk, but won’t increase the quality.
A good meal should start at the source — with nutritious ingredients — and end with strict quality assurance. The same goes for surveys.
History of online sampling: How did we get here?
Let’s back up for a moment to understand how we got here. Surveys as we know them today are an ancient idea with a modern twist. Their roots go back to ancient Rome, where emperors used censuses to gather data about their citizens. Fast-forward to the 1800s, and researchers began using structured methods to test intelligence and map social conditions.
In the 1930s, George Gallup turned surveys into a science with public opinion polling. By the 1970s, widespread household telephones made respondents more accessible.
Then came the internet. By the early 2000s, online surveys were widespread and of decent quality, with online panels that took great care in incentivizing participants.
If you’ve ever taken an online survey, you can relate.
JD Deitch calls this the “Enshittification of programmatic sampling,” where the only people motivated to endure painful surveys are those who want the payout. But if the average person is not bothering with them, who is?
So, is there hope?
Data quality ultimately hinges on two factors: removing low-quality respondents through robust quality controls, and attracting high-quality respondents by improving their survey experience. Decisions about fair compensation, survey length, and mobile optimization play a huge role.
Sampling providers and buyers must understand that they have control over these decisions and that, together, they can create a more transparent and reliable survey ecosystem.
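To make the first of those two factors more concrete, here is a minimal Python sketch of what a basic quality control can look like. It flags two patterns often associated with inattentive respondents: "speeders," who finish far faster than the typical respondent, and "straight-liners," who give the same answer to every item in a rating grid. The `Response` fields, the function name, and both thresholds are illustrative assumptions, not Potloc's actual method or an industry standard.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Response:
    # Hypothetical fields, chosen only for this illustration.
    respondent_id: str
    seconds_to_complete: float
    grid_answers: list[int]  # answers to one rating grid, e.g. a 1-5 scale


def flag_low_quality(responses, speed_ratio=0.4, min_grid_items=5):
    """Return {respondent_id: [reasons]} for responses that look suspect.

    speed_ratio and min_grid_items are illustrative thresholds only.
    """
    typical = median(r.seconds_to_complete for r in responses)
    flags = {}
    for r in responses:
        reasons = []
        # Speeder: finished far faster than the typical respondent.
        if r.seconds_to_complete < speed_ratio * typical:
            reasons.append("speeder")
        # Straight-liner: same answer for every item in a long grid.
        if len(r.grid_answers) >= min_grid_items and len(set(r.grid_answers)) == 1:
            reasons.append("straight-liner")
        if reasons:
            flags[r.respondent_id] = reasons
    return flags


sample = [
    Response("a", 600, [4, 2, 5, 3, 4]),
    Response("b", 540, [3, 3, 4, 2, 5]),
    Response("c", 90, [3, 3, 3, 3, 3]),
    Response("d", 620, [5, 5, 5, 5, 5]),
]
print(flag_low_quality(sample))
# {'c': ['speeder', 'straight-liner'], 'd': ['straight-liner']}
```

Checks like these only remove responses after the fact. They do nothing about the second factor, which is attracting attentive respondents in the first place.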
In the next episodes of the Data Quality Simplified Series, we’ll dive deeper into these factors, uncover behind-the-scenes stories, showcase the latest research, and hear from experts on what it will take to truly move the industry forward.
- Research on research by Potloc: Why data cleaning doesn’t fix a risky source.
- Research on research by Potloc: Survey design matters more than you think.
- Research on research by Potloc: Meet the super-respondents distorting your insights.
- Forbes’ write-up on the IBM study that estimated the cost of bad data to be U.S. $3.1 trillion in 2016.