Data Quality

The Five Ways Dirty Data Costs Businesses Money


Dirty data costs U.S. companies anywhere from $2.5 trillion to $3.1 trillion each year. Errors and omissions in master data in particular are notorious for causing costly business interruptions. And there’s no insurance you can buy against dirty data. It’s a simple fact of life that many businesses grudgingly live with but barely acknowledge. Our goal in this piece is to help you understand where dirty data causes profit leakage in a business, how to recognize it, and a little about what you can do about it.

Here are five ways dirty data could be costing your business money.

1. Wrong Conclusions and Time Wasted

Stop me if you’ve heard this one before:

An analyst goes into a meeting with their first bright, shiny new dashboard from the new multimillion-dollar data warehouse.

A few minutes in, one executive starts to notice an anomaly in the data being presented. Something doesn’t add up, so they pull up their system and check it. Yes, definitely a mismatch.

Smelling blood in the water, other employees start to pile on until the poor analyst is battered and beaten, all for just doing their job.

This scenario plays out every day in companies across the U.S. when even slightly dirty data is unknowingly used for analytics. The way most businesses detect the problem is to run right smack into it.

The disastrous meeting is a waste of time in itself, and afterward the BI team might spend months debating their findings with a disbelieving business audience. The net result: lots of wasted time, incorrect conclusions from analysis, and eventually a business where nobody really trusts the data.

2. Operational Interruption

Dirty data and operational mistakes go hand in hand to cost businesses trillions every year.

The address was wrong, so the package wasn’t delivered.
The payment was applied to the wrong duplicate account.
The callback never reached the client because the number was wrong.

On the bright side, operational errors due to bad data tend to get addressed first, and often, because they’re so visible. They are the squeaky wheel of the organization, and for good reason.

If you’re trying to improve operational efficiency, make sure you start with data that is as clean as possible. And protect your data to keep it clean. Don’t import records into your CRM until you’ve checked them. Your operation is a clean, pristine lake with a healthy ecosystem. Dirty data is toxic pollution that will disrupt that natural harmony.
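The advice about checking records before they reach your CRM can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (name, email, phone), the simple email pattern, and the 10-or-11-digit phone rule are hypothetical stand-ins, not rules from any particular CRM.

```python
import re

# Hypothetical field names; adapt these to your own CRM's schema.
REQUIRED_FIELDS = ("name", "email", "phone")

# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def find_problems(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            problems.append(f"missing {field}")

    email = str(record.get("email") or "").strip()
    if email and not EMAIL_RE.match(email):
        problems.append("malformed email")

    # Assumption: U.S.-style numbers, so 10 digits or 11 with a country code.
    digits = re.sub(r"\D", "", str(record.get("phone") or ""))
    if digits and len(digits) not in (10, 11):
        problems.append("phone is not 10 or 11 digits")
    return problems


def split_clean_and_rejected(records):
    """Separate records that are safe to import from those needing review."""
    clean, rejected = [], []
    for record in records:
        problems = find_problems(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected
```

Only the clean list would go on to the import step. The rejected list keeps each record alongside the reasons it failed, so someone can fix it at the source instead of letting it pollute the lake.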

3. Missed Opportunities

In our opinion, this one is the costliest of all by far, but it flies below the radar because it’s rooted in opportunity cost. It really deserves far more attention.

When a company lapses into, and accepts, a culture of “We have dirty data,” lots of great ideas for new initiatives never get off the ground, which results in billions of dollars in missed commercial opportunity every year.

New business ideas and innovations for current practices are constantly shot down because “We don’t have that data.” Or “We have that data, but it’s only 50% accurate.” Even worse, sometimes these innovative new ventures proceed into the incubator stage with high startup costs, only to explode on the launchpad because the data can’t be trusted or turns out to be dirty.

4. Poor Customer Experience

Every executive will admit customers are the #1 priority. Customers are also, of course, real people. But to your front-line sales and service reps, the ones actually interacting with customers by phone and email, the crucial link is the data your company holds about that customer.

Think about it: the outcome of every service call, product order, and subscription purchase is based in large part on the data your company has on its customers. If that data is inconsistent across customers, or just downright dirty and inaccurate, bad things start to happen. Products ship out to the wrong address. The wrong products are recommended. Returns go to the wrong place. Sales calls go to old, disconnected numbers. Inaccurate bills go out, and payments are applied incorrectly.

If your business is one with multiple departments and business lines, clients can start to feel pretty underappreciated when one department knows their birthday and children’s names, and another can barely look up their account number by last name.

5. Time Wasted Cleaning It Up

Cleaning up dirty data is the first step in eradicating it. But it is a terribly time-consuming process, and often a very manual one. Cleaning dirty data that’s been neglected for years can itself take years, and it is tedious and costly. Appending and fixing contact information can cost as much as one dollar per record. The average cost to clean up one duplicate record ranges from $20 to $100 with everything factored in. Cleaning up thousands of duplicates and incomplete rows must be done carefully to avoid compounding the errors.

The cleanup of dirty data starts with a good understanding of the root causes, and it takes time to forensically analyze what went wrong and when. Cleaning up dirty data is one step in a larger process, but a botched cleanup has the potential to wreck everything and force you into a reset. Worse, there’s the very real possibility that issues go undetected and somehow end up in a final work product. Rest assured that someone will read that work product and see what’s wrong with it.
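To see how quickly those per-record figures add up, here is a back-of-the-envelope sketch. The function name and the example counts are hypothetical; the only numbers taken from this article are the $20 to $100 range per duplicate and the ceiling of one dollar per appended record. Because only an upper figure is given for appends, the low estimate assumes they cost nothing.

```python
# Per-record figures quoted in the article, in whole dollars.
DUPLICATE_COST_LOW = 20
DUPLICATE_COST_HIGH = 100
APPEND_COST_HIGH = 1  # "as much as one dollar per record"


def estimate_cleanup_cost(duplicate_count, append_count=0):
    """Return a (low, high) dollar estimate for a cleanup project.

    Assumption: the low estimate ignores append costs, since the
    article only gives an upper bound for them.
    """
    low = duplicate_count * DUPLICATE_COST_LOW
    high = duplicate_count * DUPLICATE_COST_HIGH + append_count * APPEND_COST_HIGH
    return low, high


# Hypothetical project: 5,000 duplicates and 20,000 records needing appends.
low, high = estimate_cleanup_cost(5_000, append_count=20_000)
print(f"Estimated cleanup cost: ${low:,} to ${high:,}")
```

Even a modest database lands in six figures on these assumptions, which is why the order in which problems get fixed matters so much.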

Often what’s wrong with the data is not fully understood and some cleaning efforts actually make it worse. (Ever have a sort error on a table column get loaded back into production? Fun times.)

It’s best to position the cleaning of data early in your larger set of processes. This means planning out how data will be processed and understanding what can’t be properly digested by downstream databases, analytics packages and dashboards. While some problems, such as issues with minor punctuation marks, can be handled in post-processing, you shouldn’t assume this will happen. Always plan to clean data as early as possible.

Luckily, we are seeing new strides in Artificial Intelligence that make this process easier and reduce the time from years down to days and weeks.

Automated Data Profiling (https://qastaging.wpengine.com/products-data-analysis-tools/) can shave months off the “finding out what’s wrong” phase of a data cleanup, giving a statistical readout of each issue by category so the problems can be prioritized and addressed in the right order.
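For a feel of what a statistical readout by category might look like, here is a minimal hand-rolled sketch in Python. It is not how any particular profiling product works. The issue categories (blank values and repeated values) and the function names are assumptions chosen for illustration.

```python
from collections import Counter


def _is_blank(value):
    """Treat None and whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""


def profile(rows, columns):
    """Count simple issues per column across a list of dict records."""
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        blanks = sum(1 for v in values if _is_blank(v))
        # Compare values case-insensitively and ignoring outer whitespace.
        counts = Counter(str(v).strip().lower() for v in values if not _is_blank(v))
        repeated = sum(c - 1 for c in counts.values() if c > 1)
        report[col] = {
            "rows": len(values),
            "blank": blanks,
            "distinct": len(counts),
            "repeated": repeated,
        }
    return report


def worst_first(report):
    """List column names ordered by blank count, highest first."""
    return sorted(report, key=lambda col: report[col]["blank"], reverse=True)
```

Running profile over a sample export and reading the columns in worst_first order gives a rough priority list. A real profiling tool goes much further, but the principle of measuring first and fixing second is the same.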

Automated Data Enrichment (https://qastaging.wpengine.com/products-data-analysis-tools/data-enrichment/) and data append help with deduplication and merging of duplicate records.
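As a rough picture of what merging duplicate records involves, the sketch below groups records by a normalized email address and fills blank fields in the first record from its later duplicates. It is a simplified assumption, not a description of any product: real deduplication usually needs fuzzy matching on names and addresses, and the email match key and the helper names here are hypothetical.

```python
def _is_blank(value):
    """Treat None and whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""


def normalize_key(value):
    """Lowercase and trim a value so trivially different keys match."""
    return "" if value is None else str(value).strip().lower()


def merge_duplicates(records, key="email"):
    """Merge records sharing a normalized key, filling blanks only.

    Existing non-blank values are never overwritten. Records with no
    key cannot be matched, so they are passed through untouched.
    """
    merged = {}
    unmatched = []
    for record in records:
        k = normalize_key(record.get(key))
        if not k:
            unmatched.append(dict(record))
            continue
        if k not in merged:
            merged[k] = dict(record)
            continue
        survivor = merged[k]
        for field, value in record.items():
            if _is_blank(survivor.get(field)) and not _is_blank(value):
                survivor[field] = value
    return list(merged.values()) + unmatched
```

Filling blanks without overwriting is the cautious choice. It cannot destroy data that was already there, which matters given how easily a cleanup can make things worse.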

Finally, Automated Data Modeling (https://qastaging.wpengine.com/products-data-analysis-tools/aipowered-data-modeling-augmented-data-management/) helps to round out the view of key entities, resulting in a more consistent customer experience, for example.