This article briefly recaps the value of quality data in the supply chain and the challenges of today’s automated mass sharing of information with internal and external customers. It addresses how we got here and what we can do about it. Four critical success factors, 1) Scope, 2) Team, 3) Process and 4) Technology, provide the stepping stones to cleaning your critical data successfully.
Executive Summary of the issue: Think of data cleansing as system immunization.
Competition for business has fuelled system complexity. We are clearly in the information age: more data is being sent around the world today than ever before, and this is only going to grow exponentially. To have a competitive edge today, you have to be able to share clean, accurate data quickly. It is not enough just to have a good product. Your customers have to recognize the right product, color and size in an internet catalogue; the system has to process the correct item on the order and send the information to distribution systems; and distribution has to pick and ship the right item. All of this depends on each system being able to share and communicate the same product item data.
The pace at which companies are forced to operate and compete globally has taxed existing systems and increased their inefficiencies. Mergers and acquisitions have introduced new ERP implementations, forced the incorporation of legacy systems, merged processes, combined products, and consolidated customers. Technology has moved forward, but so has the creativity of the people forcing data requirements into fields and applications to meet their business needs, whether to keep existing customers satisfied or to capture new sales. The research firm Gartner indicates that the average Fortune 100 organization has more than eight data stores, 15 information platforms, 10 critical systems, and hundreds to thousands of business applications. This is not an unusual problem; almost all large Fortune 500 companies have been putting Band-Aids on their data and systems for years. Unfortunately, rubber bands and baling wire are not going to fix their problems now, and there is no silver bullet for fixing data. Systems that support human decision-making should be systems that hold clean data. To get valid data, companies must create a common business language: business rules. Once these rules are established, bringing disparate data sources together becomes easy and lays the foundation for business performance management (BPM).
With all of that said, the next question seems simple: how do I know my data is “dirty”? The question is not as simple as it appears. Most managers think that because they use the data in production systems to run their business, it must be clean, or clean enough. Right? Wrong!
What is dirty data?
Data is dirty when it differs from the reality it was captured and stored to represent. It can also be described as anomalies in the data values that give a wrong representation of that reality. Simply put, data quality impacts:
-Ability to process orders
-Ability to share data among internal systems
-Ability to share data electronically with external sources
-Reliable reports
-Correct information for good decision making
Dirty data manifests itself in many different anomalies; below are just a few:
•Discrepancies between the structure of the data items and the specified format
•Irregularities
•Integrity constraint violations
•Missing values (part or whole records)
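To make these four anomaly types concrete, here is a minimal Python sketch that flags each of them in a handful of product item records. Everything specific in it is an assumption made for illustration and does not come from any particular system: the field names (sku, color, size, quantity), the SKU format, the allowed size codes, and the rules that SKUs are unique and quantities are non-negative. Real business rules would come from the common business language a company agrees on.

```python
import re

# Hypothetical business rules for a product item record (illustrative only).
SKU_PATTERN = re.compile(r"^[A-Z]{3}-\d{4}$")  # assumed agreed SKU format
ALLOWED_SIZES = {"S", "M", "L"}                # assumed agreed size codes
REQUIRED_FIELDS = ("sku", "color", "size", "quantity")


def find_anomalies(records):
    """Return a list of (row_index, anomaly_type, detail) tuples."""
    anomalies = []
    seen_skus = set()
    for i, rec in enumerate(records):
        # Missing values: a required field is absent, None, or blank.
        for field in REQUIRED_FIELDS:
            if rec.get(field) in (None, ""):
                anomalies.append((i, "missing_value", field))

        sku = rec.get("sku")
        if sku:
            # Format discrepancy: value does not match the specified format.
            if not SKU_PATTERN.match(sku):
                anomalies.append((i, "format_discrepancy", f"sku={sku!r}"))
            # Integrity constraint (assumed): each SKU appears only once.
            if sku in seen_skus:
                anomalies.append((i, "integrity_violation", f"duplicate sku {sku!r}"))
            seen_skus.add(sku)

        # Irregularity: non-uniform use of values, here a size outside the code list.
        size = rec.get("size")
        if size and size not in ALLOWED_SIZES:
            anomalies.append((i, "irregularity", f"size={size!r}"))

        # Integrity constraint (assumed): quantity cannot be negative.
        qty = rec.get("quantity")
        if isinstance(qty, int) and qty < 0:
            anomalies.append((i, "integrity_violation", f"quantity={qty}"))
    return anomalies


# Made-up sample data: row 0 is clean, the other rows each carry a problem.
SAMPLE_RECORDS = [
    {"sku": "ABC-1234", "color": "red", "size": "M", "quantity": 10},
    {"sku": "abc1235", "color": "blue", "size": "M", "quantity": 5},
    {"sku": "ABC-1236", "color": "green", "size": "Large", "quantity": 2},
    {"sku": "ABC-1234", "color": "red", "size": "S", "quantity": -1},
    {"sku": "ABC-1237", "color": "", "size": "L", "quantity": 3},
]

if __name__ == "__main__":
    for row, kind, detail in find_anomalies(SAMPLE_RECORDS):
        print(f"row {row}: {kind} ({detail})")
```

Note that every one of these sample rows could sit in a production system and still let orders flow, which is why data that is in daily use is not necessarily clean.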
What causes dirty data?
•Duplicate or overlapping data from:
•Business Mergers and Acquisitions
•Inherited Legacy Systems
•Poor or undisciplined data capture
•Internal static data repositories
•Automating, merging or implementing new ERP and CRM solutions
•External file feeds...