Omm Data Cleaning

Data Cleansing/Scrubbing
To improve the quality of organizational information, and thus the effectiveness of decision making, businesses must formulate a strategy to keep information clean. Information cleansing, or scrubbing, is a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
Specialized software tools use sophisticated algorithms to parse, standardize, correct, match and consolidate data warehouse information. This is vitally important because data warehouses often contain information from several different databases, some of which can be external to the organization.
In a data warehouse, information cleansing occurs twice: first during the ETL (extract, transform, load) process, and again on the information once it is in the data warehouse. Companies can choose information cleansing software from several vendors, including Oracle, SAS, Ascential Software, and Group1 Software. Ideally, scrubbed information is error-free and consistent.
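The standardize, match, and consolidate steps described above can be illustrated with a minimal Python sketch. It is an illustration only, not the method of any vendor tool named above. The sample records, the field names, and the choice of email address as the matching key are all assumptions made for the example.

```python
# Hypothetical customer records from two source databases (invented for illustration).
crm_rows = [
    {"name": "  Jane  DOE ", "email": "Jane.Doe@Example.com", "phone": "(555) 010-0199"},
    {"name": "Raj Patel", "email": "raj@example.com", "phone": ""},
]
billing_rows = [
    {"name": "jane doe", "email": "jane.doe@example.com ", "phone": ""},
    {"name": "Ana Silva", "email": "ana@example.com", "phone": "555.010.0142"},
]


def standardize(row):
    """Standardize one record: collapse whitespace, fix letter case, keep phone digits only."""
    return {
        "name": " ".join(row["name"].split()).title(),
        "email": row["email"].strip().lower(),
        "phone": "".join(ch for ch in row["phone"] if ch.isdigit()),
    }


def consolidate(*sources):
    """Match records on the standardized email and merge them into one record each."""
    merged = {}
    for source in sources:
        for raw in source:
            row = standardize(raw)
            # The first record seen for an email becomes the base record.
            existing = merged.setdefault(row["email"], row)
            # Later records only fill in fields that are still blank.
            for field, value in row.items():
                if not existing[field]:
                    existing[field] = value
    return list(merged.values())


customers = consolidate(crm_rows, billing_rows)
for customer in customers:
    print(customer)
```

In this sketch the two "Jane Doe" rows collapse into one record because their email addresses agree after standardization. Real cleansing tools typically use fuzzier matching than an exact key comparison.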

Textbook: Business Driven Technology, Baltzan/Phillips, pages 100-101

Definition: Data Cleaning

A process used to identify inaccurate, incomplete, or unreasonable data and then improve its quality by correcting the detected errors and omissions. The process may include format checks, completeness checks, reasonableness checks, limit checks, review of the data to identify outliers (geographic, statistical, temporal, or environmental) or other errors, and assessment of the data by subject area experts (e.g. taxonomic specialists). These processes usually result in flagging, documenting, and subsequently checking and correcting suspect records. Validation checks may also involve checking for compliance with applicable standards, rules, and conventions.
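As a rough illustration of the checks listed in this definition, the following Python sketch flags suspect records rather than silently changing them. The field names, the ISO date format, and the numeric limits are hypothetical choices made for this example, and the sketch assumes that dates arrive as text.

```python
from datetime import date

# Hypothetical rules for a small observation dataset (assumptions for illustration).
REQUIRED_FIELDS = ("record_id", "observed_on", "latitude", "temperature_c")
LIMITS = {"latitude": (-90.0, 90.0), "temperature_c": (-90.0, 60.0)}


def check_record(record, today):
    """Return a list of flags explaining why a record is suspect (empty if it passes)."""
    flags = []

    # Completeness check: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            flags.append(f"missing {field}")

    # Format check: dates are expected as ISO text, e.g. "2023-05-14".
    observed = None
    raw_date = record.get("observed_on")
    if raw_date:
        try:
            observed = date.fromisoformat(raw_date)
        except ValueError:
            flags.append("bad date format")

    # Reasonableness check: an observation cannot be dated in the future.
    if observed is not None and observed > today:
        flags.append("date in the future")

    # Limit checks: numeric values must fall inside the allowed range.
    for field, (low, high) in LIMITS.items():
        value = record.get(field)
        if isinstance(value, (int, float)) and not low <= value <= high:
            flags.append(f"{field} out of range")

    return flags


records = [
    {"record_id": 1, "observed_on": "2023-05-14", "latitude": 41.2, "temperature_c": 18.5},
    {"record_id": 2, "observed_on": "14/05/2023", "latitude": 141.2, "temperature_c": 17.0},
    {"record_id": 3, "observed_on": "2023-05-15", "latitude": 40.9, "temperature_c": None},
]

# Flag and document suspect records so that a person can review and correct them.
today = date(2024, 1, 1)
suspect = {}
for record in records:
    flags = check_record(record, today)
    if flags:
        suspect[record["record_id"]] = flags
print(suspect)
```

Returning flags instead of altering values keeps a record of each problem, which matches the flagging and documenting step described above. Outlier detection and expert review are left out of this sketch.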
The general framework for data cleaning (after Maletic & Marcus 2000) is: Define and determine error types; Search and identify error instances; Correct the errors; Document error instances and error
