IT433 Data Warehousing and Data Mining: Data Preprocessing
Data Preprocessing
• Why preprocess the data?
• Descriptive data summarization
• Data cleaning
• Data integration and transformation
• Data reduction
• Discretization and concept hierarchy generation
• Summary
Why Data Preprocessing?
• Data in the real world is dirty
– incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data
• e.g., occupation=“ ”
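The "incomplete" case above (an empty value such as occupation=“ ”) can be illustrated with a minimal Python sketch. The records, the field names, and the helper functions `is_missing` and `count_missing` are all hypothetical and invented for illustration; the slides do not prescribe any particular method or tool.

```python
from collections import Counter

# Hypothetical records; names and values are invented for illustration only.
records = [
    {"name": "A", "occupation": "engineer", "salary": 52000},
    {"name": "B", "occupation": " ", "salary": 48000},   # blank value
    {"name": "C", "occupation": "", "salary": None},     # empty and None
    {"name": "D", "salary": 61000},                      # attribute absent
]

EXPECTED_FIELDS = ("name", "occupation", "salary")


def is_missing(value):
    """Treat None and empty or whitespace-only strings as missing."""
    if value is None:
        return True
    return isinstance(value, str) and value.strip() == ""


def count_missing(rows, fields):
    """Return a Counter mapping each field to its number of missing values."""
    missing = Counter()
    for row in rows:
        for field in fields:
            # row.get() returns None when the attribute is absent entirely.
            if is_missing(row.get(field)):
                missing[field] += 1
    return missing


missing_counts = count_missing(records, EXPECTED_FIELDS)
print(dict(missing_counts))
```

Counting missing values per attribute is only a first look at incompleteness; how to handle them (dropping rows, filling in defaults, and so on) depends on the analysis and is not decided by this sketch.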
Lab – Data Analysis and Data Modeling in Visio
Overview
In this lab, we will learn to use Microsoft Visio to draw the ERDs we created in class.
Learning Objectives
Upon completion of this learning unit you should be able to:
▪ Understand the concept of data modeling
▪ Develop business rules
▪ Develop and apply good data naming conventions
▪ Construct simple data models using Entity Relationship Diagrams (ERDs)
▪ Develop entity relationships and define
4V of Big Data?
Imagine all the information you alone generate each time you swipe your credit card, post to social media, drive your car, leave a voicemail, or visit a doctor. Now try to imagine your data combined with the data of all humans, corporations, and organizations in the world! From healthcare to social media, from business to the auto industry, humans are now creating more data than ever before. The four Vs of big data are volume, velocity, variety, and veracity.
Volume: Scale of Data
Big data is big. It’s
Be Data Literate – Know What to Know
by Peter F. Drucker
Executives have become computer literate. The younger ones, especially, know more about the way the computer works than they know about the mechanics of the automobile or the telephone. But not many executives are information-literate. They know how to get data. But most still have to learn how to use data. Few executives yet know how to ask: What information do I need to do my job? When do I need it? In what
Data Mining
Abdullah Alshawdhabi
Coleman University
Simply stated, data mining refers to extracting or mining knowledge from large amounts of data. The term is actually a misnomer. Remember that the mining of gold from rocks or sand is referred to as gold mining rather than rock or sand mining. Thus, data mining should have been more appropriately named “knowledge mining from data,” which is unfortunately somewhat long. Knowledge mining, a shorter term, may not
research because they allow the researchers to analyze empirical data needed to interpret the findings and draw conclusions based on the results of the research. According to Portney and Watkins (2009), all studies require a description of subjects and responses that are obtained through measuring central tendency, so all studies use descriptive statistics to present an appropriate use of statistical tests and the validity of data interpretation. Although descriptive statistics do not allow general
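As a small illustration of the measures of central tendency referred to in this excerpt, the sketch below uses Python's standard `statistics` module. The `scores` list is invented sample data, not taken from Portney and Watkins or from any study.

```python
import statistics

# Invented sample responses, used only to illustrate the calculations.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # arithmetic average: 30 / 6
median = statistics.median(scores)  # midpoint of the two middle values, 3 and 5
mode = statistics.mode(scores)      # most frequent value
stdev = statistics.stdev(scores)    # sample standard deviation (n - 1 denominator)

print(f"mean={mean}, median={median}, mode={mode}, stdev={stdev:.3f}")
```

Here the mean (5) sits above the median (4) because the single large value, 10, pulls it upward, which is one reason descriptive summaries usually report more than one measure.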
Data Processing
Data processing is the term generally used to describe what was done by large mainframe computers from the late 1940s until the early 1980s (and which continues to be done in most large organizations to a greater or lesser extent even today): large volumes of raw transaction data fed into programs that update a master file, with fixed-format reports written to paper.
Number System
A numeral system (or system of numeration) is a writing system for expressing numbers, that is
You Can Do With Data/The Information Architecture of an Organization
What is the difference between data and information? Give examples.
Data = discrete, unorganized, raw facts. Examples: Quantity Sold, Course Enrollment, Customer Name, Discount, Star Rating.
Information = transformation of those facts into meaning. Examples: financial data (deposits), daily loans.
What is a transaction?
An action performed in a database management system.
What are the characteristics of an operational data store?
Stores
you an understanding of how data resources are managed in information systems by analyzing the managerial implications of basic concepts and applications of database management. It introduces the concept of data resource management and stresses the advantages of the database management approach. It also stresses the role of database management system software and the database administration function. Finally, it outlines several major managerial considerations of data resource management.
while staying rooted in the taste legacy of Colonel Harland Sanders’ secret recipe. Products are made on the motto of “Crispy outside, juicy inside”. In India, KFC is growing rapidly and today has presence in 21
2. HISTORICAL BACKGROUND
In the 1930s Colonel Harland Sanders some distinguished Kentucky folks lickin’ their fingers. Sanders, founder of the original Kentucky Fried Chicken, was born on September 9, 1890. By 1964, the Colonel had 600 franchise outlets for his chicken across the United States and