DATA WAREHOUSES & DATA MINING
Management Support System
A data warehouse is a collection of integrated, subject-oriented, time-variant and non-volatile data in support of management’s decision-making process.
It has been described as the "single point of truth", the "corporate memory", and the sole historical register of virtually all transactions that occur in the life of an organization.
A fundamental concept of a data warehouse is the distinction between data and information. Data is composed of observable and recordable facts that are often found in operational or transactional systems. At Rutgers, these systems include the registrar’s data on students (widely known as the SRDB), human resource and payroll databases, course scheduling data, and data on financial aid.
In a data warehouse environment, data only comes to have value to end-users when it is organized and presented as information.
Information is an integrated collection of facts and is used as the basis for decision-making. For example, an academic unit needs diachronic (over-time) information about the instructional output of its different faculty members to gauge whether it is becoming more or less reliant on part-time faculty.
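The step from data to information in the part-time faculty example can be sketched in a few lines of Python. This is only an illustration: the record layout, the years and the credit-hour figures are invented for the sketch and do not come from any Rutgers system.

```python
from collections import defaultdict

# Hypothetical operational records (invented for illustration):
# (academic_year, instructor_status, credit_hours)
SECTIONS = [
    (2019, "full-time", 3),
    (2019, "full-time", 3),
    (2019, "part-time", 3),
    (2020, "full-time", 3),
    (2020, "part-time", 3),
    (2020, "part-time", 3),
]


def part_time_share_by_year(sections):
    """Return {year: fraction of credit hours taught by part-time faculty}."""
    total = defaultdict(int)
    part_time = defaultdict(int)
    for year, status, credit_hours in sections:
        total[year] += credit_hours
        if status == "part-time":
            part_time[year] += credit_hours
    return {year: part_time[year] / total[year] for year in sorted(total)}


# The individual rows are data; the year-by-year share is information.
shares = part_time_share_by_year(SECTIONS)
for year, share in shares.items():
    print(year, f"{share:.0%}")
```

Each row on its own says little. The trend across years is what a decision-maker would actually read.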
“The data warehouse is always a physically separate store of data transformed from the application data found in the operational environment”.
Data entering the data warehouse comes from the operational environment in almost every case.
Data warehousing provides architectures and tools for business executives to systematically organize, understand, and use their data to make strategic decisions. A large number of organizations have found that data warehouse systems are valuable tools in today’s competitive, fast-evolving world. In the last several years, many firms have spent millions of dollars building enterprise-wide data warehouses. Many people feel that, with competition mounting in every industry, data warehousing is the latest must-have marketing weapon: a way to keep customers by learning more about their needs.
Data warehouses have been defined in many ways, making it difficult to formulate a rigorous definition. Loosely speaking, a data warehouse refers to a database that is maintained separately from an organization’s operational databases. Data warehouse systems allow for the integration of a variety of application systems. They support information processing by providing a solid platform of consolidated historical data for analysis.
Data warehousing is a more formalised methodology of these techniques. For example, many sales analysis systems and executive information systems (EIS) get their data from summary files rather than operational transaction files. The method of using summary files instead of operational data is in essence what data warehousing is all about.
Some data warehousing tools neglect the importance of modelling and building a data warehouse and focus on the storage and retrieval of data only. These tools might have strong analytical facilities, but lack the qualities you need to build and maintain a corporate-wide data warehouse. These tools belong on the PC rather than the host. Your corporate-wide (or division-wide) data warehouse needs to be scalable, secure, open and, above all, suitable for publication.
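As a rough sketch of the summary-file idea, the Python snippet below builds a monthly summary table from a small table of operational sales transactions using the standard-library sqlite3 module. The table names (sales_txn, sales_summary), the columns and the sample rows are assumptions made for illustration only.

```python
import sqlite3

# In-memory database standing in for the operational system (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_txn (sale_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_txn VALUES (?, ?, ?)",
    [
        ("2024-01-05", "North", 120.0),
        ("2024-01-17", "North", 80.0),
        ("2024-01-09", "South", 50.0),
        ("2024-02-02", "North", 200.0),
    ],
)

# The "summary file": one row per month and region instead of one per sale.
conn.execute(
    """
    CREATE TABLE sales_summary AS
    SELECT substr(sale_date, 1, 7) AS month,
           region,
           COUNT(*) AS n_sales,
           SUM(amount) AS total_amount
    FROM sales_txn
    GROUP BY substr(sale_date, 1, 7), region
    """
)

summary = conn.execute(
    "SELECT month, region, n_sales, total_amount "
    "FROM sales_summary ORDER BY month, region"
).fetchall()
for row in summary:
    print(row)
```

An analysis tool would then query sales_summary rather than sales_txn, which keeps analytical queries away from the transaction tables.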
NEED FOR A DATA WAREHOUSE :-
Missing data: Decision support requires historical data, which operational DBs do not typically maintain.
Data consolidation: Decision support requires consolidation (aggregation, summarization) of data from heterogeneous sources: operational DBs and external sources.
Data quality: Different sources typically use inconsistent data representations, codes and formats which have to be reconciled....
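The consolidation and data quality points can be illustrated with a minimal Python sketch that reconciles two sources using different codes and date formats before combining them. The two sources, their field names and the code mappings are hypothetical and chosen only to show the idea.

```python
from datetime import datetime

# Hypothetical extracts from two sources with inconsistent representations.
SOURCE_A = [{"cust_id": "001", "gender": "M", "joined": "2021-03-15"}]
SOURCE_B = [{"cust_id": "2", "gender": "female", "joined": "15/04/2022"}]

# Assumed mapping of every known variant onto one warehouse code.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each known source format and return an ISO date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {text!r}")


def reconcile(record, source):
    """Convert one source record into the common warehouse representation."""
    return {
        "cust_id": int(record["cust_id"]),
        "gender": GENDER_CODES[record["gender"].strip().lower()],
        "joined": parse_date(record["joined"]),
        "source": source,
    }


# Consolidation: both sources end up in one uniformly coded collection.
warehouse_rows = [reconcile(r, "A") for r in SOURCE_A]
warehouse_rows += [reconcile(r, "B") for r in SOURCE_B]
for row in warehouse_rows:
    print(row)
```

Keeping the source label on each row is a design choice in this sketch: it allows a reconciled value to be traced back to where it came from.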