Data warehousing is the process of collecting data in raw form so that it can be analyzed for trends. Its benefits include improved end-user access, increased data consistency, a variety of reports that can be produced from the collected data, the gathering of data from separate sources into a common place, and additional documentation of the data. Further advantages are potentially lower computing costs, increased productivity, the ability of end-users to query the database without adding overhead to the operational systems, and an infrastructure that can be reused when changing systems.
Data mining is the running of algorithms to discover trends and other useful information. It can also classify items, identify events and optimize the use of resources. Used properly, data mining can be an extremely effective tool. Its benefits include increased profits, fewer future business mistakes, lower losses and knowledge of what is trending. If we knew how much hot chocolate was sold in the winter and in the summer, we could reduce losses from expired hot chocolate during the summer months and still have enough stock for the winter months.
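The hot chocolate example can be sketched in a few lines of Python. This is only a minimal illustration: the sales figures, the `monthly_sales` list and the `units_by_season` function are invented for this sketch, and a real data mining tool would work on far more data with more sophisticated algorithms.

```python
from collections import defaultdict

# Hypothetical monthly records: (month number, units of hot chocolate sold).
# The figures are invented purely for illustration.
monthly_sales = [
    (1, 420), (2, 390), (6, 40), (7, 35), (8, 30), (12, 450),
]

WINTER_MONTHS = {12, 1, 2}
SUMMER_MONTHS = {6, 7, 8}


def units_by_season(sales):
    """Total the units sold per season; months in neither set count as 'other'."""
    totals = defaultdict(int)
    for month, units in sales:
        if month in WINTER_MONTHS:
            season = "winter"
        elif month in SUMMER_MONTHS:
            season = "summer"
        else:
            season = "other"
        totals[season] += units
    return dict(totals)


season_totals = units_by_season(monthly_sales)
print(season_totals)
```

Comparing the winter and summer totals shows how much stock each season is likely to need, which is the kind of trend the purchasing decision would be based on.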
Several different industries use data warehousing and data mining. Our own government uses them to find trends in financial transactions in order to detect money laundering. Google uses them as well. Every time I go to Google, a record of my searches is kept. Google uses the searches and the links users click on to determine the order of the search results. It also offers an autocomplete function: type in part of a phrase and Google will finish it for you based on current trends.
Data warehouses share a basic design with some variations. Data sources, the warehouse and users make up the architecture of the basic data warehouse. In our system we will add data marts and a staging area. The warehouse is where the data is kept. It consists of metadata, raw data and summary data. Metadata is data about the data in the warehouse, for example the amount of data or the types of data. Raw data is the data as it was retrieved from the data sources. “Summary Data - the data that has been aggregated or transformed from the atomic level data. Summary data may reside in all of the database objects of the data warehouse” (University Information Services, n.d.). Summary data is useful because the results of long-running operations are computed in advance.
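The idea of computing long operations in advance can be sketched as follows. This is a simplified illustration under assumed names: `raw_data`, `build_summary` and `monthly_total` are hypothetical, the rows are invented, and a real warehouse would store its summaries in database objects instead of a Python dictionary.

```python
from collections import defaultdict

# Hypothetical raw (atomic-level) rows: (product, sale date, amount).
raw_data = [
    ("cocoa", "2013-01-05", 12.50),
    ("cocoa", "2013-01-19", 7.50),
    ("tea", "2013-01-07", 4.00),
    ("cocoa", "2013-02-02", 10.00),
]


def build_summary(rows):
    """Aggregate raw rows into total sales per (product, year-month)."""
    summary = defaultdict(float)
    for product, sale_date, amount in rows:
        year_month = sale_date[:7]  # the "YYYY-MM" prefix of the date string
        summary[(product, year_month)] += amount
    return dict(summary)


# Computed once, for example when the warehouse is loaded.
summary_data = build_summary(raw_data)


def monthly_total(product, year_month):
    """Answer a report query from the summary without rescanning the raw data."""
    return summary_data.get((product, year_month), 0.0)


print(monthly_total("cocoa", "2013-01"))
```

Each report that needs a monthly total then performs a single lookup, while the expensive pass over the raw data happens only when the summary is rebuilt.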
The users are the end-users. These are the people who mine the warehouse, run queries and reports, and analyze the data. Data marts are placed between the users and the warehouse to help the users perform their jobs more efficiently. “Data Mart - a data warehouse data class organized for a business functional area or department. The database contains data summarized at multiple levels of granularity and may be designed using relational or multidimensional database structures” (University Information Services, n.d.). We will have purchasing, sales and inventory data marts.
Data sources are operational systems, third-party companies or other systems that provide data for the warehouse. The staging area will be located between the data sources and the warehouse. A staging area is “a place where data is processed before entering the warehouse” (Lane, 2009). The staging area simplifies building the summary data and general warehouse management.
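What processing in the staging area might look like can be sketched in Python. The cleaning rules here (trimming and lower-casing the item name, requiring a non-negative whole-number quantity) and the names `stage` and `source_records` are assumptions made for this illustration, not rules taken from the cited sources.

```python
def stage(records):
    """Clean source records and split them into loadable and rejected rows."""
    loadable, rejected = [], []
    for record in records:
        item = str(record.get("item", "")).strip().lower()
        try:
            quantity = int(record.get("quantity"))
        except (TypeError, ValueError):
            # A missing or non-numeric quantity cannot be loaded.
            rejected.append(record)
            continue
        if not item or quantity < 0:
            rejected.append(record)
            continue
        loadable.append({"item": item, "quantity": quantity})
    return loadable, rejected


# Hypothetical records as they might arrive from different data sources.
source_records = [
    {"item": "  Hot Chocolate ", "quantity": "12"},
    {"item": "Tea", "quantity": "n/a"},
    {"item": "", "quantity": "3"},
]

loadable, rejected = stage(source_records)
print(loadable)
print(len(rejected))
```

Only the rows in `loadable` would go on to the warehouse, so every later step can rely on clean, consistently formatted data.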
In order to build the model of our data warehouse we need to look at numerous items. First, we should remove all operational data that is used only by system programs, keeping only data that will or can be used for the purposes of the warehouse. Next, we set up a seven-year time element: any data more than seven years old will be deleted. Seven years of data is enough to develop trends without wasting space in the database, and it keeps plenty of data for tax purposes. We will then decide which derived data will be needed. Any derived data that will be used repeatedly will be kept in order to lower the overhead on the system’s resources. We will have relationship...
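The seven-year time element described above can be sketched as a simple date filter. The names `cutoff_date` and `apply_retention` and the sample rows are hypothetical, and the sketch assumes that data exactly seven years old is still kept.

```python
from datetime import date

RETENTION_YEARS = 7  # the seven-year window described above


def cutoff_date(today):
    """Return the oldest date still kept: seven years before `today`."""
    try:
        return today.replace(year=today.year - RETENTION_YEARS)
    except ValueError:
        # February 29 has no counterpart in a non-leap year; use February 28.
        return today.replace(year=today.year - RETENTION_YEARS, day=28)


def apply_retention(rows, today):
    """Keep only the rows dated on or after the cutoff."""
    cutoff = cutoff_date(today)
    return [row for row in rows if row["date"] >= cutoff]


# Hypothetical warehouse rows, invented for illustration.
rows = [
    {"id": 1, "date": date(2005, 6, 1)},
    {"id": 2, "date": date(2006, 3, 16)},
    {"id": 3, "date": date(2012, 11, 30)},
]

kept = apply_retention(rows, today=date(2013, 3, 16))
print([row["id"] for row in kept])
```

Passing `today` as a parameter instead of reading the system clock inside the function keeps the rule easy to check against known dates.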
Bibliography: Information Services Data Warehouse. (n.d.). Retrieved March 16, 2013, from Washington State University: http://infotech.wsu.edu/datawarehouse/
Lane, P. (2009, August). Oracle Database Data Warehousing Guide 11g Release 2. Retrieved from Oracle Database Documentation Library: http://docs.oracle.com/cd/E14072_01/server.112/e10810/title.htm
Noton, A. (2013, March 16). The Benefits of Data Mining. Retrieved from Ezine@rticles: http://ezinearticles.com/?The-Benefits-of-Data-Mining&id=4565509
Ricardo, C. (2012). Databases Illuminated. Sudbury: Jones & Bartlett Learning.
University Information Services. (n.d.). Retrieved from Georgetown University: http://uis.georgetown.edu/departments/eets/dw/GLOSSARY0816.html#S