Data mining and OLAP are among the most common Business Intelligence technologies. The term Business Intelligence refers to computer-based methods for identifying and extracting useful information from business data. Online Analytical Processing, commonly known as OLAP, provides summary data and generates rich calculations. It is a class of systems that answer multidimensional queries, and it is typically used in business reporting for sales, marketing and similar domains. OLAP lets users view the data interactively from multiple perspectives.
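To make "viewing the data from multiple perspectives" concrete, here is a minimal sketch of an OLAP-style summary in plain Python. The sales figures, the dimension names and the summarize helper are all hypothetical, invented for illustration; a real OLAP system would run this kind of roll-up over a precomputed cube, not an in-memory list.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, month, amount).
SALES = [
    ("East", "Toys", "Nov", 120.0),
    ("East", "Toys", "Dec", 200.0),
    ("East", "Books", "Dec", 80.0),
    ("West", "Toys", "Dec", 150.0),
    ("West", "Books", "Nov", 60.0),
]

# Position of each dimension inside a fact tuple.
DIMENSIONS = {"region": 0, "product": 1, "month": 2}


def summarize(facts, *dims):
    """Total the sales amount for each combination of the chosen dimensions."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[DIMENSIONS[d]] for d in dims)
        totals[key] += fact[3]
    return dict(totals)


# The same facts, summarized from two different perspectives.
by_region = summarize(SALES, "region")
by_product_month = summarize(SALES, "product", "month")
```

Calling summarize with a different set of dimensions gives another view of the same facts, which is the kind of interactive re-slicing the paragraph above describes.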
Data mining, on the other hand, helps discover hidden patterns or trends in data to support a conclusion. Unlike OLAP, which operates on a summary view, data mining operates at the detail level. For instance, if Walmart wanted to identify the trend in products sold during a holiday season, data mining would help answer that question from historical data.
Although OLAP and data mining both operate on data to gain intelligence, the main difference lies in how they operate on it. OLAP tools provide multidimensional analysis and summaries of the data. Data mining, by contrast, focuses on ratios, patterns and influences within a set of data. The two can complement each other. OLAP might point out a problem with Walmart's sales of a specific product in a particular region this month. Data mining can then be used to gain insight into the behavior of customers in that region, and it can also make predictions, such as a 5% increase in sales.
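As a rough illustration of predicting a change in sales from historical data, the sketch below fits an ordinary least-squares line to yearly holiday sales and projects the next year. This is only a stand-in: real data mining tools use far richer models, and the years, unit counts and the fit_line helper are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical units sold during the holiday season in past years.
years = [2019, 2020, 2021, 2022]
units = [1000.0, 1100.0, 1200.0, 1300.0]

slope, intercept = fit_line(years, units)

# Project the following season and express it as a percentage change
# relative to the most recent observed season.
forecast = slope * 2023 + intercept
growth_pct = (forecast - units[-1]) / units[-1] * 100
```

With perfectly linear input like this the projection simply continues the line; on real sales data the fit would be noisy and the forecast should be treated as a rough estimate.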
Data mining can also be used to identify the attributes that matter most for sales, and those attributes can then be used to design the OLAP data model.
IT systems can be divided into transactional (OLTP) and analytical (OLAP) systems, which differ as follows.
OLTP stands for On-line Transaction Processing; OLAP stands for On-line Analytical Processing.
Table design. OLTP: tables are highly normalized. OLAP: tables are generally de-normalized, with fewer tables.
Source of data. OLTP: operational data; this is the original source of the data. OLAP: consolidated data; the source for OLAP data is the various OLTP databases.
Processing speed. OLTP: typically very fast. OLAP: depends on the amount of data involved.
Purpose. OLTP: to control and run fundamental business tasks. OLAP: to help with planning, problem solving and decision support.
What the data reveals. OLTP: a snapshot of ongoing business processes. OLAP: multidimensional views of various kinds of business activities.
Workload. OLTP: characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). OLAP: characterized by a relatively low volume of transactions; queries are often very complex and involve aggregation.
Space requirements. OLTP: relatively small if historical data is archived. OLAP: large, due to the existence of aggregation structures and history data; requires more indexes than OLTP.
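The workload contrast can be sketched with Python's built-in sqlite3 module. The orders table and its rows are hypothetical, and a single in-memory SQLite database stands in for both kinds of system purely to show the two query shapes; in practice the analytical query would run against a separate, consolidated store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " region TEXT NOT NULL,"
    " product TEXT NOT NULL,"
    " amount REAL NOT NULL)"
)

INSERT_ORDER = "INSERT INTO orders (region, product, amount) VALUES (?, ?, ?)"

# OLTP-style work: many short transactions, each touching one row.
# Each "with conn" block commits on success.
with conn:
    conn.execute(INSERT_ORDER, ("East", "Toys", 20.0))
with conn:
    conn.execute(INSERT_ORDER, ("West", "Toys", 35.0))
with conn:
    conn.execute(INSERT_ORDER, ("East", "Books", 15.0))
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE order_id = ?", (25.0, 1))

# OLAP-style work: one read-only query that aggregates over all rows.
summary = conn.execute(
    "SELECT region, COUNT(*), SUM(amount)"
    " FROM orders GROUP BY region ORDER BY region"
).fetchall()
```

The writes each affect a single row and commit immediately, while the final query reads every row to produce one total per region.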
Consistency, reliability and accuracy of data in a relational database are achieved through relational integrity. This is a set of rules that enforce the existence of data and the relationships between data by means of primary and foreign keys.
Relational integrity constraints fall mainly into two categories, entity integrity and referential integrity. Entity integrity is based on primary keys, and referential integrity on foreign keys. The primary key of a table uniquely identifies each record in that table, and a foreign key in one table references the primary key of another table.
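A minimal sketch of both kinds of constraint, using Python's built-in sqlite3 module. The customer and purchase tables and the rejected helper are hypothetical. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER NOT NULL PRIMARY KEY,"
    " name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE purchase ("
    " purchase_id INTEGER NOT NULL PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer (customer_id))"
)

conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO purchase VALUES (10, 1)")  # refers to an existing customer


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Entity integrity: a second row with primary key 1 is not allowed.
duplicate_key_rejected = rejected("INSERT INTO customer VALUES (1, 'Bob')")

# Referential integrity: a purchase may not point at a missing customer.
orphan_row_rejected = rejected("INSERT INTO purchase VALUES (11, 99)")
```

Both flags should end up True: the database itself refuses the offending rows, so application code does not have to re-check these rules on every write.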
Entity integrity is the mechanism the system provides to maintain primary keys. A primary key is a unique identifier for the tuples in an entity. Entity integrity ensures two properties of primary keys. First, primary key values in a table must be unique: no key value may match the key value of another row in the table. Second, a primary key must not take null values: no value of the primary key may be set to null. The entity integrity constraint ensures that each value in the primary key field uniquely identifies the row in that...