  • Published : April 23, 2013
A Review of Frequent Itemsets over Data Stream based on Data Mining Techniques

Fayyaz Ahmed, Irfan Khan
Department of Computer Science
Comsats Institute of Science & Technology

ABSTRACT
A data stream is a continuous, unbounded, high-speed flow of data. Stream data arrives from different distributed sources, and it is impossible to store all of it in active storage. Nowadays, mining data streams to find different itemsets is a challenging task, with applications in knowledge discovery in databases (KDD), fraud detection, trend learning, transaction prediction, network monitoring, online transaction mining, estimation, etc. New data arrives in the stream as time advances. Such data must be scanned only once, using limited storage, and a response must be produced in real time. This paper reviews the mining of frequent itemsets, closed frequent itemsets, closed weighted frequent patterns, maximal frequent itemsets, online frequent itemsets, online clustering, transient patterns, and frequent sequential patterns, together with the different models and techniques used to mine such itemsets over data streams. This comprehensive, theoretical review of mining different itemsets over data streams provides a base for future work. The review shows that models and techniques such as FP-growth, decision trees, Apriori, VALWIN, Top-K, Max-FISM, WSW, HCFI, MAIDS, and many others serve as the primary solutions to the problems that arise in mining different itemsets over data streams.

Keywords
frequent itemsets, closed frequent itemsets, closed weighted frequent patterns, maximal frequent itemsets, online frequent itemsets, online clustering, transient patterns, frequent sequential patterns, FP-growth, decision tree, Apriori, VALWIN, Top-K, Max-FISM, WSW, HCFI, MAIDS, sliding window model, landmark window model, time-fading model, tilted model, stream mining.

1. INTRODUCTION
Data that arrives continuously from different areas, at high speed and in massive volume, is called a data stream. Storing all of this data is too costly, and mining itemsets over a data stream requires a real-time response. These constraints make the task very challenging, yet it is necessary for fraud detection in streams, knowledge extraction, business improvement, etc.
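The constraints described above (a single scan, limited storage, and answers available at any time) can be illustrated with a minimal sketch of frequent itemset counting over a fixed-size sliding window. This is not any of the reviewed algorithms; the class name `SlidingWindowItemsetCounter` and the parameters `window_size`, `max_itemset_size`, and `min_support` are assumptions introduced only for illustration, and itemsets are enumerated naively up to a small size.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowItemsetCounter:
    """Illustrative only: counts itemsets over the most recent transactions."""

    def __init__(self, window_size, max_itemset_size=2):
        self.window_size = window_size            # number of transactions kept
        self.max_itemset_size = max_itemset_size  # cap on enumerated itemset size
        self.window = deque()                     # bounded storage for the window
        self.counts = Counter()                   # itemset -> count within the window

    def _itemsets(self, transaction):
        # Enumerate every itemset of size 1..max_itemset_size in a transaction.
        items = sorted(set(transaction))
        for k in range(1, self.max_itemset_size + 1):
            yield from combinations(items, k)

    def add(self, transaction):
        # Each arriving transaction is scanned exactly once.
        self.window.append(transaction)
        self.counts.update(self._itemsets(transaction))
        if len(self.window) > self.window_size:
            # The oldest transaction slides out; undo its contribution.
            expired = self.window.popleft()
            for itemset in self._itemsets(expired):
                self.counts[itemset] -= 1
                if self.counts[itemset] <= 0:
                    del self.counts[itemset]

    def frequent(self, min_support):
        # Itemsets whose count reaches min_support (a fraction) of the window.
        threshold = min_support * len(self.window)
        return {s: c for s, c in self.counts.items() if c >= threshold}


counter = SlidingWindowItemsetCounter(window_size=3, max_itemset_size=2)
for transaction in [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]:
    counter.add(transaction)
frequent_now = counter.frequent(min_support=0.6)
```

After the fourth transaction arrives, the first one has left the window, so `frequent_now` reflects only the three most recent transactions. Memory is bounded by the window size and the itemset size cap, which is the trade-off the sliding window model makes in exchange for never rescanning old data.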

Mining frequent itemsets [1, 5, 8, 9, 10, 14, 15, 16, 17, 18, 19, 21, 25] over a stream of data is the most important and challenging task, and different authors have proposed different techniques for it. K. Jothimani & S. Antony Selvadoss Thanamani [1] proposed VALWIN (varying length sliding window algorithm) for mining frequent itemsets in a continuous online stream of data items using the sliding window model. Pauray S.M. Tsai [5] proposed a new framework, with an algorithm named WSW, for finding frequent itemsets in a continuous, unbounded, high-speed data stream using the weighted sliding window model. Mahmood Deypir & Mohammad [9] proposed an algorithm named “FP-Growth algorithm” that uses a “Prefix Tree” data structure and the sliding window model for mining frequent itemsets over a data stream. William Cheung and Osmar R. Zaïane [10] proposed a new data structure called the CATS Tree and an algorithm called CATS Tree Builder. K. Jothimani & S. Antony Selvadoss Thanamani [14] introduced a frequency measurement method, MS, which is based on a “variable window length”, and proposed an incremental algorithm. Shaik. Hafija, J.V.R. Murthy, Y. Anuradha & M. Chandra [15] proposed a new method that divides the data into a number of windows and uses a DP-Tree as a compact structure for this data. Haifeng Li & Hong Chen [16] proposed an algorithm named NDFIoDS for mining “non-derivable frequent itemsets” using a sliding window. Mozafari, H. Thakkar, & C. Zaniolo [17] proposed a new verification algorithm that improves the performance of mining and monitoring tasks for “association rules”, using a sliding window to mine frequent itemsets. Toon Calders, Nele Dexters & Bart Goethals [18] proposed an “incremental algorithm” which...