A Review of Frequent Itemsets over Data Stream based on Data Mining Techniques
Fayyaz Ahmed, Irfan Khan
Department of Computer Science
Comsats Institute of Science & Technology
A data stream is a continuous, unbounded, high-speed sequence of data. Stream data arrives from different distributed sources, and it is impossible to store all of it in active storage. Mining data streams is currently a challenging task in applications such as KDD, fraud detection, trend learning, transaction prediction, network monitoring, online transaction mining and estimation, where different kinds of itemsets must be found. New data arrives in the stream as time advances. Such data must be scanned only once, consume limited storage, and be processed with real-time response. This paper reviews the mining of frequent itemsets, closed frequent itemsets, closed weighted frequent patterns, maximal frequent itemsets, online frequent itemsets, online clustering, transient patterns and frequent sequential patterns, together with the models and techniques used to mine such itemsets over data streams. This comprehensive, theoretical review of mining different itemsets over data streams provides a basis for future work. The review shows that models and techniques such as FP-growth, decision trees, Apriori, VALWIN, Top-K, Max-FISM, WSW, HCFI, MAIDS and many others are used as primary solutions to the problems that arise in mining different itemsets.
frequent itemsets, closed frequent itemsets, closed weighted frequent patterns, maximal frequent itemsets, online frequent itemsets, online clustering, transient patterns, frequent sequential patterns, FP-growth, decision tree, Apriori, VALWIN, Top-K, Max-FISM, WSW, HCFI, MAIDS, sliding window model, landmark window model, time-fading model, tilted model, stream mining.
Data that arrives continuously from different sources at high speed and in massive volume is called a data stream. Storing all of this data is too costly, and mining itemsets over a data stream requires real-time response. These constraints make mining itemsets over data streams challenging, yet it is necessary for tasks such as fraud detection in streams, knowledge extraction and business improvement.
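The constraints described here (a single scan, bounded storage, fast response) can be illustrated with the sliding window model listed in the keywords. The following is only a minimal sketch under stated assumptions, not any of the reviewed algorithms: the class name `SlidingWindowMiner`, its parameters `window_size` and `max_len`, and the choice to count only itemsets of up to `max_len` items are all hypothetical and chosen for illustration.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowMiner:
    """Illustrative only: exact itemset counts over the last `window_size` transactions."""

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size  # number of recent transactions kept
        self.max_len = max_len          # largest itemset size counted (assumption)
        self.window = deque()           # the transactions currently in the window
        self.counts = Counter()         # itemset -> support count within the window

    def _itemsets(self, transaction):
        # Enumerate every itemset of size 1..max_len in a canonical (sorted) order.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def add(self, transaction):
        # Each transaction is seen once; the oldest one is forgotten when the window is full.
        if len(self.window) == self.window_size:
            expired = self.window.popleft()
            for itemset in self._itemsets(expired):
                self.counts[itemset] -= 1
                if self.counts[itemset] == 0:
                    del self.counts[itemset]
        self.window.append(transaction)
        for itemset in self._itemsets(transaction):
            self.counts[itemset] += 1

    def frequent(self, min_support):
        # min_support is a fraction of the transactions currently in the window.
        threshold = min_support * len(self.window)
        return {s: c for s, c in self.counts.items() if c >= threshold}


miner = SlidingWindowMiner(window_size=3, max_len=2)
for transaction in [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]:
    miner.add(transaction)
result = miner.frequent(min_support=0.6)
```

Storage in this sketch is bounded by the window size and by `max_len`, and each arriving transaction is processed exactly once. Enumerating all itemsets per transaction grows quickly with `max_len`, which is one reason the algorithms surveyed in this review use more compact structures such as prefix trees.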
Mining frequent itemsets [1, 5, 8, 9, 10, 14, 15, 16, 17, 18, 19, 21, 25] over a stream of data is the most important and challenging task, and different authors have proposed different techniques for it. K. Jothimani & S. Antony Selvadoss Thanmani proposed VALWIN (varying length sliding window algorithm) for mining frequent itemsets in a continuous online stream of data items using the sliding window model. Pauray S.M. Tsai proposed a new framework, the WSW algorithm, which uses a weighted sliding window model to find frequent itemsets in a continuous, unbounded, high-speed data stream. Mahmood Deypir & Mohammad  proposed an algorithm named the “FP-Growth algorithm”, which uses a “Prefix Tree” data structure and the sliding window model for mining frequent itemsets over a data stream. William Cheung and Osmar R. Zaïane proposed a new data structure called the CATS Tree and an algorithm called CATS Tree Builder. K. Jothimani & S. Antony Selvadoss Thanmani introduced the frequency measurement method MS, which is based on a “variable window length”, and proposed an incremental algorithm. Shaik Hafija, J.V.R. Murthy, Y. Anuradha & M. Chandra proposed a new method that divides the data into a number of windows and uses a compact DP-Tree structure to represent this data. Haifeng Li & Hong Chen proposed an algorithm named NDFIoDS for mining “non-derivable frequent itemsets” using a sliding window. Mozafari, H. Thakkar & C. Zaniolo proposed a new verification algorithm that improves the performance of mining and monitoring tasks for “association rules”, using a sliding window to mine frequent itemsets. Toon Calders, Nele Dexters & Bart Goethals proposed an “incremental algorithm” which...