Top Ten Algorithms

Knowl Inf Syst (2008) 14:1–37 DOI 10.1007/s10115-007-0114-2 SURVEY PAPER

Top 10 algorithms in data mining
Xindong Wu · Vipin Kumar · J. Ross Quinlan · Joydeep Ghosh · Qiang Yang · Hiroshi Motoda · Geoffrey J. McLachlan · Angus Ng · Bing Liu · Philip S. Yu · Zhi-Hua Zhou · Michael Steinbach · David J. Hand · Dan Steinberg

Received: 9 July 2007 / Revised: 28 September 2007 / Accepted: 8 October 2007 / Published online: 4 December 2007
© Springer-Verlag London Limited 2007

Abstract This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm. These 10 algorithms cover classification, clustering, statistical learning, association analysis, and link mining, which are all among the most important topics in data mining research and development.

X. Wu (✉)
Department of Computer Science, University of Vermont, Burlington, VT, USA
e-mail: xwu@cs.uvm.edu

V. Kumar
Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA
e-mail: kumar@cs.umn.edu

J. Ross Quinlan
Rulequest Research Pty Ltd, St Ives, NSW, Australia
e-mail: quinlan@rulequest.com

J. Ghosh
Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78712, USA
e-mail: ghosh@ece.utexas.edu

Q. Yang
Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong, China
e-mail: qyang@cs.ust.hk

H. Motoda
AFOSR/AOARD and Osaka University, 7-23-17 Roppongi, Minato-ku, Tokyo 106-0032, Japan
e-mail: motoda@ar.sanken.osaka-u.ac.jp

0 Introduction

In an effort to identify some of
