Overview of the Data Mining

8497 Words
Order Code RL31798

CRS Report for Congress
Received through the CRS Web

Data Mining: An Overview

Updated December 16, 2004

Jeffrey W. Seifert
Analyst in Information Science and Technology Policy
Resources, Science, and Industry Division

Congressional Research Service · The Library of Congress

Summary
Data mining is emerging as one of the key features of many homeland security initiatives. Often used as a means for detecting fraud, assessing risk, and product retailing, data mining involves the use of data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. In the context of homeland security, data mining is often viewed as a potential means to identify terrorist activities, such as money transfers and communications, and to identify and track individual terrorists themselves, such as through travel and immigration records.

While data mining represents a significant advance in the type of analytical tools currently available, there are limitations to its capability. One limitation is that although data mining can help reveal patterns and relationships, it does not tell the user the value or significance of these patterns. These types of determinations must be made by the user. A second limitation is that while data mining can identify connections between behaviors and/or variables, it does not necessarily identify a causal relationship. To be successful, data mining still requires skilled technical and analytical specialists who can structure the analysis and interpret the output that is created.

Data mining is becoming increasingly common in both the private and public sectors. Industries such as banking, insurance, medicine, and retailing commonly use data mining to reduce costs, enhance research, and increase sales. In the public sector, data mining applications initially were used as a means to detect fraud and waste, but have grown to also be used for purposes such as

You May Also Find These Documents Helpful

  • Good Essays

    Data Mining Solutions

    • 1720 Words
    • 7 Pages

    Question 1: Assume a base cuboid of 10 dimensions contains only three base cells: (1) (a1, b2, c3, d4, ..., d9, d10), (2) (a1, c2, b3, d4, ..., d9, d10), and (3) (b1, c2, b3, d4, ..., d9, d10), where a_i != b_i, b_i != c_i, etc. The measure of the cube is count.
    1. How many nonempty cuboids will a full data cube contain? Answer: 2^10 = 1024.
    2. How many nonempty aggregate (i.e., non-base) cells will a full cube contain? Answer: There will be 3 × 2^10 − 6 × 2^7 − 3 = 2301 nonempty aggregate cells in the full cube. The number of cells overlapping twice is 2^7, while the number of cells overlapping once is 4 × 2^7. So the final calculation is 3 × 2^10 − 2 × 2^7 − 1 × 4 × 2^7 − 3, which yields the result.
    3. How many nonempty aggregate cells will an iceberg cube contain if the condition of the iceberg cube is "count >= 2"? Answer: There are in total 5 × 2^7 = 640 nonempty aggregate cells in the iceberg cube. To calculate the result, fix the first three dimensions as (*, *, *), (a1, *, *), (*, c2, *), (*, *, b3), or (*, c2, b3), and vary the remaining seven.
    4. How many closed cells are in the full cube? Answer: There are 6 closed cells in the full cube: the 3 base cells; (a1, *, *, d4, …, d10): count 2; (*, c2, b3, d4, …, d10): count 2; and (*, *, *, d4, …, d10): count 3.
    Question 2: (Half-open questions; make sure your algorithm and assumptions are correct, no need to be very specific) Suppose a base cuboid has the following tuples:…
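The four answers to Question 1 can be checked by brute force, since three base cells in ten dimensions yield at most 3 × 2^10 generalized cells. The Python sketch below is not part of the original solution. The helper names (`generalizations`, `cube_counts`, `covers`, `is_closed`, `summarize`) are illustrative, and it assumes the elided tail "d4, ..., d9, d10" stands for the seven shared values d4 through d10. A cell is treated as closed when it equals the most specific cell that covers the same set of base cells, which for the count measure matches the usual definition of a closed cell.

```python
from itertools import product

# Shared tail of all three base cells (assumed to be d4..d10, seven values).
TAIL = tuple(f"d{i}" for i in range(4, 11))

# The three base cells from Question 1; the labels are placeholders.
BASE_CELLS = [
    ("a1", "b2", "c3") + TAIL,
    ("a1", "c2", "b3") + TAIL,
    ("b1", "c2", "b3") + TAIL,
]


def generalizations(cell):
    """Yield every cell obtained by replacing any subset of values with '*'."""
    for mask in product((False, True), repeat=len(cell)):
        yield tuple("*" if starred else value
                    for value, starred in zip(cell, mask))


def cube_counts(base_cells):
    """Map each nonempty cell of the full cube to its count measure."""
    counts = {}
    for cell in base_cells:
        for general in generalizations(cell):
            counts[general] = counts.get(general, 0) + 1
    return counts


def covers(general, specific):
    """True if `general` matches `specific` in every non-'*' dimension."""
    return all(g == "*" or g == s for g, s in zip(general, specific))


def is_closed(cell, base_cells):
    """A cell is closed if it equals the closure of the base cells it covers."""
    covered = [b for b in base_cells if covers(cell, b)]
    closure = tuple(vals[0] if len(set(vals)) == 1 else "*"
                    for vals in zip(*covered))
    return cell == closure


def summarize(base_cells):
    """Compute the four quantities asked for in Question 1."""
    counts = cube_counts(base_cells)
    return {
        # One cuboid per distinct pattern of starred dimensions.
        "nonempty_cuboids": len({tuple(v == "*" for v in c) for c in counts}),
        # Every distinct cell except the base cells themselves.
        "aggregate_cells": len(counts) - len(base_cells),
        # Iceberg condition: count >= 2 (no base cell qualifies here).
        "iceberg_cells": sum(1 for n in counts.values() if n >= 2),
        "closed_cells": sum(1 for c in counts if is_closed(c, base_cells)),
    }


if __name__ == "__main__":
    print(summarize(BASE_CELLS))
```

Run under these assumptions, the sketch reproduces the stated answers: 1024 nonempty cuboids, 2301 nonempty aggregate cells, 640 iceberg cells, and 6 closed cells.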

  • Satisfactory Essays

    In this report, the Committee on Technical and Privacy Dimensions of Information for Terrorism Prevention and Other National Goals examines behavioral surveillance technologies in counterterrorism programs and makes decisions about deploying and evaluating those and other information programs for their effectiveness and risk to personal privacy. Modern data…

    • 432 Words
    • 2 Pages
  • Good Essays

    1984 - Reflection Paper

    • 694 Words
    • 3 Pages

    Since Orwell's book 1984 was written in 1948, we have developed methods to produce more advanced and less costly computer technology. Value Added Networks continue to rise in popularity. Data warehousing (information availability) and data mining (information analysis) have become hot topics in today's world. Personal data that has always been available, but not easily accessible, is now computerized and merged with larger databases. These databases are linked to form massive data repositories. This practice is not limited to the private sector; government databases such as the Department of Motor Vehicles and criminal records are accessible to those willing to pay for access. The ability to desegregate personal information and profile individuals is easier than ever.…

  • Best Essays

    Digital Forensics

    • 1977 Words
    • 8 Pages

    The rapid growth of the internet has made it easier to commit traditional crimes by providing criminals an alternate method for launching attacks with relative anonymity. The effects of such technology have been great, but the ever-changing complexity of the communication and networking infrastructure is making investigation of these crimes difficult. Clues to solving a case might be hidden in large volumes of data that need to be sifted through in order to detect crimes and collect evidence.…

  • Powerful Essays

    Data Mining Problems

    • 1295 Words
    • 6 Pages

    Suppose that we are responsible for managing product placement within a local supermarket. Our shelving units have 6 shelves each and are numbered from 1 to 6—with 1 being the lowest shelf and proceeding upward until the highest shelf is assigned the number 6. While there are many placement options that we should consider, we decide to look for any correlations between the row a product is placed on and its sales. Since we have our data stored in a data warehouse, it is easily accessible and responds quickly to our data request. Consider each of the following:…

  • Good Essays

    CaseEF GroupC2 Team10

    • 1421 Words
    • 4 Pages

    Although the Tucson data-mining project may inappropriately violate the privacy of Internet users, it is an acceptable tradeoff to more intelligently combat terrorism because it is, so far, one of the best ways.…

  • Powerful Essays

    Data Mining

    • 3792 Words
    • 16 Pages

    Data mining, or knowledge discovery, is the computer-assisted process of digging through and analyzing enormous sets of data and then extracting the meaning of the data. Data mining tools predict behaviors and future trends, allowing businesses to make proactive, knowledge-driven decisions. Data mining tools can answer business questions that traditionally were too time consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.…

  • Satisfactory Essays

    Journal of the American Society for Information Science & Technology; Mar2009, Vol. 60 Issue 3, p443-454, 12p, 6 Charts…

    • 335 Words
    • 2 Pages
  • Satisfactory Essays

    A business partner, Steve, and I are talking about starting a small, brick-and-mortar, nostalgic record store. Steve does not have much experience with information systems or technology. A basic understanding of the different types of information systems available to the business can help him gain experience. To help my partner become familiar with the systems, I will provide information on each one to give Steve insight into what to expect. Various types of information systems are helpful, and many could work, but each one comes with benefits as well as drawbacks.…

    • 391 Words
    • 2 Pages
  • Better Essays

    Data Mining

    • 11085 Words
    • 45 Pages

    Abstract: Steganography is one of the methods used for the hidden exchange of information, and it can be defined as the study of invisible communication that usually deals with ways of hiding the existence of the communicated message. If this is achieved successfully, the message does not attract attention from eavesdroppers and attackers. Using steganography, information can be hidden in different embedding mediums, known as carriers. These carriers can be images, audio files, video files, and text files. The focus in this paper is on the use of an image file as a carrier, and hence the taxonomy of current steganographic techniques for image files is presented. These techniques are analyzed and discussed not only in terms of their ability to hide information in image files but also according to how much information can be hidden and their robustness to different image processing attacks. Keywords: Adaptive Steganography, Current Techniques, Image Files, Overview, Steganography, Taxonomy.…

  • Good Essays

    Week 4. Team Reflection

    • 609 Words
    • 2 Pages

    Team A’s members include a project manager who attained an Associate Degree in Computer Programming, a logistics specialist for Amazon.com, an employee in the Hilton Hotel industry, and a wine educator working in hospitality: all very different fields with varying levels of information systems background. While for some it was a review, Team A can all agree that each and every one of us gained a better understanding of how and why information systems accomplish business objectives. Cheryl knew the degree to which wireless technologies kept users plugged into the World Wide Web. She was aware that smartphones and their many accessories allowed users to access their emails, schedules, and mobile banking, and to participate in e-commerce as well as make online payments; she learned that M-commerce is another growing trend. Due to telemedicine, modern technology has allowed the medical world to provide assistance via videoconferencing. In addition, she learned about setting up and using access points to create meshed networks called a Wide Area Network (WAN) (Rainer & Cegielski, 2011). Xavier learned the relevance of wireless technology in everyday life. More specifically, he learned of the different functions of varying satellite types to communicate information. Kelly learned about the two basic operations of data mining. According to Rainer and Cegielski (2011), data mining functions in “predicting trends and behaviors and identifying previously unknown patterns”…

  • Satisfactory Essays

    analytics to make better security decisions, as well as understand the forces that shape the security…

    • 527 Words
    • 1 Page
  • Good Essays

    Data Mining: Introduction. Lecture Notes for Chapter 1, Introduction to Data Mining, by Tan, Steinbach, Kumar
    © Tan, Steinbach, Kumar, Introduction to Data Mining, 4/18/2004
    Why Mine Data? Commercial Viewpoint
    Lots of data is being collected and warehoused: Web data, e-commerce; purchases at department/grocery stores; bank/credit card transactions…

    • 2304 Words
    • 32 Pages
  • Powerful Essays

    Data Mining

    • 7962 Words
    • 32 Pages

    [10] Gell-Mann M., What is complexity?, Complexity, Vol 1, No 1, pp 16-19, 1995. [11] Shalizi C., Complexity Measures, available on http://cscs.umich.edu/~crshalizi/notebooks/complexity-measures.html, 2003. [12] Crutchfield J. P., Shalizi C., Thermodynamic Depth of Causal States: When Paddling around in Occam’s Pool Shallowness Is a Virtue, Santa Fe Institute Working Paper 98-06-047, 1998. [13] Crutchfield J. P., Complexity: Order contra Chaos, in “Handbook of Metaphysics and Ontology”, Philosophia Verlag, Munich, 1989. [14] Gell-Mann M., Crutchfield J. P., Computation in Physical and Biological Systems Measures of Complexity, available on http://www.santafe.edu/research/measuringComplexity.php, 2004. [15] Calinescu A., Efstathiou J., Sivadasan S., Schirn J., Huaccho Huatuco L., Complexity in Manufacturing: An Information Theoretic Approach, Proceedings of the International Conference on Complex Systems and Complexity in Manufacturing, Warwick, pp 30-44, 2000. [16] Schuster P., How does complexity arise in evolution?, Complexity, Vol 2, pp 22-30, 1996. [17] Teece D. J., Pisano G., Shuen A., Dynamic capabilities and strategic management, Strategic Management Journal, Vol 18, No 7, pp 509-33, 1997. [18] Westhoff F., Yarbrough B., Yarbrough R., Complexity, Organisation and Stuart Kauffman's "The Origins of Order", Journal of Economic Behaviour and Organisation, Vol 29, No 1, pp 1-25, 1996. [19] Kauffman S., Escaping the Red Queen effect, The McKinsey Quarterly, Vol 1, pp 118-129, 1995. [20] Deshmukh A., Talavage J., Barash M., Complexity in manufacturing systems, Part 1: Analysis of static complexity, IIE Transactions, Vol 30, pp 645-655, 1998.…

  • Powerful Essays

    Data Warehouses and Data Marts: A Dynamic View. By Joseph M. Firestone, Ph.D. White Paper No. Three, March 27, 1997…

    • 5149 Words
    • 21 Pages