Overview of the Data Mining

8497 Words
Order Code RL31798

CRS Report for Congress
Received through the CRS Web

Data Mining: An Overview

Updated December 16, 2004

Jeffrey W. Seifert
Analyst in Information Science and Technology Policy
Resources, Science, and Industry Division

Congressional Research Service · The Library of Congress

Summary
Data mining is emerging as one of the key features of many homeland security initiatives. Often used as a means for detecting fraud, assessing risk, and product retailing, data mining involves the use of data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. In the context of homeland security, data mining is often viewed as a potential means to identify terrorist activities, such as money transfers and communications, and to identify and track individual terrorists themselves, such as through travel and immigration records.

While data mining represents a significant advance in the type of analytical tools currently available, there are limitations to its capability. One limitation is that although data mining can help reveal patterns and relationships, it does not tell the user the value or significance of these patterns. These types of determinations must be made by the user. A second limitation is that while data mining can identify connections between behaviors and/or variables, it does not necessarily identify a causal relationship. To be successful, data mining still requires skilled technical and analytical specialists who can structure the analysis and interpret the output that is created.

Data mining is becoming increasingly common in both the private and public sectors. Industries such as banking, insurance, medicine, and retailing commonly use data mining to reduce costs, enhance research, and increase sales. In the public sector, data mining applications initially were used as a means to detect fraud and waste, but have grown to also be used for purposes such as

You May Also Find These Documents Helpful

  • Good Essays

    In an effort to maintain the security of our nation, the Department of Homeland Security has developed a system called the National Terrorism Advisory System that releases security threat updates that are easily accessible to other departments, private organizations, and even the public. However, before the National Terrorism Advisory System, or NTAS, there was the Homeland Security Advisory System, or HSAS, which was a color-coded advisory system that mapped the threat level to the colors green, blue, yellow, orange, and red. In this paper, the author will further explain the two systems as well as explain the differences in the systems and why there was a change. The author will begin by discussing the Homeland Security Advisory System.…

    • 924 Words
    • 4 Pages
    Good Essays
  • Powerful Essays

    This module will examine the importance of criminal data and its effect on the criminal justice system. For instance, is it important for a law enforcement agency to evaluate the crimes occurring in their city or jurisdiction? Is it important for citizens to know how safe is the area in which they live? If so, how is that information gathered and disseminated to the general public? How does the law enforcement component of the criminal justice system use the information to reduce crime or even predict it in the future? With the advent of applicable technology, law enforcement agencies and criminologists are now examining crime patterns, suspect information, as well as date and time of crimes in an effort to predict probable occurrences and locations of future crimes.…

    • 1518 Words
    • 7 Pages
    Powerful Essays
  • Good Essays

    Data Mining Solutions

    • 1720 Words
    • 7 Pages

    Question 1: Assume a base cuboid of 10 dimensions contains only three base cells: (1) (a1, b2, c3, d4, ..., d9, d10), (2) (a1, c2, b3, d4, ..., d9, d10), and (3) (b1, c2, b3, d4, ..., d9, d10), where a_i != b_i, b_i != c_i, etc. The measure of the cube is count. 1. How many nonempty cuboids will a full data cube contain? Answer: 2^10 = 1024. 2. How many nonempty aggregate (i.e., non-base) cells will a full cube contain? Answer: There will be 3 ∗ 2^10 − 6 ∗ 2^7 − 3 = 2301 nonempty aggregate cells in the full cube. The number of cells overlapping twice is 2^7, while the number of cells overlapping once is 4 ∗ 2^7. So the final calculation is 3 ∗ 2^10 − 2 ∗ 2^7 − 1 ∗ 4 ∗ 2^7 − 3, which yields the result. 3. How many nonempty aggregate cells will an iceberg cube contain if the condition of the iceberg cube is "count >= 2"? Answer: There are in total 5 ∗ 2^7 = 640 nonempty aggregate cells in the iceberg cube. To calculate the result: fix the first three dimensions as (*, *, *), (a1, *, *), (*, c2, *), (*, *, b3), or (*, c2, b3), and vary the remaining seven. 4. How many closed cells are in the full cube? Answer: There are 6 closed cells in the full cube: the 3 base cells; (a1, *, *, d4, ..., d10) and (*, c2, b3, d4, ..., d10), each with count 2; and (*, *, *, d4, ..., d10) with count 3. Question 2: (Half-open questions; make sure your algorithm and assumptions are correct, no need to be very specific.) Suppose a base cuboid has the following tuples:…

    Good Essays
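
The four answers to Question 1 in the excerpt above can be checked by brute force. The following is a minimal Python sketch, not part of the original solution: it enumerates every generalization of the three base cells and counts how many base cells each one aggregates. The names `BASE_CELLS`, `generalizations`, `covers`, and `closure` are hypothetical helpers introduced only for this illustration, and the labels `d4` to `d10` stand in for the values the base cells share in dimensions 4 through 10. Closedness is tested with the standard criterion that a cell is closed when it already fixes every value shared by all the base cells it covers.

```python
from itertools import product

# Dimensions 4..10 hold the same value in all three base cells; the labels
# d4..d10 are placeholders for those shared values.
SHARED = tuple(f"d{i}" for i in range(4, 11))
BASE_CELLS = [
    ("a1", "b2", "c3") + SHARED,
    ("a1", "c2", "b3") + SHARED,
    ("b1", "c2", "b3") + SHARED,
]


def generalizations(cell):
    """Yield all 2^n cells obtained by replacing any subset of values with '*'."""
    for mask in product((False, True), repeat=len(cell)):
        yield tuple("*" if star else value for value, star in zip(cell, mask))


def covers(cell, base):
    """True if the (possibly aggregate) cell matches the base cell."""
    return all(c == "*" or c == b for c, b in zip(cell, base))


def closure(cell, base_cells):
    """Most specific cell covering exactly the base cells that `cell` covers."""
    covered = [b for b in base_cells if covers(cell, b)]
    return tuple(vals[0] if len(set(vals)) == 1 else "*" for vals in zip(*covered))


# count[cell] = number of base cells aggregated into that cell
count = {}
for base in BASE_CELLS:
    for cell in generalizations(base):
        count[cell] = count.get(cell, 0) + 1

base_set = set(BASE_CELLS)
cuboids = {tuple(v == "*" for v in cell) for cell in count}
aggregate = {cell: n for cell, n in count.items() if cell not in base_set}
iceberg = [cell for cell, n in aggregate.items() if n >= 2]
closed = [cell for cell in count if closure(cell, BASE_CELLS) == cell]

print(len(cuboids))    # 1024 nonempty cuboids
print(len(aggregate))  # 2301 nonempty aggregate cells
print(len(iceberg))    # 640 aggregate cells with count >= 2
print(len(closed))     # 6 closed cells
```

Enumeration is feasible here only because there are three base cells and ten dimensions (at most 3 ∗ 2^10 cells); for realistic cubes the inclusion-exclusion argument in the excerpt is the practical route.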
  • Good Essays

    The creation of a national database applied by the Homeland Security Department would permit the states to communicate and distribute intelligence collected on different terrorists and criminal behaviors. Following the catastrophic events of September 11, 2001, just about every state implemented fusion centers to share intelligence on terrorist threats; however, a sufficiently capable database has not been implemented to communicate the intelligence. The fusion centers can gain data and share the intelligence with the Department of Homeland Security, but they cannot communicate with any other centers countrywide. Thus, if one state such as California gains intelligence on a potential terrorist group living nearby, the intelligence is transmitted to the Department of Homeland Security, but neighboring states are uninformed about the activity. The drawback with executing the system to synchronize information countrywide is that the fusion centers would have to gain funding from the federal government.…

    • 561 Words
    • 3 Pages
    Good Essays
  • Good Essays

    Contraband In Prisons

    • 977 Words
    • 4 Pages

    This research seeks to understand what intelligence models or combination of models would efficiently work to effectively detect, deter, and prevent terrorist activities and organized criminal activity within the U.S. prison system. Each of the listed specialized units has an area of intelligence they cover. This could be from local gangs to terrorist organizations. In addition to gathering intelligence, the ability to properly analyze and share this information should be at the forefront of every…

    Good Essays
  • Good Essays

    CaseEF GroupC2 Team10

    • 1421 Words
    • 4 Pages

    Although the Tucson data-mining project may inappropriately violate the privacy of Internet users, it is an acceptable tradeoff to more intelligently combat terrorism because it is, so far, one of the best ways.…

    Good Essays
  • Good Essays

    Week 4. Team Reflection

    • 609 Words
    • 2 Pages

    Team A’s members range from a project manager who attained their Associate Degree in Computer Programming, a logistics specialist for Amazon.com, an employee in the Hilton Hotel industry, and a wine educator working in hospitality—all very different fields with varying levels of information systems background. While for some it was a review, Team A can all agree that each and every one of us gained a better understanding of how and why information systems accomplish business objectives. Cheryl knew the degree to which wireless technologies kept users plugged into the World Wide Web. She was aware that smartphones and their many accessories allowed users to access their emails, schedules, and mobile banking and to participate in e-commerce as well as make online payments—she learned that M-commerce is another growing trend. Due to telemedicine, modern technology has allowed the medical world to provide assistance via videoconferencing. In addition, she learned about setting up and using access points to create meshed networks called a Wide Area Network (WAN) (Rainer & Cegielski, 2011). Xavier learned the relevance of wireless technology in everyday life. More specifically, he learned of the different functions of varying satellite types to communicate information. Kelly learned about the two basic operations of data mining. According to Rainer and Cegielski (2011), data mining functions in “predicting trends and behaviors and identifying previously unknown patterns”…

    Good Essays
  • Good Essays

    1984 - Reflection Paper

    • 694 Words
    • 3 Pages

    Since Orwell's book 1984 was written in 1948, we have developed methods to produce more advanced and less costly computer technology. Value Added Networks continue to rise in popularity. Data warehousing (information availability) and data mining (information analysis) have become hot topics in today's world. Personal data that has always been available, but not easily accessible, is now computerized and merged with larger databases. These databases are linked to form massive data repositories. This practice is not limited to the private sector; government databases such as the Department of Motor Vehicles and criminal records are accessible to those willing to pay for access. The ability to desegregate personal information and profile individuals is easier than ever.…

    Good Essays
  • Satisfactory Essays

    Journal of the American Society for Information Science & Technology; Mar2009, Vol. 60 Issue 3, p443-454, 12p, 6 Charts…

    • 335 Words
    • 2 Pages
    Satisfactory Essays
  • Powerful Essays

    Data Warehouses and Data Marts: A Dynamic View By Joseph M. Firestone, Ph.D. White Paper No. Three March 27, 1997…

    • 5149 Words
    • 21 Pages
    Powerful Essays
  • Satisfactory Essays

    analytics to make better security decisions, as well as understand the forces that shape the security…

    • 527 Words
    • 1 Page
    Satisfactory Essays
  • Powerful Essays

    Data Mining

    • 7962 Words
    • 32 Pages

    [10] Gell-Mann M., What is complexity?, Complexity, Vol 1, No 1, pp 16-19, 1995. [11] Shalizi C., Complexity Measures, available on http://cscs.umich.edu/~crshalizi/notebooks/complexity-measures.html, 2003. [12] Crutchfield J. P., Shalizi C., Thermodynamic Depth of Causal States: When Paddling around in Occam’s Pool Shallowness Is a Virtue, Santa Fe Institute Working Paper 98-06-047, 1998. [13] Crutchfield J. P., Complexity: Order contra Chaos, in “Handbook of Metaphysics and Ontology”, Philosophia Verlag, Munich, 1989. [14] Gell-Mann M., Crutchfield J. P., Computation in Physical and Biological Systems Measures of Complexity, available on http://www.santafe.edu/research/measuringComplexity.php, 2004. [15] Calinescu A., Efstathiou J., Sivadasan S., Schirn J., Huaccho Huatuco L., Complexity in Manufacturing: An Information Theoretic Approach, Proceedings of the International Conference on Complex Systems and Complexity in Manufacturing, Warwick, pp 30-44, 2000. [16] Schuster P., How does complexity arise in evolution?, Complexity, Vol 2, pp 22-30, 1996. [17] Teece, D.J., Pisano, G. and Shuen, A., Dynamic capabilities and strategic management, Strategic Management Journal, Vol. 18 No. 7, pp. 509-33, 1997. [18] Westhoff F., Yarbrough B., Yarbrough R., Complexity, Organisation and Stuart Kauffman's "The Origins of Order", Journal of Economic Behaviour and Organisation, Vol 29, No 1, pp 1-25, 1996. [19] Kauffman S., Escaping the Red Queen effect, The McKinsey Quarterly, Vol 1, pp 118-129, 1995. [20] Deshmukh A., Talavage J., Barash M., Complexity in manufacturing systems, Part 1: Analysis of static complexity, IIE Transactions, Vol 30, pp 645-655, 1998.…

    Powerful Essays
  • Good Essays

    Data Mining: Introduction. Lecture Notes for Chapter 1, Introduction to Data Mining by Tan, Steinbach, Kumar (© Tan, Steinbach, Kumar, 4/18/2004). Why Mine Data? Commercial Viewpoint: Lots of data is being collected and warehoused: Web data, e-commerce; purchases at department/grocery stores; bank/credit card transactions…

    • 2304 Words
    • 32 Pages
    Good Essays
  • Satisfactory Essays

    A business partner, Steve, and I are talking about starting a small, brick and mortar, nostalgic record store. Steve does not have much experience with information systems or technology. A basic understanding of the different types of information systems available for the business to use can be helpful in him gaining experience. To help my partner become familiar with the systems, providing him with information of each system will provide Steve with insight on what to expect. Various types of information systems are helpful, and many could work, but each one comes with benefits as well as drawbacks.…

    • 391 Words
    • 2 Pages
    Satisfactory Essays
  • Better Essays

    Data Mining

    • 11085 Words
    • 45 Pages

    Abstract: Steganography is one of the methods used for the hidden exchange of information, and it can be defined as the study of invisible communication that usually deals with the ways of hiding the existence of the communicated message. In this way, if it is successfully achieved, the message does not attract attention from eavesdroppers and attackers. Using steganography, information can be hidden in different embedding mediums, known as carriers. These carriers can be images, audio files, video files, and text files. The focus in this paper is on the use of an image file as a carrier, and hence, the taxonomy of current steganographic techniques for image files has been presented. These techniques are analyzed and discussed not only in terms of their ability to hide information in image files but also according to how much information can be hidden, and the robustness to different image processing attacks. Keywords: Adaptive Steganography, Current Techniques, Image Files, Overview, Steganography, Taxonomy.…

    Better Essays