Knowledge-Based Visualization to Support Spatial Data Mining

Gennady Andrienko and Natalia Andrienko
GMD - German National Research Center for Information Technology
Schloss Birlinghoven, Sankt-Augustin, D-53754 Germany
gennady.andrienko@gmd.de
http://allanon.gmd.de/and/

Abstract. Data mining methods are designed to reveal significant relationships and regularities in data collections. For spatially referenced data, analysis by means of data mining can be aptly complemented by visual exploration of the data presented on maps, as well as by cartographic visualization of the results of data mining procedures. We propose an integrated environment for exploratory analysis of spatial data that equips an analyst with a variety of data mining tools and provides automated mapping of source data and data mining results. The environment is built on two existing systems: Kepler for data mining and Descartes for automated knowledge-based visualization. Importantly, the open architecture of Kepler allows new data mining tools to be incorporated, and the knowledge-based architecture of Descartes allows appropriate presentation methods to be selected automatically according to the characteristics of the data mining results. The paper presents example scenarios of data analysis and describes the architecture of the integrated system.

1 Introduction

The notion of Knowledge Discovery in Databases (KDD) denotes the task of revealing significant relationships and regularities in data by means of algorithms collectively called "data mining". The KDD process is an iterative execution of the following steps [6]:
1. Data selection and preprocessing, such as checking for errors, removing outliers, handling missing values, and transformation of formats.
2. Data transformations, for example, discretization of variables or production of derived variables.
3. Selection of a data mining method and adjustment of its parameters.
4. Data mining, i.e.
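
As an illustration only, steps 1 and 2 of the process listed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method used in Kepler or Descartes: the function names, the median fill for missing values, the interquartile-range rule for outliers, the equal-width binning, and the sample values are all hypothetical choices made for the example.

```python
from statistics import median, quantiles


def fill_missing(values):
    """Step 1 (assumed strategy): replace missing entries (None) with the median."""
    known = [v for v in values if v is not None]
    mid = median(known)
    return [mid if v is None else v for v in values]


def drop_outliers(values, k=1.5):
    """Step 1 (assumed strategy): drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]


def discretize(values, n_bins=3):
    """Step 2 (assumed strategy): equal-width bins, returned as indices 0..n_bins-1."""
    low, high = min(values), max(values)
    # Fall back to a width of 1.0 when all values are equal.
    width = ((high - low) / n_bins) or 1.0
    return [min(int((v - low) / width), n_bins - 1) for v in values]


# Hypothetical attribute values for eight spatial units; None marks a missing value.
raw = [12.0, None, 15.0, 14.0, 13.0, 250.0, 16.0, 11.0]
filled = fill_missing(raw)        # the None entry becomes the median of the known values
cleaned = drop_outliers(filled)   # the extreme value 250.0 is removed
classes = discretize(cleaned)     # each remaining value is mapped to one of three classes
print(classes)
```

In a real analysis the choice of each strategy would depend on the data and on the data mining method selected in step 3; the sketch only makes the order of the steps concrete.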



References:
1. Andrienko, G., and Andrienko, N.: Intelligent Visualization and Dynamic Manipulation: Two Complementary Instruments to Support Data Exploration with GIS. In: Proceedings of AVI'98: Advanced Visual Interfaces Int. Working Conference (L'Aquila, Italy, May 24-27, 1998). ACM Press (1998) 66-75
2. Brodley, C.: Addressing the Selective Superiority Problem: Automatic Algorithm / Model Class Selection. In: Machine Learning: Proceedings of the 10th International Conference, University of Massachusetts, Amherst, June 27-29, 1993. San Mateo, Calif.: Morgan Kaufmann (1993) 17-24
3. Cook, D., Symanzik, J., Majure, J.J., and Cressie, N.: Dynamic Graphics in a GIS: More Examples Using Linked Software. Computers and Geosciences, 23 (1997) 371-385
4. Gama, J., and Brazdil, P.: Characterization of Classification Algorithms. In: Progress in Artificial Intelligence, Lecture Notes in Artificial Intelligence, Vol. 990. Springer-Verlag: Berlin (1995) 189-200
5. Gebhardt, F.: Finding Spatial Clusters. In: Principles of Data Mining and Knowledge Discovery PKDD97, Lecture Notes in Computer Science, Vol. 1263. Springer-Verlag: Berlin (1997) 277-287
6. Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P.: The KDD Process for Extracting Useful Knowledge from Volumes of Data. Communications of the ACM, 39 (1996) 27-34
7. John, G.H.: Enhancements to the Data Mining Process. PhD dissertation, Stanford University. Available at the URL http://robotics.stanford.edu/~gjohn/ (1997)
8. Kodratoff, Y.: From the art of KDD to the science of KDD. Research report 1096, Universite de Paris-sud (1997)
9. Koperski, K., Han, J., and Stefanovic, N.: An Efficient Two-Step Method for Classification of Spatial Data. In: Proceedings SDH98, Vancouver, Canada: International Geographical Union (1998) 45-54
10. MacDougall, E.B.: Exploratory Analysis, Dynamic Statistical Visualization, and Geographic Information Systems. Cartography and Geographic Information Systems, 19 (1992) 237-246
11. Wrobel, S., Wettschereck, D., Sommer, E., and Emde, W.: Extensibility in Data Mining Systems. In: Proceedings of KDD96, 2nd International Conference on Knowledge Discovery and Data Mining. AAAI Press (1996) 214-219
