starER: A Conceptual Model for Data Warehouse Design

  • Topic: Data warehouse, Conceptual schema, Star schema
  • Published : February 28, 2013
Nectaria Tryfona, Frank Busborg, and Jens G. Borch Christiansen
Department of Computer Science, Aalborg University, Fredrik Bajersvej 7E, DK-9220, Aalborg Øst, Denmark
{tryfona, dux, jbc}

Abstract. Modeling data warehouses is a complex task that very often focuses on internal structures and implementation issues. In this paper we argue that, in order to accurately reflect the users' requirements in an error-free, understandable, and easily extendable data warehouse schema, special attention should be paid to the conceptual modeling phase. Based on a real mortgage business warehouse environment, we present a set of user modeling requirements and discuss the concepts involved. Understanding the semantics of these concepts allows us to build a conceptual model, namely the starER model, for their efficient handling. More specifically, the starER model combines the star structure, which is dominant in data warehouses, with the semantically rich constructs of the ER model; special types of relationships have also been added to support hierarchies. We present an evaluation of the starER model as well as a comparison of the proposed model with other existing models, pointing out differences and similarities. Examples from a mortgage data warehouse environment, in which starER is tested, reveal the ease of understanding of the model, as well as its efficiency in representing complex information at the semantic level.

Keywords: data warehouse, conceptual modeling, star structure, ER model.

1 Introduction
A data warehouse is a collection of consistent, subject-oriented, integrated, time-variant, non-volatile data and processes on them, which are based on available information and enable people to make decisions and predictions about the future [7]. Over the last years, data warehouses have enjoyed a lot of attention from both the industrial and the research community. The reason lies in their great importance: making predictions about the (near) future has always been desirable for business companies. Data warehouse design has hitherto focused on the physical data organization (i.e., the "internal" structure), and quite understandably so, because of the volume and the complexity of the data. Following the logical structure of data, as described in a data warehouse, several schemas have been developed that emphasize the star-oriented approach, in which data unfolds around facts occurring in businesses. The star [1], the starflake [12], and the snowflake schema [8] are widely used for this purpose. Although all of these schemas provide some level of modeling abstraction that is understandable to the user, they are not built with his/her needs in mind.
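To make the star-oriented idea concrete, the following is a minimal sketch of data "unfolding around facts", not a structure taken from the schemas cited above. It assumes a hypothetical mortgage example: the names Customer, Month, LoanPayment, and total_by are all illustrative assumptions. A fact record carries a numeric measure and one key per dimension, and an analysis groups the measure by a dimension attribute.

```python
from collections import defaultdict
from dataclasses import dataclass


# Dimensions: descriptive context for facts (hypothetical names).
@dataclass(frozen=True)
class Customer:
    customer_id: int
    region: str


@dataclass(frozen=True)
class Month:
    month_id: int
    year: int
    month: int


# Fact: one numeric measure plus one key per dimension (hypothetical).
@dataclass(frozen=True)
class LoanPayment:
    customer_id: int
    month_id: int
    amount: float


def total_by(facts, fact_key, dimension, dim_key, attribute):
    """Sum the 'amount' measure grouped by one attribute of one dimension."""
    # Map each dimension key to the attribute value we group by.
    lookup = {getattr(d, dim_key): getattr(d, attribute) for d in dimension}
    totals = defaultdict(float)
    for fact in facts:
        totals[lookup[getattr(fact, fact_key)]] += fact.amount
    return dict(totals)


customers = [Customer(1, "North"), Customer(2, "South")]
months = [Month(1, 1999, 1), Month(2, 1999, 2)]
facts = [
    LoanPayment(customer_id=1, month_id=1, amount=100.0),
    LoanPayment(customer_id=2, month_id=1, amount=50.0),
    LoanPayment(customer_id=1, month_id=2, amount=25.0),
]

by_region = total_by(facts, "customer_id", customers, "customer_id", "region")
by_year = total_by(facts, "month_id", months, "month_id", "year")
```

Note that the sketch already speaks in terms of keys and records, that is, in implementation terms rather than in terms of the user's concepts.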

Our position is that data warehouse modeling, exactly as database modeling has been for many years now, should be raised to a higher level of design that is understandable to the user, independent of implementation issues, and free of computer metaphors such as "table" or "field". The result of this process will be a schema that is formal and complete, so that it can be transformed into the next, logical schema without ambiguities. This is the conceptual or semantic modeling phase, and the benefits of its use are widely acknowledged: communication between the designer and the user, early detection of modeling errors, and easily extendable schemas are among them. The conceptual modeling phase is part of a design methodology that is classical in the database area and has already been proposed [6] for the data warehouse area; it follows the user requirements analysis and specifications phase and is followed by the logical design, which focuses on workload refinement and schema validation.

In this paper we first address the modeling requirements of a data warehouse from the user's point of view. For this purpose, we use a real mortgage business environment. The understanding of the requirements reveals a set of concepts that need to be...