Normalization of Database Tables
Why are some table structures considered to be bad and others good and how do you recognize the difference between good and bad structures?
From an information management point of view, possibly the most vexing and destructive problems are created through uncontrolled data redundancies. Such redundancies produce update and delete anomalies that create data integrity problems. The loss of data integrity can destroy the usefulness of the data within the database. (If necessary, review Chapter 1, Section 1.4.4, “Data Redundancy”, to make sure that your students understand the terminology and that they appreciate the dangers of data redundancy.)
Table structures are poor whenever they promote uncontrolled data redundancy. For example, the table structure shown in Figure IM5.1 is poor because it stores redundant data. In this example, the AC_MODEL, AC_RENT_CHG, and AC_SEATS attributes are redundant. (For example, note that the hourly rental charge of $58.50 is stored four times, once for each of the four Cessna C-172 Skyhawk aircraft – check records 1, 2, 4, and 9.)
Figure IM5.1 A Poor Table Structure
|[pic] The figures shown in this discussion show the contents of the IM_Discussion database. This database is located on the teacher’s CD. |
If you use the AIRCRAFT_1 table as shown in Figure IM5.1, a change in hourly rental rates for the Cessna 172 Skyhawk must be made four times; if you forget to change just one of those rates, you have a data integrity problem. How much better it would be to have critical data in only one place! Then, if a change must be made, it need be made only once.
In contrast to the poor AIRCRAFT_1 table structure shown in Figure IM5.1, table structures are good when they preclude the possibility of producing uncontrolled data redundancies. You can produce such a happy circumstance by splitting the AIRCRAFT_1 table shown in Figure IM5.1 into the AIRCRAFT and MODEL tables shown in Figures IM5.2 and IM5.3, respectively. To retain access to all of the data originally stored in the AIRCRAFT_1 table, these two tables can be connected through the AIRCRAFT table's foreign key, MOD_CODE.
Figure IM5.2 The Revised AIRCRAFT Table
Figure IM5.3 The MODEL Table
Note that – after the revision -- a rental rate change need be made in only one place and the number of seats for each model is given in only one place. No more data update and delete anomalies -- and no more data integrity problems. The relational diagram in Figure IM5.4 shows how the two tables are related.
Figure IM5.4 The Relational Diagram
What does normalization have to do with creating good tables, and what's the point of having to learn all these picky normalization rules?
Normalization provides an organized way of determining a table's structural status. Better yet, normalization principles and procedures provide a set of simple techniques through which we can achieve the desired and definable structural results.
Without normalization principles and procedures, we lack evaluation standards and must rely on experience (and yes, some intuition) to minimize the probability of generating data integrity problems. The problem with relying on experience is that we usually learn from experience by making errors. While we're learning, who and what will be hurt by the errors we make?
Relying on intuition may work reasonably well for some, but intuitive work habits seldom create design consistency. Worse, you can't teach intuition to those who follow in your database footsteps.
In short, the normalization principles and rules drastically decrease the likelihood of producing bad table structures, they help standardize the process of producing good tables, and they make it possible to transmit skills to the next generation of database designers.