Normalization First Normal Form (1st NF)
• • • •
The table cells must be of single value. Eliminate repeating groups in individual tables. Create a separate table for each set of related data. Identify each set of related data with a primary key.
Do not use multiple fields in a single table to store similar data. For example, to track an inventory item that may come from two possible sources, an inventory record may contain fields for Vendor Code 1 and Vendor Code 2. But what happens when you a dd a third vendor? Adding a field is not the answer; it requires program and table modifications and does not smoothly accommodate a dynamic number of vendors. Instead, place all vendor information in a separate table called Vendors, then link inventory t o vendors with an item number key, or vendors to inventory with a vendor code key.
Second Normal Form (2nd NF)
If it's in 1st NF and if the Primary Key is composite (multiple columns) then any fact in that table must be a fact about the entire composite Primary Key not just part of the Primary Key. For example, if an inventory table has a primary key made up of two attributes PartId and WarehouseId. Suppose the inventory table has the warehouse address in it, since warehouse address is a fact about WarehoseId and not about the PartId the w arehouse address is in the wrong table. This is in violation of the 2nd Normal Form.
Third Normal Form (3rd NF)
If it's in the 2nd NF and there are no non-key fields that depend on attributes in the table other than the Primary Key.
Suppose in the Student table you had student birth date as an attribute and you also had student's age. Students age depends on the student's birth date (a fact about his/her birth date) so 3rd Normal Form is violated. Also, a student table that contains The source of this handout is http://phoenix.liu.edu/~vasilaky/webdatabase/Topics/Normalization.html
CS 649 Database Management Systems the address of the Department of his/her major. That address is in the wrong table, because that address should be in the Department table. That address is not a fact about the Student Primary Key. A violation of 3rd Normal Form. A (non-key) fact in a table should be about the key, the whole key, and nothing but the key.
EXCEPTION: Adhering to the third normal form, while theoretically desirable, is not always practical. If you have a Customers table and you want to eliminate all possible interfield dependencies, you must create separate tables for cities, ZIP code s, sales representatives, and any other factor that may be duplicated in multiple records. In theory, normalization is worth pursing; however, many small tables may degrade performance or exceed open file and memory capacities. But the worst thing is to have to join too many tables in queries. Joining tables is the most expensive operation in time and memory cost. It may be more feasible to apply third normal form only to data that changes frequently. If some dependent fields remain, design your application to require the user to verify all related fields when any one is changed. Normalization Examples: UN-normalized table: Students Student# Advisor Adv-Room Class1 Class2 Class3 ------------------------------------------------------1022 Jones H412 101-2 112-01 155-01
First Normal Form: NO REPEATING GROUPS Tables should have only two dimensions. Since one student has several classes, these classes should be listed in a separate table. Fields Class1, Class2, & Class3 in the above record are indications of design trouble. The next table gets rid of repeating group, class but it is now in violation of the 2nd NF ( Normal Form) since Student# is no longer the Primary Key (it now repeats in the table) but Stdent# and Class# can be the Primary Key. But now Advisor is a fact about the Stdent# and not a fact about Class#, a violation of
The source of this handout is...