Top-Rated Free Essay
Preview

Database Slides on Normalization

Good Essays
2087 Words
Grammar
Grammar
Plagiarism
Plagiarism
Writing
Writing
Score
Score
Database Slides on Normalization
Chapter 11
Relational Database Design Algorithms and Further Dependencies

Chapter Outline
   

  

0. Designing a Set of Relations 1. Properties of Relational Decompositions 2. Algorithms for Relational Database Schema 3. Multivalued Dependencies and Fourth Normal Form 4. Join Dependencies and Fifth Normal Form 5. Inclusion Dependencies 6. Other Dependencies and Normal Forms

DESIGNING A SET OF RELATIONS


Goals:


Lossless join property (a must)


Algorithm 11.1 tests for general losslessness. Algorithm 11.3 decomposes a relation into BCNF components by sacrificing the dependency preservation. 4NF (based on multi-valued dependencies) 5NF (based on join dependencies)



Dependency preservation property




Additional normal forms
 

1. Properties of Relational Decompositions


Relation Decomposition and Insufficiency of Normal Forms:  Universal Relation Schema:


A relation schema R = {A1, A2, …, An} that includes all the attributes of the database. Every attribute name is unique.



Universal relation assumption:


(Cont)


Decomposition:




Attribute preservation condition:


The process of decomposing the universal relation schema R into a set of relation schemas D = {R1,R2, …, Rm} that will become the relational database schema by using the functional dependencies. Each attribute in R will appear in at least one relation schema Ri in the decomposition so that no attributes are “lost”.

(Cont)




Another goal of decomposition is to have each individual relation Ri in the decomposition D be in BCNF or 3NF. Additional properties of decomposition are needed to prevent from generating spurious tuples

(Cont)


Dependency Preservation Property of a Decomposition:




Definition: Given a set of dependencies F on R, the projection of F on Ri, denoted by pRi(F) where Ri is a subset of R, is the set of dependencies X → Y in F+ such that the attributes in X U Y are all contained in Ri. Hence, the projection of F on each relation schema Ri in the decomposition D is the set of functional dependencies in F+, the closure of F, such that all their left- and right-hand-side attributes are in Ri.

(Cont.)


Dependency Preservation Property of a Decomposition (cont.):


Dependency Preservation Property:




A decomposition D = {R1, R2, ..., Rm} of R is dependency-preserving with respect to F if the union of the projections of F on each Ri in D is equivalent to F; that is ((πR1(F)) U . . . U (πRm(F)))+ = F+ (See examples in Fig 10.12a and Fig 10.11)



Claim 1:


It is always possible to find a dependency-preserving decomposition D with respect to F such that each relation Ri in D is in 3NF.

Projection of F on Ri

Given a set of dependencies F on R, the projection of F on Ri, denoted by πRi(F) where Ri is a subset of R, is the set of dependencies X → Y in F+ such that the attributes in X ∪ Y are all contained in Ri.

Dependency Preservation Condition
Given R(A, B, C, D) and F = { A → B, B → C, C → D} Let D1={R1(A,B), R2(B,C), R3(C,D)} πR1(F)={A → B} πR2(F)={B → C} πR3(F)={C → D} FDs are preserved.

(Cont.)


Lossless (Non-additive) Join Property of a Decomposition:


Definition: Lossless join property: a decomposition D = {R1, R2, ..., Rm} of R has the lossless (nonadditive) join property with respect to the set of dependencies F on R if, for every relation state r of R that satisfies F, the following holds, where * is the natural join of all the relations in D:

* (π R1(r), ..., πRm(r)) = r



Note: The word loss in lossless refers to loss of information, not to loss of tuples. In fact, for “loss of information” a better term is “addition of spurious information”

Example
S s1 s2 s3 P p1 p2 p1 D d1 d2 d3 = S s1 s2 s3 P p1 p2 p1 * P p1 p2 p1 D d1 d2 d3

Lossless Join Decomposition ?? NO

(Cont.)
Lossless (Non-additive) Join Property of a Decomposition (cont.): Algorithm 11.1: Testing for Lossless Join Property Input: A universal relation R, a decomposition D = {R1, R2, ..., Rm} of R,and a set F of functional dependencies. 1. Create an initial matrix S with one row i for each relation Ri in D, and one column j for each attribute Aj in R. 2. Set S(i,j):=bij for all matrix entries. (/* each bij is a distinct symbol associated with indices (i,j) */). 3. For each row i representing relation schema Ri {for each column j representing attribute Aj {if (relation Ri includes attribute Aj) then set S(i,j):= aj;};};  (/* each aj is a distinct symbol associated with index (j) */)  CONTINUED on NEXT SLIDE

(Cont.)
4. Repeat

the following loop until a complete loop execution results in no changes to S {for each functional dependency X →Y in F {for all rows in S which have the same symbols in the columns corresponding to attributes in X {make the symbols in each column that correspond to an attribute in Y be the same in all these rows as follows: If any of the rows has an “a” symbol for the column, set the other rows to that same “a” symbol in the column. If no “a” symbol exists for the attribute in any of the rows, choose one of the “b” symbols that appear in one of the rows for the attribute and set the other rows to that same “b” symbol in the column ;}; }; }; 5. If a row is made up entirely of “a” symbols, then the decomposition has the lossless join property; otherwise it does not.

(Cont.)
Lossless (nonadditive) join test for n-ary decompositions. (a) Case 1: Decomposition of EMP_PROJ into EMP_PROJ1 and EMP_LOCS fails test. (b) A decomposition of EMP_PROJ that has the lossless join property.

(Cont.)
Lossless (nonadditive) join test for n-ary decompositions. (c) Case 2: Decomposition of EMP_PROJ into EMP, PROJECT, and WORKS_ON satisfies test.

(Cont.)


Testing Binary Decompositions for Lossless Join Property
 

Binary Decomposition: Decomposition of a relation R into two relations. PROPERTY LJ1 (lossless join test for binary decompositions): A decomposition D = {R1, R2} of R has the lossless join property with respect to a set of functional dependencies F on R if and only if either
 

The FD ((R1 ∩ R2) → (R1- R2)) is in F+, or The FD ((R1 ∩ R2) → (R2 - R1)) is in F+.

2. Algorithms for Relational Database Schema Design
Algorithm 11.3: Relational Decomposition into BCNF with Lossless (non-additive) join property Input: A universal relation R and a set of functional dependencies F on the attributes of R. 1. Set D := {R}; 2. While there is a relation schema Q in D that is not in BCNF do { choose a relation schema Q in D that is not in BCNF; find a functional dependency X → Y in Q that violates BCNF; replace Q in D by two relation schemas (Q - Y) and (X U Y); }; Assumption: No null values are allowed for the join attributes.

Algorithms for Relational Database Schema Design
Algorithm 11.4 Relational Synthesis into 3NF with Dependency Preservation and Lossless (Non-Additive) Join Property Input: A universal relation R and a set of functional dependencies F on the attributes of R. 1. Find a minimal cover G for F (Use Algorithm 10.2). 2. For each left-hand-side X of a functional dependency that appears in G, create a relation schema in D with attributes {X U {A1} U {A2} ... U {Ak}}, where X →A1, X → A2, ..., X → Ak are the only dependencies in G with X as left-hand-side (X is the key of this relation). 3. If none of the relation schemas in D contains a key of R, then create one more relation schema in D that contains attributes that form a key of R. (Use Algorithm 11.4a to find the key of R) 4. Eliminate redundant relations from the result. A relation R is considered redundant if R is a projection of another relation S

Algorithms for Relational Database Schema Design
Algorithm 11.4a Finding a Key K for R Given a set F of Functional Dependencies
Input: A universal relation R and a set of functional dependencies F on the attributes of R.
1. Set K := R; 2. For each attribute A in K {
Compute (K - A)+ with respect to F; If (K - A)+ contains all the attributes in R, then set K := K - {A};

}

(Cont.)

3. Multivalued Dependencies and Fourth Normal Form
(a) The EMP relation with two MVDs: ENAME —>> PNAME and ENAME —>> DNAME. (b) Decomposing the EMP relation into two 4NF relations EMP_PROJECTS and EMP_DEPENDENTS.

(Cont.)
(c) The relation SUPPLY with no MVDs is in 4NF but not in 5NF if it has the JD(R1, R2, R3). (d) Decomposing the relation SUPPLY into the 5NF relations R1, R2, and R3.

(Cont.)
Definition:


A multivalued dependency (MVD) X —>> Y specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation state r of R: If two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties, where we use Z to denote (R -(X U Y)):


t3[X] = t4[X] = t1[X] = t2[X]. t3[Y] = t1[Y] and t4[Y] = t2[Y]. t3[Z] = t2[Z] and t4[Z] = t1[Z]. An MVD X —>> Y in R is called a trivial MVD if (a) Y is a subset of X, or (b) X U Y = R.







Multivalued Dependencies and Fourth Normal Form Definition:  A relation schema R is in 4NF with respect to a set of dependencies F (that includes functional dependencies and multivalued dependencies) if, for every nontrivial multivalued dependency X —>> Y in F+, X is a superkey for R.  Informally, whenever 2 tuples that have different Y values but same X values, exists, then if these Y values get repeated in separate tuples with every distinct values of Z {Z = R – (X U Y)} that occurs with the same X value.

(Cont.)

(Cont.)

Lossless (Non-additive) Join Decomposition into 4NF Relations:  PROPERTY LJ1’


The relation schemas R1 and R2 form a lossless (non-additive) join decomposition of R with respect to a set F of functional and multivalued dependencies if and only if


(R1 ∩ R2) —>> (R1 - R2) (R1 ∩ R2) —>> (R2 - R1)).



or


(Cont.)
Algorithm 11.5: Relational decomposition into 4NF relations with non-additive join property


Input: A universal relation R and a set of functional and multivalued dependencies F. Set D := { R }; While there is a relation schema Q in D that is not in 4NF do { choose a relation schema Q in D that is not in 4NF; find a nontrivial MVD X —>> Y in Q that violates 4NF; replace Q in D by two relation schemas (Q - Y) and (X U Y); };

1. 2.

4. Join Dependencies and Fifth Normal Form
Definition:  A join dependency (JD), denoted by JD(R1, R2, ..., Rn), specified on relation schema R, specifies a constraint on the states r of R.




The constraint states that every legal state r of R should have a non-additive join decomposition into R1, R2, ..., Rn; that is, for every such r we have * (πR1(r), πR2(r), ..., πRn(r)) = r

(Cont.)

Definition:  A relation schema R is in fifth normal form (5NF) (or Project-Join Normal Form (PJNF)) with respect to a set F of functional, multivalued, and join dependencies if,


for every nontrivial join dependency JD(R1, R2, ..., Rn) in F+ (that is, implied by F),


every Ri is a superkey of R.

Recap
   



Designing a Set of Relations Properties of Relational Decompositions Algorithms for Relational Database Schema Multivalued Dependencies and Fourth Normal Form Join Dependencies and Fifth Normal Form

Tutorial/Quiz 4

Q1) Consider a relation R with 5 attributes ABCDE, You are given the following dependencies: A → B, BC → E, ED → A a) List all the keys, b) Is R in 3 NF c) Is R in BCNF

Q2) Consider the following decomposition for the relation schema R = {A, B, C, D, E, F, G, H, I, J} and the set of functional dependencies F = { {A, B} → {C}, {A} → {D, E}, {B} → {F}, {F} → {G, H}, {D} -> {I, J} }.

Preserves Lossless Join and Dependencies? a) D1 = {R1, R2, R3, R4, R5}, R1={A,B,C} R2={A,D,E}, R3={B,F}, R4 = {F,G,H}, R5 = {D,I,J} b) D2 = {R1, R2, R3} R1 = {A,B,C,D,E} R2 = {B,F,G,H}, R3 = {D,I,J}

You May Also Find These Documents Helpful

  • Good Essays

    Week 1 SOW 1

    • 452 Words
    • 2 Pages

    Building the Physical Database: The data model will be translated into the Relational Database Management system of choice. All Tables and Columns with specific data types will be created and will include Relational and other constraints.…

    • 452 Words
    • 2 Pages
    Good Essays
  • Powerful Essays

    IST223 Crib sheet

    • 3425 Words
    • 7 Pages

    rectangles, and relationships are shown by lines between the rectangles. Attributes are generally listed within the rectangle. The many side of many relationships is represented by a crows footentity-relationship (E-R) modelA set of constructs and conventions used to create data models. The things in the users world are represented by entities, and the associations among those things are represented by relationships. The results are usually documented in an entity-relationship (E-R) diagramID-dependent entityan entity whose identifier includes the identifier of another entityidentifierwhich are attributes that name, or identify, entity instancesidentifying relationshipIn such relationships, the parent is always required, but the child (the ID-dependent entity) may or may not be required, depending on application requirements. Identifying relationships are shown with solid lines in E-R diagrams.is-aRelationships among supertype/subtype entitiesmandatoryat least one entity instance must participate in the relationshipmaximum cardinalityThe maximum cardinality is the maximum number of entity instances that can participate in a relationship instance.minimum cardinalityThe minimum cardinality is the minimum number of entity instances that must participate in a relationship instance.nonidentifying relationshiprelationship drawn with a dashed line (refer to Figure 5-7) is used between strong entities and is called a nonidentifying relationship because there are no ID-dependent entities in the relationship.null valueare a problem because they are ambiguous. They can mean that a value is inappropriate, unknown, or known, but not yet been entered into the databaseparentAn entity or row on the one side of a one-to-many relationshiprecursive relationshipoccurs when an entity type has a relationship to itself.relationship classAssociations among entity classesrelationship instanceassociations among entity instances.strong entityan entity that represents something that can exist…

    • 3425 Words
    • 7 Pages
    Powerful Essays
  • Satisfactory Essays

    cis3730_Exam1_Studyguide

    • 512 Words
    • 2 Pages

    Given a table or a set of tables, be able to specify their primary keys and foreign keys.…

    • 512 Words
    • 2 Pages
    Satisfactory Essays
  • Satisfactory Essays

    This memo is to inform you that your service request SR-ta-001 has reached my desk and currently is in the process of being completed. The purpose of this memo is to report the progress on the service requests. Currently I am in the stage of normalizing the database to the third normal form. By normalizing the database to the third level of normalization I will ensure that the data is efficiently organized within the database by removing duplicate and redundant data from the database relations. Normalizing the data to the normal third will help free up space and make it easier for the users to retrieve the data by elimination of the same data stored in more than one relation. In addition, normalizing to this level will also ensure data integrity, scalability, and storage efficiency. I have made sure that that there are no repeating columns or rows within the same tables and each column has only one value and one data type therefore the level of 1 normalization has been completed. I then continued to normalize the database to the second level of normalization by ensuring that each nonkey attribute is fully dependent on the key columns of the table. Once level one and two normal form had been completed I continued by establishing primary and foreign keys along with relationships. I will keep in contact with you as the project nears completion.…

    • 329 Words
    • 2 Pages
    Satisfactory Essays
  • Satisfactory Essays

    week 4 Discussion 1

    • 447 Words
    • 2 Pages

    Optimal database design recognizes proper organization of table structures and relationships. Suggest at least two methods that can be used to improve the design of a database system.…

    • 447 Words
    • 2 Pages
    Satisfactory Essays
  • Satisfactory Essays

    The design is now in 2NF and also in 3NF. For the course assignment, you may stop here. If you wish to bring the design to 4NF, split the relation with the multidependencies:…

    • 268 Words
    • 2 Pages
    Satisfactory Essays
  • Good Essays

    Cis 111

    • 790 Words
    • 4 Pages

    “The benefits of using the relational model would result in an easy to use, flexible, high performance database at a low cost.” (Anita, 2010) Therefore, allowing them to participate in the entire design process of the logical construction. The relational model provides an easy and flexible way of manipulating data with a system that offers:…

    • 790 Words
    • 4 Pages
    Good Essays
  • Powerful Essays

    Chapter 7 8 9 10

    • 2568 Words
    • 8 Pages

    In logical data base design, the E-R diagram is converted to the sets of tables and relationships among the tables.…

    • 2568 Words
    • 8 Pages
    Powerful Essays
  • Good Essays

    2. For your E/R diagram of exercise 1, (i) Select and specify keys, and (ii) Indicate appropriate referential integrity constraints.…

    • 1441 Words
    • 6 Pages
    Good Essays
  • Powerful Essays

    Logical Design

    • 1162 Words
    • 5 Pages

    The relational database model specified for the ACME Global Consulting organizes given data in table series. Each of the table has fields specified with specifications regarding the data type stored (Easytoteach, 2011). The advantages of the relational model includes…

    • 1162 Words
    • 5 Pages
    Powerful Essays
  • Good Essays

    Relational Databases

    • 750 Words
    • 3 Pages

    Data for our programs are stored in various formats. The most commonly used format is in the form of a database since databases are not dependent on a single program or hardware platform. These databases are stored in five structures of hierarchical, network, relational, multidimensional, and object-oriented. Hierarchical structure organizes data in a one to many relationship. The network goes of the hierarchical structure by organizing data in a many to many relationship. The relational structure organizes the data in tables and allows the end user to determine the relationship at run time. Multidimensional takes the relational structures concept and uses multi dimensions to organize the data. Object-oriented structure has not only the attributes of the entity, but also contain the operations that can be performed on it.…

    • 750 Words
    • 3 Pages
    Good Essays
  • Satisfactory Essays

    My Team Portion

    • 436 Words
    • 2 Pages

    The most important step when creating a database are detecting the table elements. While analyzing the form entitled “Entities and Attributes for Fleet Truck Maintenance,” the database table elements can be clearly determined. The structure of a database can be defined based on the Entities and Attributes within a table or a relation. Entities may be used to allow the same rule to be applied multiple times to make a determination ("Define an Entity", n.d.). An attribute basically describes what the entity is. The entity-relationship model (or ER model) is a way of graphically representing the logical relationships of entities (or objects) in order to create a database. The ER model was first proposed by Peter Pin-Shan Chen of Massachusetts Institute of Technology (MIT) in the 1970s (TechTarget, 2005-2015).…

    • 436 Words
    • 2 Pages
    Satisfactory Essays
  • Good Essays

    Database concepts

    • 586 Words
    • 3 Pages

    In database normalization, one of the important features of third normal form is that it excludes certain types of transitive dependencies. E.F. Codd, the inventor of the relational model, introduced the concepts of…

    • 586 Words
    • 3 Pages
    Good Essays
  • Satisfactory Essays

    Studies Matters

    • 609 Words
    • 3 Pages

    The attribute B is _________ the attribute A if each value in column A determines one and only one value in column B.…

    • 609 Words
    • 3 Pages
    Satisfactory Essays
  • Powerful Essays

    crypyography

    • 2299 Words
    • 10 Pages

    Each value in a tuple is an atomic value; that is, it is not divisible into components within the framework of the basic relational model. Hence, composite and multivalued attributes are not allowed. This model is sometimes called the flat relational model. Much of the theory behind the relational model was developed with this assumption in mind, which is called the first normal form assumption. Hence, multivalued attributes must be represented by separate relations, and composite attributes are represented only by their simple component attributes in the basic relational model.…

    • 2299 Words
    • 10 Pages
    Powerful Essays