Overview

As we examine the E/R diagram, we see that we have diagrammed the enterprise's entities, attributes and relationships.  We now need to convert the E/R diagram into a new model that will be able to be used to create the actual database.

The model that we will use is that of Normalized Tables.  We will create these normalized tables through a process called Normalization.  A normalized table is one that has particular properties that we will examine in more detail below.

The logical model of normalized tables is based upon the Relational Data Model.

The Relational Data Model

This database model was created by E. F. Codd in 1970.  From this research the Relational Database Model was created and new commercial products (called RDBMS - Relational Database Management Systems) were created.

The Relational Database Model uses the 2-dimensional table as the data structure.  Tables in the database contain both rows - called tuples, and columns - called attributes.  Each cell of a column is equivalent to a field, and each row is equivalent to a record.  The whole database table called a relation is equivalent to a file.

Within each row will be a primary key attribute.  This Primary key may be an individual attribute or combination of 2 or more attributes together (called a composite primary key).  This primary key uniquely identifies each row in the relation (table).  The primary keys are the identifiers from the E/R diagram.

Relations (tables) have certain properties. 

1.  Each relation in a particular database has a unique name.

2.  An entry at the inersection of each row and column is atomic (you can have only one value in the attribute - NO multivalues are allowed)

3.  Each row is unique (no two rows are identical in the relation) - this is done with the primary key

4.  Each attribute (or column) within a table has a unique name

5.  The sequence of columns (left to right) is insignificant - in other words no sorting for the stored data

6.  The sequence of rows (top to bottom) is insignificant - again no sorting or the rows

Relationships in the Database

Relations (tables) in the database are related through two types of keys - the primary key (mentioned above) and another type of attribute called the foreign key.  A foreign key is a non-primary key attribute that exists in one relation, but exists as a primary key in another relation.

Example:    CUSTOMER    (Cust_ID, Cust_lastname,Cust_firstname)

                  SHIPMENT   (Shipment_ID, Cust_ID, Vendor_ID, Ship_date)

                  VENDOR  (Vendor_ID, Vendor_street,Vendor_city,Vendor_state)

In the example you can see that the Cust_ID attribute is a primary key in the CUSTOMER table, and it is a non-primary key in the SHIPMENT table.  You can also see that the Vendor_ID is a primary key in the VENDOR table and a non-primary key attribute in the SHIPMENT table.  BOTH the Cust_ID and the Vendor_ID attributes in the SHIPMENT table are called foreign key attributes.

Thus - all three relations (tables) in the example above are RELATED to each other through the primary and foreign keys.

Integrity Constraints of Relations

There are several "constraints" or business rules whose purpose is to facilitate maintaining accuracy and integrity of the data in the database.

Domain Constraints - all of the values that appear in a column of a relation must be taken from the same domain (set of values that may be assigned to an attribute, i.e. a social security attribute can only have 9 numerics in the attribute values).

Entity Integrity - rule that ensures that every relation has a primary key and that the data values for that primary key are ALL valid.  In particular, it guarantees that no primary key value is NULL.

Referential Integrity Constraint - a rule that states that either each foreign key value must match a primary key value in another relation OR the foreign key value MUST be NULL.

Well Structured Relations

In a relational database model we want to have tables that are "well structured".  A well structured relation is one that contains minimal redundancy and allows users to insert, modify and delete the rows in a table without errors or inconsistencies.

A well-structured relation is one that has every NON-PRIMARY KEY attribute functionally dependent on the PRIMARY KEY attribute.  Or reversing the statement:  The PRIMARY KEY attribute DETERMINES each non-primary key attribute.

This is based on relational algebra, where A determines B or the Value of A determines what the Value of B MUST be.    Thus  A (value) -> B (value)  A is called the DETERMINANT attribute.

The term "functionally dependent" means that the non-primary key attributes have values that are "determined" by the value of the primary key.

Anomaly

A well-structured relation is one that can have data and rows  inserted, modified or rows deleted without errors or inconsistencies.  Relations that do not meet these criteria are said to have an ANOMALY.  There are 3 types of anomalies:

1.  insertion anomaly - where we try to enter a new row (new record) and have difficulty because either the primary key would be equal, or you need to also have data for the foreign key that would be different (to make the row unique)

2.  deletion anomaly - when you delete a row of data, you lose information that is contain in other attributes in that row that aren't contained anywhere else

3.  modification anomaly - where you need to update an attribute that will require you to make updates to more than one row (when you didn't want to do that)

Transforming E/R diagrams into a group of Well-Structured Relations

In order to be ready to create the database needed to support the enterprise, you need to transform the E/R diagram into a set of well-structured relations (tables).  It is difficult to do this with a CASE tool.  Instead, it pretty much comes down to doing this by hand.

Process of Normalization

The process of transforming your E/R diagram into the well-structured relations is called Normalization.  Normalization has 3 major steps (although there are actually 6 steps the last three are not usually done).

1.  Step 1 called FIRST NORMAL FORM - in First Normal Form all multi-valued attributes are removed so there are only single values in each cell

To accomplish this step you look at all of your attributes contained in the entity - remember the "skill" attribute mentioned above?  This is an attribute that could contain several different values for the same EMPLOYEE.  You will then need to create a new table - this one has not only the Primary key of EMP_ID, but will also add the multivalued attribute as PART of that primary key - creating a new COMPOSITE PRIMARY KEY (Emp_ID,Skill,Emp_lastname,Emp_firstname,Emp_street, etc.....)

2.  The next step is SECOND NORMAL FORM - Second Normal Form is where all partial functional dependencies are removed.  A Partial functional dependency occurs ONLY when you have a COMPOSITE PRIMARY KEY and some of the non-primary key attributes are depend on only part of the composite primary key and not the whole thing.

To remedy this situation - you must first decide what the parts of the primary key determine.  If you have a primary key  A,B in a relation, and non-primary key attributes C, D, E, F and G then you need to find out what A,B together, A alone and B alone determine in the rest of the attributes.  Remember - it's the value of A,B, A or B that determines what the values of the rest of the attributes must be.

Let's assume in our example that:

 A,B (together) determine C and D

A (alone) determines E and F

B (alone) determines G

We, therefore, have 3 Determinants:  A,B (together), A (alone) and B (alone) for this table.  In order to create a table that is in Second Normal Form - we need to remove all partial functional dependencies.  Attributes E and F are partially functionally dependent on only PART of the primary key (thus, the partial functional dependency), and G is partially functionally dependent on only part of the primary key as well.  Only attributes C and D are fully functionally dependent (as they should be) on the complete composite primary key.

To fix our problem of partial functional dependencies we need to create a NEW relation (table) for EACH of our DETERMINANTS - thus we had 3 determinants and we will therefore, have 3 NEW tables.  The tables will contain the primary key determinant (either the complete composite primary key, or the partial primary key) and the particular attributes that it determines.  Our new tables will be:

Table 1:       

A

B

C

D

Table 2:

A E F

Table 3:

B G

The tables are still all related through their primary key values - Table 1 (attribute A) relates to Table 2 (attribute A), and Table 1 (attribute B) relates to Table 3 (attribute B).  Tables 2 and 3 do not directed relate to each other - but are related through Table 1.

3.  Step 3 - THIRD NORMAL FORM - Third Normal form is when we remove any transitive dependencies that might exist in a relation.  A transitive dependency is where a non-primary key attribute determines another non-primary key attribute in the same relation (a functional dependency exists). 

In order to fix this problem, you need to examine your tables that are already in 2nd normal form, for any additional transitive dependencies between other non-primary key attributes.  For instance in the example below there is a transitive dependency where Cust_ID determines both Cust_lastname,Cust_street

example:

Order_ID Order_date Cust_ID Cust_lastname Cust_street

In order to remedy this problem of transitive dependencies - you need to create a new table that uses the transitive determinant (Cust_ID) as the primary key in the new table and the two attributes that it determines - Cust_lastname, Cust_street

You are left with two attributes - Order_ID (primary key) and Order_date for the other table.

However, if you look at the new tables, they don't share any common key value - so they are no longer related.  To remedy this situation - you leave a copy of the Cust_ID in the Order table as an attribute.  Cust_ID becomes a foreign key - remember a non-primary key attribute in one table, and a primary key in the other table.

Your new tables then are:  ORDER table and CUSTOMER table

Order_ID Order_date Cust_ID
Cust_ID Cust_lastname Cust_street

The last 3 NORMAL FORMS are Boyce-Codd Normal Form, Fourth Normal Form and Fifth Normal form.  You can read about these in your textbook with examples in the appendix.

When you finish 3rd Normal Form - you now have a set of well-structured relations that you can then put into a Relational Database.

Merging Relations

There is one final step before you would create these tables in your data base.  This is called merging your relations.

Go back to your original E/R diagram with the cardinalities - yes, there was a reason for doing these!!  You will remember that we have the following types of cardinatities (degrees of relationships between our entities):

manatory one     manatory many      optional one    optional many

We will be ignoring our mandatory and optional cardinalities at this point and will be concentrating on the "one" and "many" relationships.  You can have three possibilities for the "one" and "many" cardinalities between two entities: 

1.  A "one" to" one" relationship (give one instance of entity A you have one instance of entity B - remember don't worry about mandatory and optional; and going the other way, given one instance of entity B, you have one instance of entity A)

2.  A "one" to "many" relationship (given one instance of entity A you have more than one instance of entity B)

3.  A "many" to "many" relationship (given one instance of entity A you have more than one instance of entity B, and given one instance of entity B you have more than one instance in entity A)

To begin - look at the type of relationship you have between any two entities (one-to-one, one-to-many, many-to-many), and follow the following process:

a.  If you have a one-to-one relationship - go to your set of well-structured relations and find the table(s) that correspond to the entities on your E/R diagram that you were looking at.  You may have several new tables that have been created from the original entity table, but choose the ones that are the "main" tables closest to the original entity.

Take the primary key from either table (A or B) and ADD it as a foreign key to the other table (you only need to choose one primary key from one table - you don't need to go both ways).  This way you have now established a relationship between these two tables.

b.  If you have a one-to-many relationship between entities A and B on your E/R diagram, then take the primary key from entity A (the "one" side) and put it as a foreign key in entity B (the "many" side)

c.  If you have a many-to-many relationship between entities A and B on your E/R diagram, then you need to CREATE a NEW TABLE that only contains the primary keys from both A and B.  This new table then becomes the way that these tables will be related in the database.

Continue doing this process with ALL entities on your E/R diagram.  You will be adding foreign keys into several of your normalized tables, and possibly adding new "key" tables.