This part of the lecture covers significantly more material than the other, hence we give the details of the references below:
Previous relational models have mistakes and limitations:
HW_Lecture
and Grade
tables, is difficult and error-prone.We could go back and forth between relational models (~ logical level) and SQL implementations (~ physical level), but we will use even more high-level tools (~ conceptual level):
Feature | Conceptual | Logical | Physical |
---|---|---|---|
(Main) Audience | Business | Designer | Programmer |
Entity Names | ✔ | ✔ | |
Entity Relationships | ✔ | ✔ | |
Attributes | (✔) | ✔ | |
Cardinalities | ✔ | ✔ | |
Primary Keys | ✔ | ✔ | |
Foreign Keys | ✔ | ✔ | |
Column Data types (Domain) | (✔) | ✔ | |
Table Names | ✔ | ||
Column Names | ✔ |
The conceptual data model is (in theory at least) independent of the choice of database technology.
Remember that in relational models, relations were representing entities (Student
) and relationships (Majors_In
). At the conceptual level, and more particularly in ER diagram, the distinction is made between entities and relationship.
Data is organized into entities (with attributes), relationships between entities (with attributes as well).
Entity A:
Some vocabulary:
Attributes can be
{…} = multi-valued
(…) = complex
For instance, one could
{Address(Street, Number, Apt, City, State, ZIP)}
,A key attribute is an attribute whose value is distinct for each entity in the entity set.
If the attribute is …, | then… |
---|---|
composite | other attributes are connected to it |
multi-valued | the box have double lines |
derived | the box have dotted lines |
a key | the name of the attribute is underlined |
In the following, we’ll focus on the relationship between the entities more than on the attributes of particular entities, so we’ll sometimes simply draw
leaving the attributes un-specified (but that does not mean that they all have to be atomic) or even just
but that does not mean that the entity type have no attribute!
E1, … En participate in R, e1, …, en participate in r1, n is the degree.
Note that we can have Entity Set 1 = Entity Set 2, in which case we say the relation is recursive.
Some sources call the relationships between an entity and itself “unary.” Note that with our convention, it does not make sense to speak of a unary relationship.
Naming convention:
Convenient, and sometimes mandatory, to give role names.
If we want to stress that we are considering only one aspect of an entity type (that is, a person is not only an employee, a company is not only an employer, but this aspect is crucial for the “EMPLOYS” relation):
We can also use it to make the “right-side” and the “left-side” of a recursive relationship explicit:
Finally, we will sometimes use “Role Name of Entity 1 : Role Name of Entity 2” as a notation for the relation between them. For instance, we can write “Employer:Employee” to denote the “EMPLOYS” relation, and we will also use this notation when the relationship is between different entities, and write e.g. “PERSON:POSITION” for the “OCCUPIES” relation.
Two constraints, called “structural constraints”, applies to relationship types: cardinality ratio and participation constraint. They both concerns the number of relationship instances an entity can participate in (which is different from the cardinality of a relationship type).
Maximum number of relationships instances that an entity can participate in.
For binary relations, can be 1 : 1, N : 1, 1 : N, or M : N. The 1 stands for “at most 1”, and the M, N, and P stand for “possibly more than 1”, or “no maximum”. In ER diagram, we do not count, and do not make the distinction between “at most 5” and “at most 10”, for instance.
An alternative notation, detailled later on, will address this shortcoming.
Possible examples include:
Relation | Possible Ratio | Explanation |
---|---|---|
MENTOR : MENTEE | 1 : N | “A mentor can have multiple mentees, a mentee has at most one mentor.” |
PERSON : SSN | 1 : 1 | “A person has one SSN, a SSN belongs to one person.” |
COURSE : DEPARTMENT | N : 1 | “A course is offered by one department, a department can offer any number of courses.” |
STUDENT : TEAM | M : N | “A student can participate in multiple team, a team can have multiple students.” |
We indicate the ratio on the edges:
Note that reflexive relations can have any ratio as well. An example of M : N recursive relation could be:
Minimum number of relationships instances that an entity can participant it, a.k.a. “minimum cardinality constraint.”
The participation can be total (a.k.a. existence dependency, the entity must be in that relationship at least once) or partial (the entity may or may not be in that relationship).
Total is drawn with a double line, partial is drawn with a single line:
This reads “a course must be offered by a department, but a department may or may not offer courses.”
Relationships can have attributes too. The typical example is a date attribute, but other examples include
Note that an attribute on a relationship type can be atomic or composite, single or multi-valued, stored or derived, but that it cannot be a key attribute (after all, there are no entity to identify!).
Note that there are some moving aspects here: atributes on 1 : 1, 1 : N, N : 1 relationships can be migrated (to the N side when there is one, or to either side where there is none).
For instance, imagine that every phone uses exactly (= “at most and at least”) one carrier, that a carrier can provide network to multiple phones, and that the average quality of the network is an attribute in this relationship:
Then each instance of the relation would be of the form (“Phone X”, “Carrier Y”, “9/10”) for some way of ranking the average quality from 0 to 10. Note that, from the fact that the relationship is N : 1, this means that there is only one tuple involving “Phone X”: this means that the average quality could actually be seen as a property of the phone, and hence be migrated as an attribute to the phone side:
Note that we could not migrate the “average phone quality” to the “Carrier” side: imagine if we had the instances (“Phone X”, “Carrier Y”, “9/10”) and (“Phone Z”, “Carrier Y”, “3/10”), then should the attribute of “Carrier Y” be “9/10” or “3/10”: we have no way of deciding based on this model. Whenever it is a good choice to migrate this attribute or not will depend on the requirement of the models, and it may not always be appropriate to migrate the attribute to the entity. In the case of 1 : 1 relationship, migrating the attribute to both sides (i.e., to both entities) would be a mistake, since it would introduce redundancy in your model.
As an exercise, you can look at the relationships TEACHING, MENTORING and EMITED_DRIVING_LICENCE that are listed above, and see if the attributes can be migrated or not, and if yes, on which side.
Of course, relationships can have a degree higher than two. An example of a ternary relation could be:
To determine cardinality ratio, one should fix all but one parameters, and wonder how many values of the remaining parameter can be in that relationship. Another wording for the same idea can be found in this thread.
Four our example, Customer Y and Bank Z could be in relationship with more than one account (hence the “N”). On the opposite, Customer Y and Account K would be in relationship with only one bank (hence the “1” on the bottom), and Bank Z and Account K would belong to only one customer (hence the “1” on the left).
Let us look at two other examples. First, assume we want to collect information about the treatment prescribed by physicians to patients, we could use a relationship like the following one:
Where
Now, if we want to store information about who is the president of a country during a term, we could get something like:
Note that this representation of the data assumes that a citizen cannot be the president of two different countries during the same term (the right 1), which could be debatable.
It is sometimes impossible to do without relations with arity greater than 2. For instance, consider the following two diagrams
You should realize that they convey different information. For instance, you can know for a fact that a person visit a bookshop only if they bought something in it, while the second diagram de-correlate the act of buying with the visit to a bookshop. Similarly, the second diagram could give you a hint that a person that owns a copy of a book Z and visits a bookshop X that sells it could also visit it, but you will not know that for sure.
An example of recursive ternary relation could be:
An example of relation of degree 4 could be:
The cardinality ratio are computed using the same method as described before.
There are actually two sorts of entity types:
Weak (or child) entity types are identified by identifying / owner type that is related to it, in conjunction with one attribute (the partial key). That relation is called identifying (or supporting) relationship, and weak entities have a total participation constraint. The partial key is an attribute, that, when paired with an entity with which they are in relation through their identifying relationship, allows to identify a particular entity.
Weak entities and identifying relationships have a double border, and partial key have a dotted underline, as follows:
The idea here is that we do not need to gather data about all the dependent in the world, or in isolation, but are interested in dependent only if they are related to en employee in our database. Just having the name of a dependent is not enough to identify them, but having their name and the SSN of the employee they are related to is enough. The identifying relation always have ratio 1 : M or 1 : 1: a weak entity cannot be related to more than one entity of the owner type, so that M : N ratio are not possible (cf. e.g. StackExchange). If you need to have, for instance, a dependant connected to multiple employees, then that means that your dependent entity should be strong, because it has an existence “of its own”.
You may wonder why we do not represent weak entities simply as (composite, multi-valued) attributes of their owner type. For instance, why would we use
instead of
? The answer depends whenever we need to have the ability to represent our weak entities (here, PET) as being in relationship with other entities (that can themselves be weak!), as follows:
This would be impossible if PET was an attribute of FRIEND! Whenever the pet entity type is involved in other relationships or not should help you in deciding which representation to choose.
Another example of weak entity whose owner is weak as well could be:
The idea being that the Health care provider cares about an insure only if they are covered by them, and that they care about the doula only if they are currently helping one of their insure.
Multiple notations have been used to represent the ratio and constraint on relationship.
In the following, we introduce two of them: the Min/Max and the Crow’s foot notations.
The two constraints can be written on the same side, and the N, M, P ratio can be replaced by actual number, providing more information.
For instance,
could be drawn as
meaning that
More generally, we have the following:
Extended (or Enhanced) ER Models (EER) have additionaly:
Closer to object-oriented programming.
It is possible to go from relational models to ER models, and sometimes needed: if you are given an implementation that seems poorly design, this can be a way of “backing up” and thinking about the (sometimes implicit) choices that were made during the implementation, to eventually correct them.
For instance, consider the code we studied in “A First Example”:
CREATE TABLE STORM ( NAME VARCHAR(25) PRIMARY KEY, Kind ENUM ("Tropical Storm", "Hurricane"), WindSpeed INT, Creation DATE ); -- We can change the enumerated datatype: ALTER TABLE STORM MODIFY Kind ENUM ("Tropical Storm", "Hurricane", "Typhoon"); CREATE TABLE STATE ( NAME VARCHAR(25) UNIQUE, Postal_abbr CHAR(2) PRIMARY KEY, Affected_by VARCHAR(25), FOREIGN KEY (Affected_by) REFERENCES STORM (NAME) ON DELETE SET NULL ON UPDATE CASCADE );
It corresponds to the following relational model:
which in turn corresponds to the following ER diagram:
Looking at this diagram made it obvious that our code has a flaw: a stom can affect more than one state! Turning the 1 on the left-hand side of the “AFFECTS” relationship into a M is immediate on the diagram, but, of course, mapping it back to a relational model, and then implementing it correctly, will require more work. In any case, if you had not noted already this flaw, reverse-engineering this code highlighted it quite clearly.
If we look back at Problem 3.5 (Revisiting the PROF table), we had already made a first step, since we converted the code into the following relational model:
Going a bit further, we could extrapolate just a little bit and get the following ER diagram:
As we noted in our solution to the second question, this model has several limitations. To list a few, this representation can not handle the following situations:
Looking at it as an ER diagram should help you in understanding why we have those flaws, and how they could be addressed, and “testing” the model should be made easier in its ER form than as SQL code.
We have to map all of the following:
Using four tools: Relations, Attributes, Primary Keys, Foreign Keys.
We will use three techniques to represent some of the relationships, the foreign key approach, the merged relations approach and the cross-reference approach. They are detailed and illustrated after the algorithm, which goes as follows:
# | is mapped to | |
---|---|---|
1 | Strong Entity | Relation with all the simple attributes. Decompose complex (composite) attributes. Pick a key to be the PK, if it is composite, take its elements. |
2 | Weak Entity | Relation with all the simple attributes. Decompose complex attributes. Add as a foreign key the primary key of the relation corresponding to the owner entity type, and make it a primary key, in addition to the partial key of the weak entity. If the owner entity type is itself weak, start with it. |
3 | Binary 1 : 1 Relationship Types | Foreign Key, Merge Relations or Cross-Reference approach |
4 | Binary 1 : N Relationship Types | Foreign Key or Cross-Reference approach |
5 | Binary M : N Relationship Types | Cross-Reference approach |
6 | n-ary Relationship Types | Cross-Reference approach |
7 | Multi-valued Attributes | Create a new relation, add as a foreign key the primary key of the relation corresponding to the original strong entity type. Make all the attributes be the primary key. |
whose primary key is the foreign key to the relation corresponding to the entity.
NOT NULL
constraint on the attribute that is not part of the primary key anymore.Every time a relationships have attributes, they are mapped to the resulting relation.
Let us look in more details at some of those steps. For strong entities, using steps 1 and 7, the following:
would give:
And note that if Serial was a complex attribute, we would just “unfold” it, or decompose it, and make all the resulting attributes the primary key of the relation. If one of the attribute was at the same time multi-valued and composite, as follows:
Then we would obtain:
For relationships, things are a bit more complicated. Consider the following:
Since it is a 1 : 1 relationship where one of the side has a partial constraint, we have the choice between two approaches. The foreign key approach would give:
Note that we could also have added the foreign key on the side of ENT.B, referencing the key of ENT.A. But since ENT.A has a total participation constraint, we know that the value of FK will always exist, whereas some entities in ENT.B may not be in relationship with an entity from ENT.A, creating the (nefast) need for NULL
values.
For the same diagram, the cross-reference approach would give:
Similarly, note that, in MAPPING, KeyB, or KeyA and KeyB, would also be valid primary keys, but that it makes more sense to have KeyA being the primary key, since we know that ENT.A has a total participation constraint, but ENT.B does not.
If both participation constraints were total, as follows:
Then we could use the merged relations approach, and get:
We picked KeyA to be the primary key for the same reason as before. Note that merging the two entities into one relation also means that you have eventually to do some work on the relations that were referring to them.
Of course, if ENT.A and ENT.B are the same entity (that is, REL is recursive), we would get:
or
depending on the approach we chose.
Binary 1 : N and binary M : N relationships are dealt with in a similar way, using foreign key or cross-reference approaches. The most difficult part of the mapping is with n-ary relationships: we have to use cross-reference approaches, but determining the primary key is not an easy task. Consider the following.
This developement was actually asked at Stack Exchange.
The arity constraints here can be rephrased as:
And note that there is no total participation constraint.
To reprent the RESERVES relationship, we need to create a relation with attributes referencing the primary key of MEMBER, the primary key of TIME_SLOT, and the primary key of EQUIPMENT. Making them all the primary key does not represent the fact that the same equipment cannot be booked twice during the same slot, nor that a member can book only one equipment per slot, but allows members to reserve a particular equipment at multiple time slots. To improve this situation, we can either
Both solutions enforce only some of the requirement expressed by the ER diagram.
ER Model | Relational Model |
---|---|
Entity type | Entity relation |
1 : 1 or 1 : N relationship type | Foreign key (or relationship relation) |
M : N relationship type | Relationship relation and two foreign keys |
n-ary relationship type | Relationship relation and n foreign keys |
Simple attribute | Attribute |
Composite attribute | Set of simple component attributes |
Multivalued attribute | Relation and foreign key |
Value set | Domain |
Key attribute | Primary key |
You can have a look at e.g. Holowczak to get a slightly different explanation of this conversion, and additional pointers.
What makes a good database? At the logical (conceptual) and physical (implementation) levels.
Goals:
SELECT
and select-project-join easy)For ER diagrams, some of the usual techniques are:
Cf. for instance Stanford Infolab.
1 relation corresponds to 1 entity or 1 relationship type
(Bad!) Example:
---------- (Login, Name, AdvisoryName, AdvisorOffice, Major, MajorHead) -----------(Office, PhoneNumber, Building)
NULL
Should Be RareNULL
has 3 meanings, wastes space, and makes join / nested projections harder.
Example:
STUDENT(Login, …, siblingEnrolled)
Transform into “Emergency Contact in University” relation (bonus: allow multiple contacts).
Example with advisorOffice and Office: if we try to write a join to obtain the phone number of a student’s advisor, we will obtain all the phone.
MARKER(Owner, Color, OwnerOffice, Brand, BrandEmail) TEACHER(Office, Name, Phone)
Corrected to:
MARKER(Owner, Color, B͟r͟a͟n͟d͟) TEACHER(Office, N͟a͟m͟e͟, Phone) BRAND(N͟a͟m͟e͟, Email)
Functional dependencies (FD) is a formal tool used to assess how “good” a database is, a property of the relation schema. Functional dependencies list the constraints between two sets of attributes from the database. For instance, if X and Y are (sets of) attributes, X → Y reads “X fixes Y”, and implies that the value(s) of Y is fixed by the value(s) of X.
“What should be.”
Let us list all the attributes of our previous example:
MARKER.Owner, MARKER.Color, MAKER.Brand, TEACHER.Office, TEACHER.Name, TEACHER.Phone, BRAND.Name, BRAND.Email
Think about their dependencies, and list them:
TEACHER.Name → TEACHER.Office
BRAND.Name → BRAND.Email
TEACHER.Office → TEACHER.Name
TEACHER.Office → TEACHER.Phone
MAKER.Owner and MARKER.Color → MARKER.Brand
?“What is.”, can disprove some of the assumptions made previously, but should not add new dependencies based on it (they may be by chance!).
TEACHER.Office → TEACHER.Name
does not hold, because teachers share offices?TEACHER.Name → MARKER.Brand
and MARKER.Color
seemed to be enforced by the state, but we should not add a functional dependency based on that: there are no “requirement” that a Teacher must always buy the same brand and color, this could simply true be by chance so far and should not be imposed to the teachers.A particular state cannot enforce a FD, but it can negate one.
Example:
Att. 1 | Att. 2 | Att. 3 |
---|---|---|
Bob | 15 | Boston |
Bob | 13 | Boston |
Jane | 12 | Augusta |
Emily | 12 | Augusta |
May hold | Will not hold |
---|---|
Att. 2 → Att. 3 | Att1 → Att2 |
Att. 3 → Att. 2 | Att. 3 → Att. 2 |
Att. 1 → Att. 3 | Att. 2 → Att. 1 |
{Att. 1, Att. 2} → Att. 3 | {Att. 3, Att. 2} → Att. 1 |
Or, more conveniently:
If an attribute is a foreign key to another, we will draw an arrow between relations:
Note that:
We will assume that the consequence of those axioms always hold (“closure under those rules”), but will generaly not write them explicitely, since they do not carry any new or additional information.
Remember superkey (not minimal key), key, candidate key, secondary key? We now have a formal definition.
In one particular relation R(A1, …, An),
Given a FD {A1, …, An} → Y,
A FD : X → Y is a transivive dependency if there exist a set of attribute B s.t.
First, Second, Third, Fourth, Fifth normal form (“X”NF). Stronger than the Third, there is the Boyce-Codd NF (BCNF)
If you satisfy N, you satisfy N − 1, N − 2, etc. The normal form of a relation is the highest normal form condition that it meets.
The domain of all attributes must be atomic (simple, indivisible): exclude multi-valued and composite attributes.
Sometimes, additional requirement that every relation has a primary key. We will take this requirement to be part of the definition of 1NF, but some authors take a relation to be in 1NF if it has at least candidate keys (i.e., multiple possible keys, but no primary key, which makes their definition more general, cf. (Elmasri and Navathe 2015, 14.4.1)). Hence, we will always assume that a primary key is given, and it will be underlined.
To be written
1NF + Every non-prime attribute is fully functionnaly dependent on the primary key.
For each attribute A of the relation whose primary key is A1, …, An:
becomes
Refinement: note that if more than one attribute depends of the same subset {A′1, …, A′k}, we will create two relations: that is useless, we could have created just one. For instance, considering
applying the algorithm would give
whereas a more subtle algorithm would give
Note that in both cases, all the relations are in Second Normal Form, though.
Note also that, sometimes, removing the “original” relation may be preferable: cf. an example in Problem 4.27 (COFFEE relation: primary key and normal form).
Note also that if our primary key is a singleton, then there is nothing to do, we are in 2NF as soon as we are in 1NF.
2NF + no non-prime attribute is transitively dependent on the primary key.
For each attribute A of the relation whose primary key is A1, …, An:
We can have a look at another example:
Note that {State, Driver_Licence_Num}, would be a valid primary key for this relation, and that adding it would make it a relation in 1NF.
As we can see, the name “Driver” is somehow counter-intuitive, since the relation also carries information about Governors. This relation is actually not in 2NF, because the FD {State, Driver_Licence_Num} → Governor is not fully functional. A possible way to fix it is to get:
As you can see, the 2NF helped us in separating properly the entities.
An example of a relation that is in 2NF but not in 3NF could be:
As we can see, all the non-prime attributes are fully functionally dependent from Login, which is our primary key. But, obviously, one of this dependecy is transitive, and breaks the 3NF. A way to fix it is:
As we can see, 3NF also helped us in separating properly the entities, in a slightly different way.
In conclusion, we can observe that every FD X → Y s.t. X is a proper subset of the primary key, or a non-prime attribute, is problematic. 2NF is a guarantee that every entity has its own relation, 3NF is a way to avoid data inconsistency.
One approach for analysis, design, implementation and deployment of databases and their applications. Databases interact with multiple softwares and users, we need a common language.
Unified Modeling Language is a standard:
Wide, powerful, but also intimidating.
You know UML from object-oriented programming language:
That is an example of a class diagram (with class name, attributes and operators, as well as a particular way to represent that a class extends another) , there are other types of diagrams, they are not unrelated! For instance, using communication diagrams, deployment diagrams, and state chart diagrams, you can collect the requirements needed to draw a class diagram! They each offer a viewpoint on a software that will help you in making sure the various pieces will fit together: it is a tool commonly used in software engineering, and useful in database design.
There are 14 different types of diagrams, divided between two categories: structural and behavioral.
They describe structural, or static, relationships between objects, softwares.
In this category also exist Composite structure diagram, Package diagram and Profile diagram.
They describe the behavioral, or dynamic, relationship, between components.
Then there is the sub-category of “Interaction diagrams”:
This sub-category also comprise Timing diagram and Interaction overview diagram.
Looking at the “COMPANY conceptual schema in UML class diagram notation”, and comparing it with the “ER schema diagram for the COMPANY database” from the textbook, can help you in writing your own “Rosetta Stone” between ER and UML diagram. Let us introduce some UML terminology for the class diagrams.
UML | ER |
---|---|
Class | Entity Type |
Class Name | Entity Name |
Attributes | Attributes |
Operations (or Method) | Sometimes Derived Attributes |
Association | Relationship Type |
Link | Relationship Instance |
Multiplicities | Structural Constraint |
As well as for ER diagram, the domain (or data type) of the attributes is optional. A composite attribute in a ER diagram can be interpreted as a structured domain in a UML diagram (think of a struct), and a multi-valued attribute requires to create a new class.
Associations are, to some extend, more expressive than relationship types:
min
.. max
, with *
for “no maximum”, and the following shorthands: *
stands for 0
..*
and 1
stands for 1
..1
. An association with 1
on one side and *
on the other (resp. 1
and 1
, *
and 1
, *
and *
) is sometimes called “one-to-many” (resp., “one-to-one”, “many-to-one”, “many-to-many”). The notation in partially inverted w.r.t. ER diagrams:Additionally, associations can be “extended”, and they are not the only kind of relationship that can be expressed between two classes.
Qualified associations can be used for weak entities, but not only.
Some of those subtleties depend on your need, and are subjective, but are important tool to design properly a database, and relieving the programmer from the burden of figuring out many details.
Name the three high-level models we will be learning about in this class (expand the acronyms).
What could be the decomposition of an attribute used to store an email address? When could that be useful?
Draw the ER diagram for a “COMPUTER” entity that has one multivalued attribute “Operating_System”, a composite attribute “Devices” (decomposed into “Keyboard” and “Mouse”) and an “ID” key attribute.
Draw the ER diagram for a “CELLPHONE” entity that has a composite attribute “Plan” (decomposed into “Carrier” and “Price”), an “MIN” (Mobile Identification Number) key attribute, and a multi-valued “App_Installed” attribute.
Name one difference between a primary key in the relational model and a key attribute in the ER model.
What is a derived attribute? Give two examples and justify them.
Invent an entity type with at least one composite attribute and one atomic attribute, but no multi-valued attributes. Identify a possible key attribute and draw the entity type you obtained using the conventions we used in class.
What is the degree of a relationship type?
What is a self-referencing, or recursive, relationship type? Give two examples.
What does it mean for a binary relationship type “Owner” between entity types “Person” and “Computer” to have a cardinality ratio of M : N?
What are the two possible structural constraints on a relationship type?
Draw a diagram to represent a relationship type R between two entities types A and B such that:
Express the constraints represented in the following diagram in plain English.
Express the constraints represented in the following diagram in plain English.
For the following binary relationships, suggest cardinality ratios based on the common-sense meaning of the entity types.
Entity 1 | Cardinality Ratio | Entity 2 |
---|---|---|
STUDENT | : | MAJOR |
CAR | : | TAG |
INSTRUCTOR | : | LECTURE |
INSTRUCTOR | : | OFFICE |
COMPUTER | : | OPERATING_SYSTEM |
Give an example of a binary relationship type of cardinality 1 : N.
Give an example of a binary relationship type of cardinality N : 1 and draw the corresponding diagram (you do not have to include details on the participating entity types).
Draw an ER diagram with a single entity type, with two stored attributes, and one derived attribute. In your answer, it should be clear that the value for the derived attribute can always be obtained from the value(s) of the other attribute(s).
Draw an ER diagram expressing the total participation of an entity type “BURGER” in a binary relation “CONTAINS” with an entity type “INGREDIENT”. What would be the cardinality ratio of such a relation?
Under what condition(s) can an attribute of a binary relationship type be migrated to become an attribute of one of the participating entity types?
Suppose a “PRODUCES” relationship with an attribute “Amount” exists between a “PRODUCER” entity type and a “MOVIE” entity type, with ratio 1 : M. Migrate the “Amount” attribute to one of the entity types and draw the resulting diagram.
Suppose a “MEMBERSHIP relationship with an attribute”Level" (e.g., “silver”, “platinium”, etc.) exists between a “PERSON” entity type and a “CLUB” entity type, with ratio M : 1. Migrate the “Level” attribute to one of the entity types and draw the resulting diagram.
Assume with have three entity types, “Lecture Notes”, “Class” and “Professor.”
You can specify role names in your diagrams for added clarity, and remember to list all the constraints.
Can we always replace a ternary relationship with three binary relationships? Give an example.
What is the difference between an entity type and a weak entity type?
What is a partial key?
Why do weak entity type have a total participation constraint?
Invent a weak entity type, its identifying (owner) entity type and the identifiying (or supporting) relationship. Both entities should have (partial) key, and each should have at least one composite attribute.
Convert the following ER diagram into a relational model:
What is insertion anomaly? Give an example.
What is deletion anomaly? Is it a desirable feature?
Why should we avoid attributes whose value will often be NULL
? Can the usage of NULL
be completely avoided?
Consider the following relation:
PROF(S͟S͟N͟, Name, Department, Bike_brand)
Why is it a poor design to have a “Bike_brand” attribute in such a relation? How should we store this information?
Consider the following relation:
STUDENT(S͟S͟N͟, Name, …, Sibling_On_Campus)
Why is it a poor design to have a “Sibling_On_Campus” attribute in such a relation? How should we store this information?
Consider the following relational database schema:
STUDENT(L͟o͟g͟i͟n͟, Name, …, Major, Major_Head)
DEPARTMENT(C͟o͟d͟e͟, Name, Major_Head)
Assuming that “Major” is a foreign key referencing “DEPARTMENT.Code”, what is the problem with that schema? How could you address it?
Why can we not infer a functional dependency automatically from a particular relation state?
Consider the relation R(A, B, C, D, E, F) and the following functional dependencies:
For each set of functional dependency, give a key for R. We want a key, so it has to be minimal.
Consider the relation R(A, B, C, D, E, F) and the following functional dependencies:
A → {D, E}, D → {B, F}, {B, E} → A, {A, C} → {B, D, F}, A → F
Answer the following:
What is a composite attribute in a ER diagram? Can a relational schema with composite attribute be in Second Normal Form?
Consider the relation R(A, B, C, D) and answer the following:
Consider the relation R(A, B, C, D, E, F) with candidate keys {A, B} and C. Remember that, in all generality, to be a prime attribute, you just need to be part of a possible candidate key. Answer the following:
Consider the relation R(A, B, C, D, E) and the following functional dependencies:
For each one, give one candidate key for R.
Consider the relation R(A, B, C, D, E) and answer the following:
Consider the relation R(A, B, C, D, E, F), and let {B, D} be the primary key, and have additionnaly the functional dependencies {A, D} → E, C → F. This relation is not in 3NF, can you tell why?
Consider the relation R(A, B, C, D) and answer the following:
Consider the relation R(A, B, C, D, E) and the functional dependencies {A, B} → C, B → D, C → E. Answer the following:
What are the two different categories of UML diagram?
Can a C++
developer working on Linux and a Java
developer working on MacOS use the same class diagram as a basis to write their programs? Justify your answer.
What kind of diagram should we use if we want to …
Name two reasons why one would want to use a UML class diagram over an ER diagram to represent a conceptual schema.
Consider the following diagram:
Give the number of attributes for both classes, and suggest two operations for the class that does not have any. Discuss the multiplicities: why did the designer picked those values?
Convert the following ER diagram to a UML class diagram.
Briefly explain the difference between an aggregation and a composition association.
How is generalization (or inheritance) represented in a UML class diagram? Why is such a concept useful?
Convert the following ER diagram into a UML class diagram:
The three high-level models we will be learning about are the Unified Modeling Language, Entity Relationship, and Enhanced Entity–Relationship models.
A useful decomposition of an email address attribute could be: the username part before the @ sign, and the domain part afterwards. It might be useful to have statistics about the domains of the users or to sort the usernames by length, etc.
There can be more than one key in the ER model, but it has to be made of a single attribute, whereas a primary key can be made of multiple attributes.
A derived attribute is an attribute whose value can be determined by the value of other attributes. For instance:
- The value of an “Age” attribute could be determined from the value of an “Date of birth” attribute and the current day.
- The value of a “State” attribute can be determined from the value of a “Zip code” attribute.
- The value of a “Body Mass Index” attribute could be calculated from the values of height and weight attributes.
- The value of an “Initials” attribute could be determined using the values of the “First Name”, “Middle Name”, and “Last Name” attributes.
The degree of a realationship type is the number of its participating entity types.
A self-referencing relationship type is where the same entity type participates more than once. On a SEATS entity type, it would be an attribute like “is to the left of” or on a PERSONS entity type, it would be and attribute like “is married to”.
The cardinality ratio on the binary relationship type “Owner” between the entity types “Person” and “Computer” means that a person can own multiple computers, and a computer can have multiple owners.
The two possible structural constraints on a relationship type are the cardinality ratio and participation constraints.
We would obtain the following diagram:
A key opens only one door, and every key must open at least one door. A door can be opened by multiple keys, and some doors may not be opened by any key (think of doors that do not have a lock).
The binary relation type “is the Chair of” with a cardinality ratio of 1:N between entity types “Professor” and “Department” means that a department can have at most one professor as its chair, but that a professor can be the chair of multiple departments. It could make sense to require that every department has a chair, hence writing a double line between the Department entity and the “is the Chair of” relationship, but it would not make sense to have a total participation constraint on the side of the professor (which would mean that every professor has to be the chair of a department).
An operating system may be supported by many computers, but it is also possible that no computer supports it (think of an operating system in development, or developed for embeded devices). A computer must support at least one operating system and can support multiple operating systems.
Entity 1 | Cardinality Ratio | Entity 2 | Explanation |
---|---|---|---|
STUDENT | N : 1 | MAJOR | “A student has one major, but multiple students can have the same major” |
CAR | 1 : 1 | TAG | “A car has exactly one tag, a tag belongs to one particular car.” |
INSTRUCTOR | 1 : N | LECTURE | “An instructor can teach multiple lecture, but a lecture is taught by only one person.” |
INSTRUCTOR | 1 : N | OFFICE | “An instructor can have multiple office, but an office belongs to only one instructor” |
COMPUTER | M : N | OPERATING_SYSTEM | “A computer can have multiple operating system, the same operating system can be installed on more than one computer.” |
Some of these choices are debatable (typically, almost any combination seems reasonable for the INSTRUCTOR : OFFICE relation).
A binary of relationship of SUPERVISOR as a recursive relationship on EMPLOYEE.
An attribute of a binary relationship type can be migrated to one of the participating entity types when the cardinality ratio is 1 : N, 1 : 1, or N : 1. It can be migrated “to the N side” or, if there is no N side, to either side. Note that for n-ary relationships, at least one ratio needs to be 1 for the attribute to be allowed to migrate (and “to the N side”, or, if there is no N side, to any side).
We could have the following:
We could have the following:
A possible example of ternary relationship is:
One example of two binary relationships could be:
A question like
“Who wrote the lecture notes X?”
could be answered with the binary relationships but not the ternary. Conversely, a question like
“What are the lecture notes refered to by Prof. X in their class Y?”
could not be answered using the binary relationships (since we do not know what classes are taught by Prof. X).
No, a ternary relationship cannot always be replaced by three binary relationship. For instance, if I have a “Travelling to” relationship between a “Person”, a “City” and a “Transport mode”, to represent the fact that a person is travelling to a city using a particular mode of transportation, there is no way I can convey the same information using binary relationships.
The weak entity type does not have a key attribute, it cannot be distinguised from the other weak entities based on a single attribute, for that we also need to know its relationship to some other entity type.
For a weak entity attribute, it is the attribute that can uniquely identify weak entites that are related to the same owner entity.
Otherwise, we could not identify entities in it without owner entity.
A possible solution is:
Note that the two composite attributes are “generic”, in the sense that you can re-use those examples easily.
A possible option is:
Note that “Stays_At” could also be a separate relation, with two attributes, “Address” and “Person”, linked to respectively PLACE.Address and PERSON.SSN, and both being the primary key of the relation.
When you have to invent a primary key or add a lot of NULL value to be able to add a tuple. I want to add a room in my DB, but the only place where rooms are listed are as an attribute on a Instructor table, so I have to “fake” an instructor to add a room.
A delete anomaly exists when certain attributes are lost because of the deletion of other attributes. It is not desirable, since it can lead to the loss of information.
Because they waste space, they are ambiguous (N/A, or unknown, or not communicated?), and they make querries harder. No, it is necessary sometimes.
Because it will be NULL
most of the time. In a separate relation, e.g. a “BIKE” relation, with two attributes, “Owner” and “Brand”, “Owner” being a foreign key referencing the SSN attribute of PROF.
Because it will be NULL
most of the time, and because students could have more than one sibling on campus. In a separate relation, e.g. in a “EMERGENCY_CONTACT” relation, with two attributes, “Student” (refercing the SSN attribute of STUDENT), and “Contact”. If the emergency contacts are not related to the student, or if we want to preserve the fact that one student is a sibling to another, we can create another relation to store that information.
Major_Head will give update anomalies. By putting the Head of the department in the DEPARTMENT relation only, i.e., removing it from STUDENT.
Just because a coincidence exists (i.e., “in my data set, no android user is color-blind”) does not mean that it will always be true (i.e., “no color-blind person will ever use android”). Functional dependencies should come from a principled reasoning about the attributes, and not from the observation of the data.
A composite attribute is an attribute made of multiple attributes, like an “Address” attribute could be composed of the “sub”-attributes “Street”, “City”, “Zip” and "State. A relational schema needs a primary key and to have only atomic domains to be in first normal form, so, no, a relational schema with composite attributes can not be in second normal form.
{B, D} → C → F breaks the 3NF.
The two different categories of UML diagram are behaviour and structure.
Yes, UML diagram is language-independent and platform-independent.
To use direction for association, to have a common language with someone less knowledgeable of other diagrammatic notations. For the concept of integration.
Flight
has 5 attributes, Plane
has 4. The Plane
class could have the operations getLastFlightNumber() : Integer
and setMaximumSpeed(MPH) : void
.
For the multiplicities: A flight could not have a plane assigned, and a plane could not be assigned to a flight. A plane can be assigned to multiple (or no) flights, but a flight must have at most one plane (and could have none).
The absence of total participation constraint on the left side of the diagram may seem odd: what would be a hand not belonging to a person? Still, we have to accept it: we do not know what the requirements are, or the precise nature of the entities. As far as we know “hand” could refer to a card game, and “person” could refer to players. A straightforward representation of the same diagram as a UML class diagram could be:
Note that we could convey more information, for instance by using aggregation, or even composition, but, without more information about those entities and this relationship, it may be safer not to make any additional supposition.
Aggregation: associated class can have an existence of its own.
Composition association: class does not exist without the association.
Because it avoids redundancy.
Your professor designed the following relational model, at some point in his career, to help him organize his exams and the students’ exam grades:
Table Name and Attributes | Example of Value |
---|---|
EXAM(Number, Date, Course) | < 1, ‘2018-02-14’, ‘CSCI3410’> |
PROBLEM(Statement, Points, Length, Exam) | < ‘Your professor designed…’, 10, ‘00:10:00’, 1> |
STUDENT_GRADE(Login, Exam, Grade) | < ‘aalyx’, 1, 83> |
EXAM.Number
, PROBLEM.Statement
, STUDENT_GRADE.Login
and STUDENT_GRADE.Exam
are all the primary key, and STUDENT_GRADE.Exam
and PROBLEM.Exam
are foreign keys that both refer to EXAM.Number
.
The idea was to have the following design elements:
EXAM
table for storing information about exams.PROBLEM
table for storing each problem as its’ own entry and to associate every problem to an exam.STUDENT_GRADE
table for storing the grade of one student for one particular exam.Unfortunately, this design turned out to be terrible.
Consider the ER schema for the MOVIES database (inspired from (Elmasri and Navathe 2010, Figure 7.24)):
Where the attributes are omitted, and separate entities are created for actors, producers and directors even if they happen to be the same person (to deal with e.g. pseudonyms or different attributes, like agent or address).
Given the constraints shown in the ER schema, respond to the following statements with True or False. Justify each answer.
Draw the ER diagram for the following situation:
One of the interesting choices is: should “accident” be an entity type or a relationship type?
You want to design a database to help you apply for jobs and to compare offers. Every job has a salary range, a title, multiple requirements (like languages known, years of experience, etc.) and was advertised by a company at a particular url. Every company has a physical and numerical address, provides some benefits (assuming they provide the same benefits to all their employees). Sometimes you know one or multiple persons working there, and you want to keep track of their names, role, and (if this is the case) of the job they told you about. Finally, you want to keep track of the offers you received: the job they correspond to, the actual salary offered and the possible starting date.
A company wants to develop a database to keep track of the programmers, projects and programming languages they know of. They are not willing to store guidelines for the sake of it, but believe that if a project requires a particular guideline (like, which IDE to use, what spacing convention they use, etc.), it should be stored somewhere. They want to accommodate the fact that a project can use multiple programming languages (and sometimes even multiple versions of the same language), and keep track of which programmer is leading which project. To ease “match making”, they also want to track which programmer is knowledgeable of what programming language. They would also like to store links to the specifications of programming languages, as well as urls of the projects and their guidelines.
They came up with the following ER diagram:
This diagram, to your expert eyes, has multiple flaws, missing constraints, and has some inconsistencies with their requirements. List as many as you can, and suggest improvments or solution when you can think of one.
Look at the following relational model and “reverse-engineer” it to obtain an ER diagram:
In this problem, we will install and explore the basic functionalities of MySQL Workbench, which is a cross-platform, open-source, and free graphical interface for database design.
Apply the ER-to-Relation mapping to your ER diagram from Problem 4.3 (ER diagram for car insurance).
Consider the following ER diagram:
Using this diagram, answer the following:
Is it true that … | Yes | No |
---|---|---|
… a customer cannot drop two bikes at the exact same time and date? | ||
… two different customers cannot drop two different bikes at the exact same time and date? | ||
… an employee cannot repair two bikes at the same time? | ||
… a customer can be assigned to more than one employee? | ||
… a customer can have a bike repaired by an employee that is not assigned to him/her? | ||
… a bike can be in the database without having been dropped by a customer? | ||
… an employee can be asked to repair a bike without having that type of bike as one of their specialties? |
Convert that ER diagram into a relational model. Try to make as few assumptions as possible.
Consider the following ER diagram:
Using this diagram, answer the following:
Is it true that … | Yes | No |
---|---|---|
… a label can have multiple logos? | ||
… a recording can be released by multiple labels and at different dates? | ||
… a record shop can have multiple exclusivities? | ||
… two record shops can have the same address? | ||
… two logos can have the same name? | ||
… two recordings can have the same title? | ||
… a record shop must sell at least one recording? |
Convert that ER diagram into a relational model. Try to make as few assumptions as possible.
Problem 4.11 (ER-to-Relation mapping for Country)
Consider the following ER schema:
where
For this relationship, on the left-hand side is the language that borrows a word and on the right-hand side is the language that provides the loanword.
Map that ER diagram to a relational database schema.
Consider the following requirements for a UNIVERSITY database used to keep track of students’ transcripts.
Consider the following relation and its functional dependencies:
CAR_SALE(Car_no, Date_sold, Salesman_no, Commission, Discount_amt)
{Car_no, Salesman_no} | → | {Date_sold, Commission, Discount_amt} |
Date_sold | → | Discount_amt |
Salesman_no | → | Commission |
and let {Car_no, Salesman_no} be the primary key of this relation.
Consider the following relation:
REL(A, B, C, D, E)
Suppose we have the following dependencies:
A | → | D |
{A, B} | → | C |
D | → | E |
Consider the following relation:
SCHEDULE(Period_Start, Period_End, Date, Room, Building, Organizer, Length)
And the following dependencies:
{Period_Start, Date} | → | {Room, Period_End} |
{Period_Start, Length} | → | Period_End |
{Period_Start, Period_End} | → | Length |
{Period_End, Length} | → | Period_Start |
{Date, Period_Start} | → | Organizer |
Room | → | Building |
Consider the following relation:
FLIGHT(From, To, Airline, Flight#, Date_Hour, HeadQuarter, Pilot, TZDifference)
A tuple in the FLIGHT relation contains information about an airplane flight: the airports of departure and arrival, the airline carrier, the number of the flight, its time of departure, the headquarter of the company chartering the flight, the name of the pilot(s), and the time zone difference between the departure and arrival airports.
Normalize the “FLIGHT” relation to its third normal form. You can indicate your steps, justify your reasoning, and indicate the foreign keys if you want to, but you do not have to.
This problem asks you to convert business statements into dependencies. Consider the following relation:
BIKE(Serial_no, Manufacturer, Model, Batch, Wheel_size, Retailer)
Each tuple in the relation BIKE contains information about a bike with a serial number, made by a manufacturer, with a particular model number, released in a certain batch, which has a certain wheel size, and is sold by a certain retailer.
This problem asks you to convert business statements into dependencies. Consider the following relation:
ROUTE(Name, Direction, Fare_zone, Ticket_price, Type_of_vehicle, Hours_of_operations)
A tuple in the ROUTE relation contains information about a public transportation route: its name (e.g. “Gold”, “Green”, …), its direction (e.g., “Medical Campus”, “GCC”, …), the fare zone where the route operates (e.g., “Zone 1”, “Zone 2”, …), the price of a ticket, the nature of the vehicles assuring the route (e.g., “subway”, “bus”, …) and the time of operations (e.g., “24 hours a day”, “from 0600 to 2200”, etc.).
Consider the following business statement:
We want to represent the market of Internet Service Providers (ISP). Each ISP offers multiple bundles, that have a maximum bandwith and a price. Some ISP uses the same name for their bundles (e.g. “premium”, or “unlimited”). Each ISP is given multiple Internet Protocol addresses (IP), and those never change. Every client has a ID that is proper to the ISP (i.e., ISP A and ISP B could both have a client with ID “00001”), an email and subscribes to a particular bundle from a particular ISP. The IP of a client changes over the time.
Consider the relations R and T below and their functional dependencies (as well as the one induced by the primary keys):
R(E͟v͟e͟n͟t͟I͟d͟, E͟m͟a͟i͟l͟, Time, Date, Location, Status) |
T(I͟n͟v͟n͟o͟, Subtotal, Tax, Total, Email, Lname, Fname, Phone) |
{EventId, Email} | → | Status |
EventId | → | {Time, Date, Location} |
Invno | → | {Subtotal, Tax, Total, Email} |
→ | {Fname, Lname, Phone} |
Normalize the relations to 2NF and 3NF. Show all relations at each stage (2NF and 3NF) of the normalization process.
Consider the following relation for published books:
BOOK(Book_title, Book_type, Author_name, List_price, Author_affil, Publisher)
Suppose we have the following dependencies:
Book_title | → | { Publisher, Book_type } |
Book_type | → | List_price |
Author_name | → | Author_affil |
Consider the following relation for deliveries:
DELIVERY(Shipment, PackageNumber, RecipientName, Weight, DriverName, DriverPhone, RecipientPhone)
Suppose we have the following functional dependencies:
Shipment | → | DriverName |
PackageNumber | → | Shipment |
PackageNumber | → | {RecipientName, RecipientPhone} |
PackageNumber | → | Weight |
DriverName | → | DriverPhone |
Answer the following three questions:
Consider the relation
CONTACT(Phone, Call_center, Email, Zip, Brand, Website)
and the following functional dependencies:
{Zip, Brand} | → | {Phone} |
{Brand} | → | {Email} |
{ Brand} | → | {Website} |
{Phone} | → | {Call_center} |
Assume that {Zip, Brand} is the primary key. Normalize this relation to the second normal form, and then to the third normal form. Give the relations, their primary keys, and functional dependencies for both steps.
This exercise asks you to convert business statements into dependencies. Consider the following relation:
MESSAGE(SenderId, Time, Date, ReceiverId, Content, Length, Attachment, Size)
A tuple in the MESSAGE relation contains information about a text message: its sender, the time and date when it was sent, the receiver, the content, the length (in characters), the attachment, and the size (in bytes).
Normalize the following relation to the third normal form.
Do not forget to indicate all the primary keys in your relations.
Consider the relation
CONSULTATION(Doctor_no, Patient_no, Date, Diagnosis, Treatment, Charge, Insurance)
with the following functional dependencies:
{Doctor_no, Patient_no, Date} | → | {Diagnosis} |
{Doctor_no, Patient_no, Date} | → | {Treatment} |
{Treatment, Insurance} | → | {Charge} |
{Patient_no} | → | {Insurance} |
Consider the relation
COFFEE(Origin, Type_Of_Roast, Price, Roasted_Date, Best_Before, Color, Customer, Rating)
with the following functional dependencies:
{Origin, Type_Of_Roast} | → | Price |
{Origin, Type_Of_Roast, Customer} | → | Rating |
{Origin, Type_Of_Roast, Roasted_Date} | → | Color |
Roasted_Date | → | Best_Before |
Assume that all the attributes are atomic and answer the following.
A network card (NIC) has a manufacturer, a model, and a unique serial number (MAC address). It offers one or multiple network technologies (ethernet, wi-fi, bluetooth, etc.), and can be connected to the motherboard using one or multiple connections (PCI connector, FireWire, usb, etc.).
Consider the following ER schema for the CAR_INFO database:
Note that a car can have at most one driver, N passengers, N insurances, and that the car insurance entity exists only if it is “tied up” to a car (i.e., it is a weak entity, and its identifying relationship is called “Insured”).
You are asked to design a database for a network of libraries.
Each library has a name, an address (made of a number, a street, and a zip), and have copies of documents available to borrow and to reserve. A document is of a particular kind (book, video, or disk), has a title, and an internal catalog number (that can be the ISBN, a barcode, etc.). There can be multiple copies of a document in the network, and each copy has a particular unique code. A copy of a document always “belongs” to a particular library, even when it is checked out.
Furthermore, you want to be able to add the patrons in your database. A patron has a name, a unique library card number, and an email. A patron can reserve (put a hold on) multiple copies of documents for up to two weeks, and can borrow multiple copies of documents for one week if it is a video or a disk, and one month if it is a book. Of course, a copy can be borrowed by only one patron, but it can be put on hold for one patron while being borrowed.
This problem requires you to have successfully completed Pb 4.7 and Pb 4.33.
Using the relational database schema you obtained in Pb 4.33, write the SQL implementation of that database. Then, using MySQL Workbench, use the “Reverse Engineering” function to obtain an EER diagram of your database and compare it with the UML diagram from Pb 4.33. Apart from the difference inherent to the nature of the diagram (i.e., UML vs EER), how else are they different? How are they the same? Is the automated tool as efficient and accurate as you are?
This exercise asks you to convert business statements into dependencies. Consider the following relation:
KEYBOARD(Manufacturer, Model, Layout, Retail_Store, Price)
A tuple in the KEYBOARD relation contains information about a computer keyboard; its manufacturer, its model, its layout (AZERTY, QWERTY, etc.), the place where it is sold, and its price.
Write each of the following business statements as a functional dependency:
Based on those statements, what could be a key for this relation?
Assuming all those functional dependencies hold, and taking the primary key you identified at the previous step, what is the degree of normality of this relation? Justify your answer.
Consider the UML diagram below, and convert it to the relational model. Do not forget to indicate primary and foreign keys.
A possible solution is:
Note that CONTACT
could be a weak entity with the identifying relationship being either DISCUSSED_BY
or EMPLOYS
, but both have disadvantages: they would not allow a contact to discuss more than one offer or to be hired by more than one company.
Among the numerous flaws, come to mind:
Is it true that … | Yes | No |
---|---|---|
… a customer cannot drop two bikes at the exact same time and date? | ✔ | |
… two different customers cannot drop two different bikes at the exact same time and date? | ✔ | |
… an employee cannot repair two bikes at the same time? | ✔ | |
… a customer can be assigned to more than one employee? | ✔ | |
… a customer can have a bike repaired by an employee that is not assigned to him/her? | ✔ | |
… a bike can be in the database without having been dropped by a customer? | ✔ | |
… an employee can be asked to repair a bike without having that type of bike as one of their specialties? | ✔ |
1:M
relationships that are not identifying, we can choose between the foreign key and the cross-reference approaches. If we use the former, we obtain:Is it true that … | Yes | No |
---|---|---|
a label can have multiple logos? | ✔ | |
a recording can be released by multiple labels and at different dates? | ✔ | |
a record shop can have multiple exclusivities? | ✔ | |
two record shops can have the same address? | ✔ | |
two logos can have the same name? | ✔ | |
two recordings can have the same title? | ✔ | |
a record shop must sell at least one recording? | ✔ |
The CAR_SALE relation is in 1st normal form, since it has a primary key, and by assuming that all the attributes are atomic. This relation is not is 2nd Normal Form: since Date_sold → Discount_amount and Salesman_no → Commission, then some attributes (namely Discount_amount and Commission) are not fully functional dependent on the primary key. Hence, this relation cannot be in 3rd normal form either.
To normalize,
2NF:
Relations | Functional Dependencies |
---|---|
Car_Sale1(Car_no, Date_sold, Discount_amt) | Car_no → {Date_Sold, Discount_amt} and Date_Sold → Discount_amt |
Car_Sale2(Car_no, Salesman_no) | Car_no → Salesman_no |
Car_Sale3(Salesman_no, Commission) | Salesman_no → Commission |
3NF:
Relations | Functional Dependencies |
---|---|
Car_Sale1-1(Car_no, Date_sold) | Car_no → Date_Sold |
Car_Sale1-2(Date_sold, Discount_amt) | Date_Sold → Discount_amt |
Car_Sale2(Car_no, Salesman_no) | Car_no → Salesman_no |
Car_Sale3(Salesman_no,Commission) | Salesman_no → Commission |
{Period_Start, Date} would be a suitable primary key.
This relation is already in second normal form: there are no non-prime attributes that are not fully dependent of the primary key. Stated differently, there are no non-prime A such that {Period_Start} → A or {Date} → A.
This relation is not in 3rd normal form. Consider the following relation: {Period_Start, Date} → {Period_Start, Period_End} → Length. {Period_Start, Period_End} is different from {Period_Start, Date} and from Length, and it is not included in a candidate key. The same goes for {Period_Start, Date} → Room → Building.
Once normalized to the third normal form, we get:
The relation we consider is:
ROUTE(Name, Direction, Fare_zone, Ticket_price, Type_of_vehicle, Hours_of_operations)
ISP(ISP, bunle, bandwith, price, IP, ID, email, time)
The functional dependencies suggested by the business statement are:
{ISP, bundle} | → | {bandwidth, price} |
IP | → | ISP |
{ISP, ID} | → | {email, bundle} |
{ISP, ID, time} | → | IP |
We obtain the following four relations when we normalize it to the third normal form:
After normalizing PRINT to the second normal form (by adding the primary key {Author, Title, Size}, and working on dependencies like {Author, Title} → Technique, which does not fully depend of the primary key), we would obtain three relations that are already in third normal form:
CONSULTATION (Doctor_no, Patient_no, Date, Diagnosis, Treatment)
PRICE_LISTING (Treatment, Charge)
The original relation is:
COFFEE(Origin, Type_Of_Roast, Price, Roasted_Date, Best_Before, Color, Customer, Rating)
A suitable primary key would be PKCOFFEE = {Origin, Type_Of_Roast, Roasted_Date, Customer}. Note that it is the minimal and only primary key.
This relation is in first normal form because it has a primary key (the one we just defined), and because all the attributes are atomic. It is not in second normal form, because, for example, the functional dependency PKCOFFEE → Price is not fully functionally dependent, since {Origin, Type_Of_Roast} → Price holds.
Normalizing to the second normal form actually gives us relations in third normal form:
-CLIENT_RATING(Origin, Type_Of_Roast, Customer, Rating)
-PRICING(Origin, Type_Of_Roast, Price )
-EXPIRATION_DATE(Roasted_Date, Best_Before)
-COFFEE_BATCH(Origin, Type_Of_Roast, Roasted_Date, Color)
Where the functional dependencies always are in such a way that all the attributes but the last one fix the value of the last one, and are taken to be the primary key.
Checking that they are all in third normal form is straightforward. Note that the “original” relation was somewhat lost, since we do not have a relation whose primary key is PKCOFFEE anymore. We could have re-introduced a relation with only the attributes of PKCOFFEE to be on the “safe side”, but the benefit would not have been clear
For Car, we need to create an attribute, like VIN. For Car Insurance, Policy Number is the perfect key attribute.
Note that, during the coversion, we had to make Insured Car part of the primary key of CAR INSURANCE.
We give the code first, then the drawing:
/* code/sql/HW_Person.sql */ DROP SCHEMA IF EXISTS HW_Person; CREATE SCHEMA HW_Person; USE HW_Person; CREATE TABLE PERSON ( ID VARCHAR(25) PRIMARY KEY, NAME VARCHAR(25), Street VARCHAR(25), City VARCHAR(25), Seat VARCHAR(25), Position VARCHAR(25) ); CREATE TABLE CAR ( Vin VARCHAR(25) PRIMARY KEY, Make VARCHAR(25), Model VARCHAR(25), Year DATE, Driver VARCHAR(25), FOREIGN KEY (Driver) REFERENCES PERSON (ID) ON UPDATE CASCADE ); ALTER TABLE PERSON ADD FOREIGN KEY (Seat) REFERENCES CAR (Vin); CREATE TABLE CAR_INSURANCE ( Policy_number VARCHAR(25) PRIMARY KEY, Company_name VARCHAR(25), Insured_car VARCHAR(25), FOREIGN KEY (Insured_car) REFERENCES CAR (Vin) ); CREATE TABLE PHONE ( ID VARCHAR(25), Number VARCHAR(25), FOREIGN KEY (ID) REFERENCES PERSON (ID), PRIMARY KEY (ID, number) );