The relational data model (or relational database schema) is:
List_of_major
as an enumerated data type, instead of just String
), enforce some constraints (i.e., UNIQUE
, to force all the values to be different), or even have a default value.NULL
value.NULL
is N/A, unknown, unavailable (or withheld).We now study constraints on the tuples. There are constraints on the scheme, for instance, “a relation cannot have two attributes with the same name”, but we studied those already. The goal of those constraints is to maintain the validity of the relations, and to enforce particular connexions between relations.
Those are part of the definition of the relational model and are independent of the particular relation we are looking at.
Those constraints are parts of the schema.
NOT NULL
, UNIQUE
).NULL
.Those last two constraints will be studied in the next section.
Constraints that cannot be expressed in the schema, and hence must be enforced by
Examples: “the age of an employee must be greater than 16”, “this year’s salary increase must be more than last year’s”.
Since we can not have two identical tuples in the same relation, there must be a subset of values that distinguish them. We study the corresponding subset of attributes.
Let us consider the following example:
A | B | C | D |
---|---|---|---|
Yellow | Square | 10 | (5, 3) |
Blue | Rectangle | 10 | (3, 9) |
Blue | Circle | 9 | (4, 6) |
and the following sets of attributes:
{A, B, C, D} | {A} | {B, C} | {D} | |
---|---|---|---|---|
Superkey? | ✔ | ✘ | ✔ | ✔ |
Key? | ✘ | ✘ | ✔ | ✔ |
Note that here we “retro-fit” those definitions, in database design, they come first (i.e., you define what attributes should always distinguish between tuples before populating your database). We are making the assumption that the data pre-exist to the specification to make the concept clearer.
A foreign key (FK) is a set of attributes whose values must match the value in a tuple in another, pre-defined relation. Formally, the set of attributes FK in the relation schema R1 is a foreign key of R1 (“referencing relation”) that references R2 (“referenced relation”) if
NULL
If there is a foreign key from R1 to R2, then we say that there is a referential integrity constraint from R1 to R2. We draw it with an arrow from the FK to the PK. Note that it is possible that R1 = R2.
NULL
. Note also that all the values must be different, as the same value cannot occur twice as the primary key of tuples: we don’t want to enter the same VIN twice, that would mean we are registering a car that was already registered in our database!NULL
. Furthermore, their pair must be different from all the other values. Stated differently, you can have <GA, 1234>
, <GA, 0000>
and <NC, 1234>
as values for the <State, License-num> pair, even if they have one element in common, what is forbidden is to have both element in common (i.e., you cannot have <GA, 1234>
twice). If both elements were common, that would mean that we are registering a driver that was already in the database.
Yes, we do need the state and the license number to uniquely identify a driver’s license, since many states use the same license format.
NULL
(which it could be), then it has to be a value that occurs as the VIN value of some tuple in the CAR relation. For the Insured-Driver-State and Insured-Driver-License-Num, the situation is similar: they must either both be NULL
, or be values that occurs paired together as the values for State and Licence-Num in a tuple in the CAR relationship. If e.g. Insured-Car was containing the VIN of a car not in the CAR relation, that would mean we are trying to insure a car that is “not known” from the database’s perspective, something we certainly want to avoid.The operations you can perform on your data are of two kinds: retrievals and updates.
There are two constraints for updates:
A transaction is a series of retrievals and updates performed by an application program, that leaves the database in a consistent state.
In the following, we give examples of insertion, deletion and update that could be performed, as well as how they could lead a database to become inconsistent. The annotations (1.), (2.) and (3.) refer to the “remedies”, discussed afterward.
Insert <109920, Honda, Accord, 2012> into CAR
How things can go wrong:
NULL
for any value of the attributes of the primary key (1.)Delete the DRIVER tuple with State = GA and License_number = 123
How things can go wrong:
Update Name of tuple in DRIVER where State = GA and License_number = 123 to Georges
How things can go wrong:
NULL
for the any value of the attributes of the primary key (1.)When the operation leads the database to become inconsistent, you can either:
NULL
, the corresponding value(s).Exercise 2.1
Match the items in Column A to the items in Column B:
Column A | Column B |
---|---|
Row | Attribute |
Column header | Tuple |
Table | Relation |
Exercise 2.2
What do we call the number of attributes in a relation?
Exercise 2.3
At the logical level, does the order of the tuples in a relation matter?
Exercise 2.4
What is the difference between a database schema and a database state?
Exercise 2.5
What should we put as a value in an attribute if its value is unknown?
Exercise 2.6
What, if any, is the difference between a superkey, a key, and a primary key?
Exercise 2.7
Name the two kinds of integrity that must be respected by the tuples in a relation.
Exercise 2.8
What is entity integrity? Why is it useful?
Exercise 2.9
Are we violating an integrity constraint if we try to set the value of an attribute that is part of a primary key to NULL
? If yes, which one?
Exercise 2.10
If in a relation R1, an attribute A1 is a foreign key referencing an attribute A3 in a relation R2, what does this implies about A2?
Exercise 2.11
Give three examples of operations.
Exercise 2.12
What is the difference between an operation and a transaction?
Exercise 2.13
Consider the following two relations:
COMPUTER(Owner, RAM, Year, Brand)
OS(Name, Version, Architecture)
For each, give
Exercise 2.14
Give three different ways to deal with operations whose execution in isolation would result in the violation of one of the constraint.
Exercise 2.15
Define what is the domain constraint.
Exercise 2.16
Consider the following three relations:
For each relation, answer the following:
Exercise 2.17
Consider the following three relations
Insert <'AM-356', 'Surfliner', 2012> into TRAIN
Insert <NULL, 'Graham Palmer', 'Senior'> into CONDUCTOR
Insert <'XB-124', 'GPalmer', '02/04/2018'> into ASSIGNED-TO
Insert <'BTed', 'Bobby Ted', 'Senior'> and <'BTed', 'Bobby Ted Jr.', 'Junior'> into CONDUCTOR
Exercise 2.18
Consider the following relation schema and state:
A | B | C | D |
---|---|---|---|
2 | Blue | Austin | true |
1 | Yellow | Paris | true |
1 | Purple | Pisa | false |
2 | Yellow | Augusta | true |
Assuming that this is all the data we will ever have, discuss whenever {A, B, C, D}, {A, B} and {B} are superkeys and/or keys.
Exercise 2.19
Consider the following relation and possible state. Assuming that this is all the data we will ever have, give two superkeys, and one key, for this relation.
A | B | C | D |
---|---|---|---|
1 | Austin | true |
Shelly |
1 | Paris | true |
Cheryl |
3 | Pisa | false |
Sheila |
1 | Augusta | true |
Ash |
1 | Pisa | true |
Linda |
Exercise 2.20
Consider the following relation and possible state. Assuming that this is all the data we will ever have, give three superkeys for this relation, and, for each of them, indicate if they are a key as well.
A | B | C | D |
---|---|---|---|
1 | A | Austin | true |
2 | B | Paris | true |
1 | C | Pisa | false |
2 | C | Augusta | true |
1 | B | Augusta | true |
Exercise 2.21
Consider the following two relations:
INSERT
and one UPDATE
instruction. Both should violate the integrity of your database.Exercise 2.22
Consider the following two relations:
Solution 2.1
Row is Tuple, Column header is Attribute, Table is Relation.
Solution 2.2
The degree, or arity, of the relation.
Solution 2.3
No, it is a set.
Solution 2.4
The schema is the organization of the database (the meta-data), while the state is the state is the content of the database (the data).
Solution 2.5
NULL
Solution 2.6
A superkey is a subset of attributes such that no two tuples have the same combination of values for all those attributes. A key is a minimal superkey, i.e., a superkey from which we cannot remove any attribute without losing the uniqueness constraint. The primary key is one of the candidate key, i.e., the key that was chosen.
Solution 2.7
Referential integrity and entity integrity.
Solution 2.8
Entity integrity ensures that each row of a table has a unique and non-null primary key value. It allows to make sure that every tuple is different from the others, and helps to “pick” elements in the database.
Solution 2.9
Yes, the entity integrity constraint.
Solution 2.10
Then we know that A2 is the primary key of R2, and that A1 and A2 have the same domain.
Solution 2.11
Reading from the database, performing UPDATE
or DELETE
operations.
Solution 2.12
An operation is an “atomic action” that can be performed on the database (adding an element, updating a value, removing an element, etc.). A transaction is a series of such operations, and the assumption is that, even if it can be made of operations that, taken individually, could violate a constraint, the overall transaction will leave the database in a consistent state.
Solution 2.13
Solution 2.14
An operation whose execution in isolation would result in the violation of a constraint can either a) be “restricted” (i.e., not executed), b) result in a propagation (i.e., the tuples that would violate a constraint are updated or deleted accordingly), or c) result in some values in tuples that would violate a constraint to be set to a default value, or the NULL
value (this last option works only if the constraint violated is the referential entity constraint).
Solution 2.15
The requirement that each tuple must have for an attribute A an atomic value from the domain dom(A), or NULL
.
Solution 2.16
To answer 1 and 2, the diagram would become:
For the last question, the answer is yes: based on the ISSN of the book, we can retrieve the author of the book. Hence, knowing which book was awarded which year, by looking in the GAINED-AWARD table, gives us the answer to that question.
Solution 2.17
NULL
can be given as a value to an attribute that is part of the PK.'XB-124'
and 'GPalmer'
are not values in TRAIN.Ref
and CONDUCTOR.CompanyID
.Solution 2.18
Solution 2.19
Possible superkeys are {A, B, C, D}, {A, B, C}, {A, C, D}, {B, C, D}, {A, B}, {B, C}. The possible keys are {A, B}, {A, C}, and {B, C}.
Solution 2.20
For this relation, {A, B, C, D}, {A, B, C}, and {D} are superkey. Only the latter, {D}, is a key (for {A, B, C}, removing either A or C still gives a superkey).
Solution 2.21
INSERT <"A.H.", NULL>
would violate the requirement not to have two tuples with the same value for the attributes that constitute the primary key in the BUILDING relation. UPDATE ROOM with CODE = 12 to Building = "G.C.C."
would create an entry referencing a name in the BUILDING relation that does not exist.Problem 2.1 (Find a candidate key for the CLASS relation)
Consider the relation:
CLASS(Course_Number, Univ_Section_Number, Instructor_Name, Semester, Building_Code, Room_Number, Time, Weekdays, Credit_Hours)
Here are some examples of values for the attributes:
Attribute | Possible Value |
---|---|
Course_Number | CSCI3410, CSCI1302 |
Building_Code | AH, UH, ECC |
Univ_Section_Number | 1, 2, 3 |
Room_Number | E127, N118 |
Instructor_Name | John Smith, Sophie Adams |
Time | 1400, 1230, 0900 |
Semester | Spring 2015, Fall 2010, Summer 2012 |
Weekdays | M, MW, MWF, T, TH |
Credit_Hours | 1, 2, 3, 4 |
Problem 2.2 (Design a relational model for a cinema company)
A cinema company wants you to design a relational model for the following set-up:
Problem 2.3 (Design a relational model for bills)
Propose a relational model for the following situation:
Problem 2.4 (Relational model for universities)
Propose a relational model for the following situation:
Problem 2.5 (Relational model for an auction website)
We want to design a relational model for an auction website. Members (that can be buyers, sellers, both or neither) can participate in the sale of items.
When creating your schema, do not add any new information, and try as much as possible to avoid relations that will create redundant data and NULL
entries. Note that we should be able to uniquely determine the member account linked to the seller account, and similarly for buyers accounts. Furthermore, members can have at most one buyer and one seller account.
Solution to Problem 2.2 (Design a relational model for a cinema company)
A possible solution is:
Solution to Problem 2.3 (Design a relational model for bills)
Be careful: saying that a bill has a unique sponsor does not imply that a the sponsor is a good primary key for the bills: a house member could very well be the sponsor of multiple bills! It just implies that a single attribute is enough to hold the name of the sponsor.
For simplicity, we added an ID
to our MEMBER
and BILL
relations. Note that having a “role” in the MEMBER
relation to store the information about speaker, etc., would be extremely inefficient, since we would add an attribute to the ~435 members that would be NULL
in ~430 of them.
Solution to Problem 2.4 (Relational model for universities)
A possible solution follows. The part that is the hardest to accomodate is the fact that a course can have multiple codes. We are reading here “cross-listed” as “a course that is offered under more than one departmental heading and can receive different codes (e.g., CSCI XXXX and AIST YYYY)”.