Module 4: Database design theory and Normalisation

alt text

Design Guidelines

Informal measures of relational database schema quality and design guidelines

  1. Making sure that the semantics of the attributes is clear in the schema.
  2. Reducing the redundant values in tuples.
  3. Reducing the null values in tuples.
  4. Disallowing the possibility of generating spurious tuples.

Formal Design guidelines

Guideline 1: Design each relation so that it is easy to explain its meaning

alt text

Redundant values in Tuples

Example given in lectures

alt text

Also:

level -> Salary

Modification Anomalies

Deletion Anomalies

Insertion Anomalies

Guideline 2: Design the base relation schema so that no insertion, deletion, or modification anomalies occur in the relations

Guideline 3: As far as possible, avoid placing attributes in a base relation whose values may be null

Decomposing and joining a relation

Decomposition

alt text

Before After
alt text alt text

Join operation

Definition: R1R2R1 \bowtie R2 is the natural join of the two relations

alt text

Lossy Join operation The word loss in lossless refers to loss of information, not loss of tuples

alt text

Functional dependencies

Databases allow you to say that one attribute determines another through a functional dependency

Formal Definition

A functional dependency (FD) XYX \rightarrow Y holds on relation R if for every legal instance of RR such as rr, for all tuples t1, t2:

if    t1[X]=t2[X]t1[Y]=t2[Y] \text{if} \; \; t_{1}[X] = t_{2}[X] \rightarrow t_{1}[Y] = t_{2}[Y]

Keys

A key is a minimal set of attributes that uniquely identify a relation

alt text

They have added the level to the field list - since it still uniquely identifies the relation, it is a superkey

Implicit and Explicit FD's

Given a set of (explicit) functional dependencies, we can determine implicit ones

NOTE: this example above is called a transitive dependency, we will learn more about this next week!

Many implicit FDs can be derived

alt text

Closure of X (Finding Candidate Keys)

Closure of XX or X+X^{+} is the set of attributes determined by X under F.

alt text

Will do will this to prove superkeys, or find candidate keys

A note about finding keys

Tips for finding keys:

  1. If an attribute does not appear on the RHS of any FDs in F, a key must contain that attribute
  1. If a subset S is a key, there is no need to test any superset of S
  1. One relation can have multiple keys of different length

More on this in the tutorial

YouTube videos which will help:

Finding a candidate key, simple

Finding a candidate key, more difficult