Before we can eliminate data redundancy by normalizing our database, we must understand the mathematical relationships between the columns in our tables. This is defined by Functional Dependencies (FDs).
A Functional Dependency is a constraint that specifies the relationship between two sets of attributes in a relation from a database.
If $R$ is a relation schema and $\alpha$ and $\beta$ are sets of attributes, then the functional dependency $\alpha \rightarrow \beta$ (read as "$\alpha$ functionally determines $\beta$") holds on $R$ if, for any two tuples (rows) in the table:
Consider a Student table with columns: StudentID, Name, DateOfBirth.
StudentID -> Name: This is a valid functional dependency. If two rows have the exact same StudentID, they must belong to the exact same person, so the Name must be identical.Name -> DateOfBirth: This is NOT a valid functional dependency. Two different students can have the exact same name ("John Smith"), but they might be born on completely different dates.{StudentID, Name} -> Name. This is mathematically obvious and doesn't give us any useful constraints.StudentID -> Name.To systematically discover all functional dependencies in a database, we use Armstrong's Axioms. These are a set of sound and complete inference rules.
Let $X$, $Y$, and $Z$ be sets of attributes.
From the three axioms above, we can prove several other extremely useful rules:
The Closure of a set of attributes $X$ with respect to a set of functional dependencies $F$, denoted as $X^+$, is the set of all attributes that are functionally determined by $X$.
Why is this important? If the closure of $X$ ($X^+$) contains absolutely every attribute in the entire table, then $X$ is a Super Key for that table! If $X$ is minimal, it is a Candidate Key.
Finding attribute closures is the mathematical foundation for proving what the Primary Key of a table should be.