Normalization is the systematic process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to strict mathematical rules designed to protect the data and eliminate redundancy.
When a database is not normalized, it suffers from Update Anomalies:
A relation is in 1NF if and only if the domain of each attribute contains only atomic (indivisible) values, and the value of each attribute contains only a single value from that domain.
The Rule: No arrays, no lists, no repeating groups.
Student table has a PhoneNumbers column storing "555-1234, 555-9876".Student_Phones table.A relation is in 2NF if it is in 1NF and every non-prime attribute is fully functionally dependent on the primary key.
The Rule: Eliminate partial dependencies.
StudentID and CourseID), and a non-prime attribute depends on only part of that key.Enrollment (StudentID, CourseID, CourseName, Grade). The CourseName depends only on the CourseID, not the StudentID. This is a partial dependency.Enrollment (StudentID, CourseID, Grade) and Courses (CourseID, CourseName).A relation is in 3NF if it is in 2NF and there is no transitive dependency for non-prime attributes.
The Rule: Eliminate transitive dependencies. An attribute must not depend on another non-key attribute. (Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key).
Employees (EmpID, Name, DeptID, DeptLocation). Here, EmpID -> DeptID, and DeptID -> DeptLocation. Therefore, DeptLocation transitively depends on EmpID. If the department moves, you have to update the location for thousands of employees.Employees (EmpID, Name, DeptID) and Departments (DeptID, DeptLocation).Industry Standard: In real-world enterprise databases, achieving Third Normal Form (3NF) is generally considered "normalized enough." While higher normal forms exist, enforcing them often degrades read performance too severely.