🗄️SQL & Databases

Normalization

Updated 2026-05-15

10 min read

Normalization

Introduction

Normalization is a systematic approach to organizing the fields and tables of a relational database to minimize redundancy and dependency. It helps in improving data integrity, reducing storage requirements, and ensuring that the data is consistent. This tutorial will guide you through the process of normalization, focusing on its importance and practical applications.

Concept

Normalization involves breaking down large tables into smaller, more manageable pieces and organizing those pieces according to specific rules. The primary goal is to eliminate redundant data and ensure that each piece of information is stored in only one place. This is achieved by applying a series of normal forms, which are progressively stricter sets of rules.

Normal Forms

First Normal Form (1NF): Ensure that each column contains atomic values and that there are no repeating groups.
Second Normal Form (2NF): Achieve 1NF and ensure that all non-key attributes are fully functionally dependent on the primary key.
Third Normal Form (3NF): Achieve 2NF and ensure that there are no transitive dependencies between non-key attributes.

Examples

Let's walk through an example to understand how normalization works in practice.

Example Scenario

Suppose we have a database for a book store. Initially, we might design the following table:

SQL

1CREATE TABLE Books (
2  BookID INT PRIMARY KEY,
3  Title VARCHAR(255),
4  AuthorName VARCHAR(100),
5  AuthorEmail VARCHAR(100),
6  Genre VARCHAR(100)
7);

This table has some redundancy issues, such as storing the author's name and email multiple times if an author writes multiple books.

Applying First Normal Form (1NF)

To achieve 1NF, we need to ensure that each column contains atomic values. In this case, our table already meets 1NF because each column contains a single value.

Applying Second Normal Form (2NF)

To achieve 2NF, we need to remove partial dependencies. A partial dependency occurs when a non-key attribute is dependent on only part of the primary key. In our Books table, the AuthorName and AuthorEmail are not fully functionally dependent on the BookID. Instead, they depend on the AuthorName.

We can resolve this by creating a separate table for authors:

SQL

1CREATE TABLE Authors (
2  AuthorID INT PRIMARY KEY,
3  Name VARCHAR(100),
4  Email VARCHAR(100)
5);

And then updating the Books table to reference the Authors table:

SQL

1ALTER TABLE Books
2ADD COLUMN AuthorID INT,
3ADD CONSTRAINT fk_author FOREIGN KEY (AuthorID) REFERENCES Authors(AuthorID);

Applying Third Normal Form (3NF)

To achieve 3NF, we need to remove transitive dependencies. A transitive dependency occurs when a non-key attribute depends on another non-key attribute. In our current design, there are no transitive dependencies.

However, if we add more attributes like PublisherName and PublisherAddress, we might introduce a transitive dependency:

SQL

1ALTER TABLE Books
2ADD COLUMN PublisherName VARCHAR(100),
3ADD COLUMN PublisherAddress VARCHAR(255);

To resolve this, we can create a separate table for publishers:

SQL

1CREATE TABLE Publishers (
2  PublisherID INT PRIMARY KEY,
3  Name VARCHAR(100),
4  Address VARCHAR(255)
5);

And then update the Books table to reference the Publishers table:

SQL

1ALTER TABLE Books
2ADD COLUMN PublisherID INT,
3ADD CONSTRAINT fk_publisher FOREIGN KEY (PublisherID) REFERENCES Publishers(PublisherID);

What's Next?

After understanding normalization, you might want to explore denormalization. Denormalization is the process of adding redundancy to a database in order to improve performance or simplify queries. It involves combining tables that have been normalized and reintroducing some level of redundancy.

Normalization and denormalization are complementary concepts, and the choice between them depends on the specific requirements of your application, such as performance needs versus data integrity.

By following these steps, you can design a well-structured database that is efficient, reliable, and easy to maintain.

🗄️SQL & Databases

Normalization

Updated 2026-05-15

10 min read

Normalization

Introduction

Concept

Normal Forms

First Normal Form (1NF): Ensure that each column contains atomic values and that there are no repeating groups.
Second Normal Form (2NF): Achieve 1NF and ensure that all non-key attributes are fully functionally dependent on the primary key.
Third Normal Form (3NF): Achieve 2NF and ensure that there are no transitive dependencies between non-key attributes.

Examples

Let's walk through an example to understand how normalization works in practice.

Example Scenario

Suppose we have a database for a book store. Initially, we might design the following table:

SQL

1CREATE TABLE Books (
2  BookID INT PRIMARY KEY,
3  Title VARCHAR(255),
4  AuthorName VARCHAR(100),
5  AuthorEmail VARCHAR(100),
6  Genre VARCHAR(100)
7);

This table has some redundancy issues, such as storing the author's name and email multiple times if an author writes multiple books.

Applying First Normal Form (1NF)

To achieve 1NF, we need to ensure that each column contains atomic values. In this case, our table already meets 1NF because each column contains a single value.

Applying Second Normal Form (2NF)

We can resolve this by creating a separate table for authors:

SQL

1CREATE TABLE Authors (
2  AuthorID INT PRIMARY KEY,
3  Name VARCHAR(100),
4  Email VARCHAR(100)
5);

And then updating the Books table to reference the Authors table:

SQL

1ALTER TABLE Books
2ADD COLUMN AuthorID INT,
3ADD CONSTRAINT fk_author FOREIGN KEY (AuthorID) REFERENCES Authors(AuthorID);

Applying Third Normal Form (3NF)

However, if we add more attributes like PublisherName and PublisherAddress, we might introduce a transitive dependency:

SQL

1ALTER TABLE Books
2ADD COLUMN PublisherName VARCHAR(100),
3ADD COLUMN PublisherAddress VARCHAR(255);

To resolve this, we can create a separate table for publishers:

SQL

1CREATE TABLE Publishers (
2  PublisherID INT PRIMARY KEY,
3  Name VARCHAR(100),
4  Address VARCHAR(255)
5);

And then update the Books table to reference the Publishers table:

SQL

1ALTER TABLE Books
2ADD COLUMN PublisherID INT,
3ADD CONSTRAINT fk_publisher FOREIGN KEY (PublisherID) REFERENCES Publishers(PublisherID);

What's Next?

Normalization and denormalization are complementary concepts, and the choice between them depends on the specific requirements of your application, such as performance needs versus data integrity.

By following these steps, you can design a well-structured database that is efficient, reliable, and easy to maintain.