Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF

The process of producing a simpler and more reliable database structure is called normalization. It is used to create a suitable set of relations for storing data.
 
This process work through different stages known as normal forms. These stages are 
  1. 1NF (First Normal Form)
  2. 2NF (Second Normal Form)
  3. 3NF (Third Normal Form)
  4. 4NF (Fourth Normal Form)
  5. 5NF (Fifth Normal Form)
  6. BCNF (Boyce-Codd Normal Form)
Each normal form has certain requirements or condition. These conditions have to fulfilled to bring the database in that particular normal form.
 
If a relation satisfies the conditions of a normal form, it is said to be in that normal form.
 
The task of database design starts with an unnormalized set of relations. The process of normalization identifies and corrects the problem and complexities of database design.
 
It produces a new set of relations. The new design is as free processing problems as possible
 

Purpose of Database Normalization

The purpose of normalization is as follows:

 

  1. It makes the database design efficient in performance.
  2. It reduces the amount of data if possible.
  3. It makes the database design free of update, insertion and deletion anomalies.
  4. It makes the design according to the rules of relational databases.
  5. It makes a design that allows simple retrieval of data.
  6. It identifies a relationship between entities
  7. It simplifies data maintenance and reduces the need to restructure data.

Characteristics of Database Normalization

A normalized database should have the following characteristics:

 

  1. Each relation must have a key field.
  2. All fields must contain atomic data.
  3. There must be no repeating fields.
  4. Each table must contain information about a single entity.
  5. Each field in a relation must depend on key fields.
  6. All non-key fields must be mutually independent.
Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF

 

Types of Normal Forms

First Normal Form(1NF)

A relation is in first normal form (1NF) if it does not contain a repeating group. A repeating group is a set of one or more data item that may occur a variable number of times in a tuple.
 
The value in each attribute value should be atomic and every tuple should be unique. Each cell in a relationship should contain only one value.
An example of a un-normalized relation is as follows:
Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF
 
The above relation is un-normalized because it contains repeating groups of three attributes Skill Number, Skill Category and Proficiency Number. All three fields contain more than one value.
 
In order to convert this relation in first normal form, these repeating groups should be removed. The following relation is in the first normal form:
 
Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF

Problems in 1NF

The relation in 1NF has a certain problem which are as follows:
 

Updating Problem

Suppose the user wants to change the name of Accountant Number 35 to “M. Daud”.
He has to the name in all records in which Accountant number 35 appears. This process of updating can be very lengthy.
 

Inconsistent Data

The above table may contain inconsistent data. There are three records of Accountant Number 35.
 
It is possible that there are two different names with Account Number 35 in two different records. The user can make this error during updating.
 

Addition problem

Suppose the user wants to add another skill number in the table. It is not possible until an Account with that skill exists because both Skill Number and Accountant Number are used as primary key in the above table.
 

Deletion problem

Suppose the user wants to delete the record of supervisor Ghafoor. If he deletes the whole record in which Ghafoor appears, the information about Accountants will also be lost.
 

Second Normal Form

A relation is in Second Normal Form (2NF) if it is in 1NF and if all of its non-key attributes are fully functionally dependent on the whole key. It means that none of the non-key attributes are related to a part of the key.
 

The above in 1NF has some attributes which are not depending on the whole primary key.
For example, Accountant Name, Accounting Age and Group information is determined by Accountant Number and is not dependent on Skill.

The following relation can be created in which all attributes are fully dependent on the primary key Account Number.

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF


Similarly, another relation Skill can be created in which all fields are fully dependent on the primary key as follows:

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF


The attribute Proficiency in 1NF relation was fully dependent on which whole primary key.

The Proficiency requires to know the accountant number and skill number. The third relation will be created as follows:

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF


There are three relations in second normal form (2NF).
The attributes of all relations are fully dependent on the primary keys,

Analysis of Second Normal Form (2NF)

The following analysis indicates whether the problems are eliminated in 2NF or not.

Updating problem:

If the user needs to change the name of Accountant Number 35 to “M. Daud” in 1NF, he must change the name in every record in which Accountant number 35 appears. But in 2NF, the record of one accountant appears only once. The updating problem is eliminated in 2NF.

Inconsistent Data: 

The record of one accountant appears only once in the database, the possibility of inconsistent data is automatically eliminated.

Addition Problem: 

In 1NF, it was not possible to enter a new Skill Number until an Accountant with that skill existed.

In 2NF, any number of skills can be added in Skill relation without an accountant with that skill. It eliminates the addition problem.

Deletion Problem: 

In 2NF, if the record of Ghafoor is deleted, it does not delete any other record.


The analysis shows that the second normal form has solved all the problem of 1NF.

Third Normal Form

A relation is in third normal form if it is in 2NF and if no non-key attribute is dependent on another non-key attribute.

It means that all non-key attributes are functionally dependent only on the primary key. There should be no transitive dependency in a relation.

In order to convert a relation to 3NF:

  1. Remove all attributes from the 2NF record that depend on another non-key field.
  2. Place them into a new relationship with the other attributes as the primary key.

The Accountant table in 2NF contains some attributes which are depending on non-key attributes.

For example, Group City and Group Supervisor are depending on a non-key field Group Number. A new relation can be created as follows:

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF


The Second table created to form the Accountant Table in 1NF is as follows:

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF


Both the Account table and Group table contain the attribute Group Number. This attribute is used to join both tables.

The Skill table in 2NF contains no attributes, which is depending on non-key attributes. It is already in third normal form and will be used without any further change.

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF


The proficiency table in 2NF also contains no attribute which is depending on the non-key attribute. It is already in third normal form and will be used without a further change.

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF

 

Boyce-Codd Normal Form (BCNF)

A relation is in Boyce-code normal form if and only if every determinant is a candidate key.

It can be checked by identifying all determinants and then making sure that all these determinants are candidate keys.

BCNF is a stronger form of third normal form. A relation in BCNF also in third normal form. But a relation in 3NF may not be in BCNF.

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF


Assume the PartName is unique. It means that no two parts can have the same name.

There are now two candidate keys(projectID, PartID)
and (projectID, PartName).

There are following dependencies:
(projectID, PartID)->QtyUsed
PartID->PartName
PartName->PartID

This relation satisfies 2NF because there are no non-key attributes that are dependent on a subset of the primary key. PartName is not a non-key attribute, it is part of a candidate key. Therefore this relation is in 2NF.

There are no transitive dependencies so the relation is in 3NF. PartID and PartName are ignored by 3NF rule because they are both key attributes.

In order to convert this relation to BCNF, all functional dependencies must be removed which have a determinant that is a candidate key. The result is as follows:

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF

 

Database Normalization-1NF-2NF-3NF-4NF-5NF-BCNF

Fourth Normal Form

A relation is in 4NF If it is in BCNF and has no multi-valued dependencies.

Fifth Normal Form

A relation is in fifth normal form if it has no join dependency. 

Fifth normal form (5NF) is also called project-join normal form(PJNF).

The first four normal forms are based on the concept of functional dependency. The fifth normal is based on the concept of join dependency.



You May Like: