Introduction
Normalization is Nothing but efficiently organizing the data in SQL Server.Its very easy to understand when data in a table is simple and well-organised That is what Normalization do.
Structuring of data can be of 2 types.
- Normalization
- De-Normalization
See the below image (Fig-1).
Fig-1 |
Lets Started
In this article we will discuss "Normalization" and why this is important.Normalization is nothing but simplifying the records , in other we can say we make a table as simple as possible by removing the odd one. So that its very easy to understand the purpose of the each table.
Here we will discuss most widely used Normal Forms(NF-1,NF-2,NF-3).
Lets take an example of the below table. The table consists of Employee data as well as their department details.
We can see the EName column contains the name of the employee which consists of more than one employee in a single record. e.g. - RAJU,ABHI. Which is not the correct way to represent because which may cause problem while complex calculation in future.
So each employee should have their individual data.
TABLE1
EName
|
Salary Monthly
|
Salary Yearly
|
DEPT NAME
|
RAJU,ABHI
|
12000
|
144000
|
HR
|
AKU,RAVI,RAKESH
|
10000
|
120000
|
FINANCE
|
RAJ
|
20000
|
240000
|
EDUCATION
|
SANU
|
11000
|
132000
|
IT
|
MUNU
|
9000
|
108000
|
ADMIN
|
FIRST NORMALIZATION FORM(NF-1)
As per the 1st Normal form a table can not hold multiple values . It should have atomic values.
See the below table(TABLE2) EName contains the atomic values(single values) unlike the TABLE1.
Eid
|
EName
|
Salary Monthly
|
Salary Yearly
|
DEPT NAME
|
1
|
RAJU
|
12000
|
144000
|
HR
|
2
|
AKU
|
10000
|
120000
|
FINANCE
|
3
|
RAJ
|
20000
|
240000
|
EDUCATION
|
4
|
SANU
|
11000
|
132000
|
IT
|
5
|
MUNU
|
9000
|
108000
|
ADMIN
|
6
|
ABHI
|
12000
|
144000
|
HR
|
7
|
RAVI
|
10000
|
120000
|
FINANCE
|
8
|
RAKESH
|
10000
|
120000
|
FINANCE
|
But we can see the above table contains Lots of redundant data . Lets normalize the table better with the help of 2ND NORMALIZATION FORM.
SECOND NORMALIZATION FORM(NF-2)
It says
- It should fulfill 1st Normalization form .
- Non-key columns should be fully and functionally dependent on the primary key of the table.
In TABLE2 see the Dept Name column is not directly dependent on the primary key. So we break the tables into two tables as below.
- Employee
- Department
Department
Dept id | |
1
|
HR
|
2
|
FINANCE
|
3
|
EDUCATION
|
4
|
IT
|
5
|
ADMIN
|
Employee
eid
|
EName
|
Salary Monthly
|
Salary Yearly
|
Dept id
|
1
|
RAJU
|
12000
|
144000
|
1
|
2
|
AKU
|
10000
|
120000
|
2
|
3
|
RAJ
|
20000
|
240000
|
3
|
4
|
SANU
|
11000
|
132000
|
4
|
5
|
MUNU
|
9000
|
108000
|
5
|
6
|
ABHI
|
12000
|
144000
|
1
|
7
|
RAVI
|
10000
|
120000
|
2
|
8
|
RAKESH
|
10000
|
120000
|
2
|
As Department is not directly dependent to the Employee table, we break the table table into two parts as above and associated the two table by using Dept id.
The above table Satisfy NF-2.
The above table Satisfy NF-2.
Here we can see still some redundant data are available in Employee table. Lets Understand how we can get rid of it by using Third Normalization Form(NF-3)
THIRD NORMALIZATION FORM(NF-3)
This says
- It should follow NF-1 and NF-2.
- It should not have dependency between the non key columns.
As the Employee table above satisfy both NF-1 and NF-2 , but not NF-3 we need to check how can we satisfy NF-3 by modifying the table.
Lets take Salary Monthly and Salary Yearly. Both are non key columns and both are related. We can say both are directly proportional to each other.
Salary Monthly=Salary Yearly*12
if we have one column we can calculate the other without any problem.
So its better to remove "Yearly Salary".
We can find a table as below.
Employee
eid
|
EName
|
Salary Monthly
|
Dept id
|
1
|
RAJU
|
12000
|
1
|
2
|
AKU
|
10000
|
2
|
3
|
RAJ
|
20000
|
3
|
4
|
SANU
|
11000
|
4
|
5
|
MUNU
|
9000
|
5
|
6
|
ABHI
|
12000
|
1
|
7
|
RAVI
|
10000
|
2
|
8
|
RAKESH
|
10000
|
2
|
Department
Dept id | |
1
|
HR
|
2
|
FINANCE
|
3
|
EDUCATION
|
4
|
IT
|
5
|
ADMIN
|
The Above both the table satisfy NF-3. These table will satisfy NF-1,NF-2,NF-3.
Points to remember while using Normalization
Performance will decrease when Joining takes place.
Need to create lots of key to achieve this .
Need to divide single table to multiple table.
Thanks................