Don't let the Lockdown slow you Down - Enroll Now and Get 3 Course at 24,999/- Only. Explore Now!

General

Normalization in SQL

Normalization in SQL

Normalization in SQL

In this tutorial, you will learn about data normalization in SQL. Normalization is actually a database design method that arranges the tables in a database with reduced dependency and redundancy of data. Normalization splits up the bigger tables to smaller ones and integrated them through relationships. Normalization improves data integrity. If you fail to use normalization, you could end up facing anomalies namely insertion, update, and deletion. Insertion anomalies happen to suppose if we couldn’t insert data into the table without another attribute’s availability. Update anomalies are actually an inconsistency in the data which could lead to data redundancy and incomplete data update. Deletion anomalies happen if you lose some attributes because of deleting other attributes.

Simply, the organization of data in the DB is called data normalization. Normalization actually demands the organization of columns and tables present in a DB to make sure that their dependants were correctly administered by the DB integrity constraints. It provides more efficiency because it splits up a bigger table to smaller ones.

Purpose of Normalization

As we all know, SQL is a language that is used to communicate with the DB. Any communication of data in the database has to be initiated and that must be normalized. Otherwise, you will end up in anomalies. It will improve data distribution as well. Normalization can be achieved by using normal forms. The normal forms we are going to learn are:

  • 1 NF (First Normal Form)
  • 2 NF (Second Normal Form)
  • 3 NF (Third Normal Form) and
  • Boyce Codd NF

Let’s see one by one with examples.

1 NF (First Normal Form)

We investigate the atomicity problem in 1 NF. In this context, atomicity implies that the values present in the table should not be divided or split up further. Simply, one cell could not carry several values. It is considered as a violation in 1 NF if a table holds a multiple value attribute. For example, have a look at the table below:

Student Admission NoStudent NameMobile NumberOutstanding Fees
1PRI001Aravindan-967890047625,000
-9556678854
1PRI002Darshan-98877653411,000
1PRI003Saravanan-944335669833,000
1PRI004Ramkumar-634567881050,000
-8667890476

Evidently, you can notice that the phone number column contains more than one value and thus, it is a violation in 1 NF. If we apply 1 NF, the table will automatically get normalized (arranged) like as follows:

Student admission noStudent NameMobile NumberOutstanding Fees
1PRI001Aravindan-967890047625,000
1PRI001Aravindan-955667885425,000
1PRI002Darshan-98877653411,000
1PRI003Saravanan-944335669833,000
1PRI004Ramkumar-634567881050,000
1PRI004Ramkumar-866789047650,000

As per the above table, you could visualize every column with distinct values and thus we achieved atomicity using 1 NF.

2NF (Second Normal Form)

In the case of 2 NF, the basic need for satisfying 2 NF is that the table must be present in 1 NF and there should not be any partial dependency, which means the actual subset of the candidate key decides the attribute which is non-prime. Let’s look at an example to understand 2 NF better!

Student admission noClass Room numberClassroom Name
1PRI001South-A1Blackberries
1SEC001South-A4Avocado
2PRI001South-A2Jingle bells
2SEC001South-A5Craneberries

normalized (arranged) as follows:

The above table contains a composite primary key namely Student admission number and Classroom number. Here, Classroom location is a non-key attribute evidently. This Classroom location will depend on the Classroom number, which is actually a part of the primary key. Thus, the above table is a violation of 2 NF. In order to change the above table to 2 NF, we have to divide the table into two portions as follows:

Student admission noClass Room number
1PRI001South-A1
1SEC001South-A4
2PRI001South-A2
2SEC001South-A5
Student admission noClass Room number
1PRI001South-A1
1SEC001South-A4
2PRI001South-A2
2SEC001South-A5

I hope, you could visualize that the partial dependency has been removed in the second table by applying 2 NF. So, the column Class Room Name entirely depends on the table’s primary key, i.e Class Room Number.

3NF (Third Normal Form)

In the case of 3 NF, it follows the same way that 2 NF functions. Here, the table must be present in 2 NF before working with 3 NF. Also, a transitive dependency is not allowed in 3 NF for non-prime attributes. This implies that the non-prime attributes which do not contain a candidate key will not depend on the rest of the non-prime attributes in a table. We can conclude transitive dependency is an indirect functional dependency, i.e A→C (which means A determines C) in which A→B and B→C (but the inverse is not valid i.e B→A is invalid) Let’s get a clear understanding of 3 NF with the following example:

Employee IDEmployee NameDepartment IDDepartmentLocation
1SW15TE01Sarath15TE01TestingHyderabad
1SW15BE01Ramesh15BE01SQLChennai
1SW15DE01Raj15DE01DotnetKochi
1SW15DE02Kumar15DE02JavaBengaluru

Looking at the above table, we can understand that the Employee ID determines Department ID and Department ID determines the department. Thus, Employee ID determines Department via Department ID. This proves that we accomplished transitive function dependency. But, the above structure violates 3 NF because it does not satisfy the rules of 3 NF. So, we have to divide the tables as below:

Employee IDEmployee NameDepartment IDLocation
1SW15TE01Sarath15TE01Hyderabad
1SW15BE01Ramesh15BE01Chennai
1SW15DE01Raj15DE01Kochi
1SW15DE02Kumar15DE02Bengaluru
Department IDDepartment
15TE01Testing
15BE01SQL
15DE01Dotnet
15DE02Java

From the above tables, you could visualize that the entire non-key attributes become completely dependent on the primary key. As in the first table, Employee Name, Department ID and Location depends on Employee ID, whereas in the second table, the Department depends on Department ID.

Let’s move on to the final topic.

Boyce Codd NF (BCNF)

BCNF is also called as 3.5 NF because it is an upgrade of 3 NF. Two researchers Boyce and Codd developed this BCNF concept so as to address some particular anomalies that that doesn’t fall under the 3 NF category. Like other NF techniques, BCNF also has certain conditions to be satisfied. First, BCNF should satisfy 3 NF. In the case of BCNF, if each and every functional dependency, X → Y, then, X will act as the Super key of that specific table.

For example, have a look at the table below:

Stud IDCourse of StudyName of the Professor
1SD17SW01JavaMagesh
1SD17SW02DotnetKarthik
1SD17SW03C++Praba
1SD17SW04DotnetRamesh
1SD17SW05SQLLokesh

As per the above table, we can clarify the following:

  • Any student can select multiple subjects of study
  • You can have multiple teachers to teach one particular subject.
  • For every subject, a teacher has to allocated to the student.

In the above table, except for the BCNF, all other NF techniques were satisfied. Let’s discuss the reason of it. Stud ID and Course of Study provides the primary key. This implies that the Course of Study column is actually a prime attribute. We could see yet another dependency here, i.e Name of the Professor→ Course of Study.

Here, Course of Study is actually a prime attribute whereas the Name of the Professor is a nonprime attribute, which is actually a violation of BCNF. Therefore, to achieve BCNF, we have to separate the table into two portions as Stud ID which is there already and another new column named Prof ID.

Stud IDProf ID
1SD17SW011PF17SW01
1SD17SW021PF17SW02
1SD17SW031PF17SW03
1SD17SW041PF17SW04
1SD17SW051PF17SW05

In the second table, Prof ID, Name of the Professor and Course of Study will be present.

Prof IDName of the ProfessorCourse of Study
1PF17SW01MageshJava
1PF17SW02KarthikDotnet
1PF17SW03PrabaC++
1PF17SW04RameshDotnet
1PF17SW05LokeshSQL

With this, we achieved BCNF. We thus conclude this tutorial about Normalization in SQL. I hope you got a better understanding!

Scroll Up
Besant Technologies WhatsApp