The process of joining two sets of data present in a table and displaying the resultant data in a new table is called as self join. If a data present in a table has NULL value, it will be omitted in the resultant table. A duplicate copy of a table will be created before a self join is applied. Self joins are used to find similarities between the data present in the table.
How Self Join works?
In self join a same table will be combined with itself to find the similarities between two data. While applying self join, a copy of the table will be created. The copied table will be created under a new alias because using the same name for both the table will create a confusion between the columns. A condition is required to apply self join in the table. The self join query for joining the tables to display two players who play similar will look like this.
Example for Self Join
The self join concentrates on the similarities between the two tables. Let us consider that we have a table which contains a list of a players and roles they play in a football team. Our example table will resemble the table given below.
Player
Player_Name | Player_Role |
Messi | Forward |
Suarez | Forward |
Iniesta | Midfielder |
Rakitic | Midfielder |
Pique | Defender |
To create a self join, copies of the table will be created which are aliased with names P1 and P2.
P1
Player_Name | Player_Role |
Messi | Forward |
Suarez | Forward |
Iniesta | Midfielder |
Rakitic | Midfielder |
Pique | Defender |
P2
Player_Name | Player_Role |
Messi | Forward |
Suarez | Forward |
Iniesta | Midfielder |
Rakitic | Midfielder |
Pique | Defender |
The SQL query for joining the table to display the table players who are playing forward is
SELECT P1.Player_name
FROM Player p1, Player p2
WHERE P1.Player_Role = P2.Player_Role
AND p2.Player_Name=”Messi”;
The above query will return a table with player names who play similar role like Messi. The resultant table for the above query is
P1.Player_Name | P1.Player_Role | P2.Player_Name | P2.Player_Role |
Messi | Forward | Messi | Forward |
Suarez | Forward | Suarez | Forward |
If the above query is tweaked to list the defenders, the query will resemble the following
SELECT P1.Player_name
FROM Player p1, Player p2
WHERE P1.Player_Role = P2.Player_Role
AND p2.Player_Name=”Pique”;
If the above query is compiled, the below table will be displayed as output
P1.Player_Name | P1.Player_Role | P2.Player_Name | P2.Player_Role |
Pique | Defender | Pique | Defender |
Only one player is listed in the above player because no other player has similar Player_Role as Pique.
Example for NULL value return in Self Join
Self join will return NULL in the resultant table when some particular field has no similarities but has other fields with similarities.
The above statement can be explained using the relationship between a player and coach in a team. In a team both player and coach can be considered as employees but if playing staff category is added players can be added while coaches are classified as support staff.
According to the example both players and coaches will have employee id. Apart from holding employee id, players will have player id while coaches have coach id. If a comparison using self join is made based on the employee id, all the names will be displayed. The coach id column in player row and player id column in coach row will be declared NULL.
Difference between Self and Outer Join
- Rows are returned in self join only if there is a similarity between the queries.
- Outer join return all inner join data along with data declared as NULL which has no similarities.
- Rows are used to compare in self join while columns are used in comparing outer join.
- Only matching data will displayed in self join. Outer join displays both matching and unmatching datas.
- Only one record will be returned for self join while a record for every matching data will be returned for outer join.