One of the most powerful things about relational databases like mySQL is JOINing tables. Let's discuss writing different types of JOINS in mySQL. We've constructed a mini database for a fictional app called LewbowskiIn, basically LinkedIn, but only for characters in The Big Lebowski. You don't need to watch the movie, but it's recommended.
Our Mini Database
skills
skill_id |
skill_name
|
---|---|
1 | Making White Russians |
2 | Marijuana Cultivation |
3 | Home Security |
4 | Dog boarding |
5 | Janitorial Services |
6 | Blow Jobs |
user_skills
skill_id | user_id |
---|---|
1 | 1 |
2 | 1 |
3 | 2 |
4 | 2 |
6 | 4 |
6 | 5 |
users
user_id | first_name | last_name |
---|---|---|
1 | Jeffrey | Lebwoski |
2 | Walter | Sobchak |
3 | Donny | Kerabatsos |
4 | Bunny | Knutson |
Basic INNER JOIN
The user_skills table maps users to different skills. We want to join the user table to the user_skills table. We do this with the JOIN
statement, also called INNER JOIN
. They are the same thing. The important part of the JOIN statement is the ON clause. If the ON clause isn't included you'll get back every possible combination of the tables you're joining, also known as the cartesian product. In this example we only get back the rows where there is a match in both the users table and the user_skills table.
INNER JOIN Statement
QUERY:
SELECT * FROM users INNER JOIN user_skills ON users.user_id = user_skills.user_id;
Result:
user_id | first_name | last_name | skill_id | user_id |
---|---|---|---|---|
4 | Bunny | Knutson | 6 | 4 |
2 | Walter | Sobchak | 4 | 2 |
2 | Walter | Sobchak | 3 | 2 |
1 | Jeffrey | Lebwoski | 1 | 1 |
1 | Jeffrey | Lebwoski | 2 | 1 |
INNER JOIN Analysis:
The returned results only contain row combinations where there is a matching user_id in each table. Since Walter and Jeffrey have two skills each, their names appear twice. An INNER JOIN
will only return elements that exist IN each table, that's something you need to remember, in case you were wondering what the fuck happened to Donny. Not only is he out of his element, he's also outside the scope of the statement. If we want Donny back we need to do what's called an OUTER JOIN
.
The OUTER JOIN
There are two types of outer joins, the LEFT OUTER JOIN
and
RIGHT OUTER JOIN
LEFT JOIN
and RIGHT JOIN
respectively. An outer join will include rows that don't have a common column value in each table, however it depends how the query is structured. If we do the following RIGHT JOIN
statement we're joining the table on the right (user_skills) to the table on the left (users). Since user_skills does not contain Donny's user_id, he'll still be missing.
RIGHT JOIN Explained
QUERY:
SELECT * FROM users RIGHT JOIN user_skills ON users.user_id = user_skills.user_id;
Results:
user_id | first_name | last_name | skill_id | user_id |
---|---|---|---|---|
1 | Jeffrey | Lebwoski | 1 | 1 |
1 | Jeffrey | Lebwoski | 2 | 1 |
2 | Walter | Sobchak | 3 | 2 |
2 | Walter | Sobchak | 4 | 2 |
4 | Bunny | Knutson | 6 | 4 |
NULL | NULL | NULL | 7 | 8 |
LEFT JOIN Statement:
But what if we want all the users returned, whether or not they have skills? We need a LEFT JOIN statement that will return all rows in the left table (users) whether or not they exist in the right table (user_skills). Now we've got Donny back, nobody really knows what Donny does, so he has no skills.
QUERY:
SELECT * FROM users LEFT JOIN user_skills WHERE users.user_id = user_skills.user_id;
Result:
user_id | first_name | last_name | skill_id | user_id |
---|---|---|---|---|
1 | Jeffrey | Lebwoski | 1 | 1 |
1 | Jeffrey | Lebwoski | 2 | 1 |
2 | Walter | Sobchak | 3 | 2 |
2 | Walter | Sobchak | 4 | 2 |
3 | Donny | Kerabatsos | NULL | NULL |
4 | Bunny | Knutson | 6 | 4 |
We now see Donny in the results with a value of NULL for skill_id and user_id. This is because the user table was joined to user_skills table, so we get all the rows from users no matter what. One thing to note is that the right join could have returned a similar result to the left join if we just flipped the tables like so:
Flipped Tables Statement:
SELECT * FROM users RIGHT JOIN user_skills ON user_skills.user_id = users.user_id;
This statement will return the same results as the previous left join statement.
Double join
You probably noticed those last results weren't super useful because they only returned the skill_id and not the actual skill name. Let's get freaky and create a double join statement.
SELECT * FROM users LEFT JOIN user_skills ON users.user_id = user_skills.user_id INNER JOIN skills ON user_skills.skill_id = skills.skill_id;
This statement joins all three tables so we get a result that also contains the skill name as shown here:
user_id | first_name | last_name | skill_id | user_id | skill_id | skill_name |
---|---|---|---|---|---|---|
1 | Jeffrey | Lebwoski | 1 | 1 | 1 | Making White Russians |
1 | Jeffrey | Lebwoski | 2 | 1 | 2 | Marijuana Cultivation |
2 | Walter | Sobchak | 3 | 2 | 3 | Home Security |
2 | Walter | Sobchak | 4 | 2 | 4 | Dog Boarding |
4 | Bunny | Knutson | 6 | 4 | 6 | Blow Jobs |
Let's clean this up a little by specifying which columns we want returned. Also notice that we need to alias the users table to help us specify the user_id. If we didn't prefix the users table FROM users u
we wouldn't be able to access the user_id column. The user_id column isn't unique, it exists in both the users and user_skills tables so our join statement needs to know which table we are referencing by aliasing the users table with a u.
QUERY:
SELECT u.user_id, first_name, last_name, skill_name FROM users u LEFT JOIN user_skills ON u.user_id = user_skills.user_id INNER JOIN skills ON user_skills.skill_id = skills.skill_id;
Result:
user_id | first_name | last_name | skill_name |
---|---|---|---|
1 | Jeffrey | Lebwoski | Making White Russians |
1 | Jeffrey | Lebwoski | Marijuana Cultivation |
2 | Walter | Sobchak | Home Security |
2 | Walter | Sobchak | Dog Boarding |
4 | Bunny | Knutson | Blow Jobs |
In the double join query think about the statement in two steps, the first join creates a virtual table matching the results of our earlier examples, then the next join statement joins the next table to the one created by the first join. Virtual tables are an important concept to grasp when learning mySQL.
That's it for this lesson! We broke down MySQL Joins, one of the most important concepts to master in mySQL. Stay tuned for a new lesson on mySQL from us!