- Read Tutorial
- Watch Guide Video
In this guide, we're going to talk about one of the more basic types of join statements that are out there. Now we've discussed what a JOIN is at a very high level and essentially what it's going to allow us to do is to run a query. But instead of running a query as we've run before, where we do something like this, or we go and say select all from users and run this.
Just to get all of the user data back. This is going to allow us to map our tables together. So if I come here and let's say instead of users, I want our guides data.
This is great. I have all my guide titles, guide revenue, and guides quantity. But this is not going to allow me to do something that would be incredibly helpful, which is to know which one of my users has made this money, written these guides, and those kinds of things.
When it comes to running database queries one of the more common things you're going to need to do is to be able to join those tables together, and that is one of the most important reasons why you SQL in the first place.
If you didn't need to join tables together, then it would be a much faster and easier implementation to use a no SQL kind of databases such as Mongo, Redis, or one of those types of options. Where SQL really has its strength is in joining data elements together. So that's what we're going to walk through here.
In this guide, we're specifically going to talk about the inner join. Now we're going to have a deep dive later on where we talk about all of the differences between inner joins, outer joins, outer left joins, and all of the different kinds of variations that are out there.
As we go through each one of these guides were going to give examples for each one. That should also kind of help you understand the subtle differences between them. So I'm going to get rid of everything here and let's start from scratch.
The very first one I'm going to do is the most basic kind of join you can do which is an inner join. I'm going to say select all from Guides, and I'm going to say, inner join users. And then you have to give this very specific and very important little constraint called "on".
What I'm saying is that I want to join the users table next to guides table and we need to tell SQL how that mapping occurs. Now you remember in our guides table right here. We have a guides_users_id and this allows us to do is to connect it because there is a 1 to 1 mapping between guides_users_id and what we have in the Users table.
Our users table has a set of IDs as one of the columns, and it maps directly here. This is going to be how we set up this relationship. We are going to say on guides.guides_users_id and then users.users_id and this is all we have to do.
If I hit return, this works and now notice we have a very different result set.
We have our guides, so we have our guide id, guide revenue, guide users id, guides title, and guides quantity but that's not it. Now we also have some more information we now can see right here who wrote each one of these guides. We can see that Kristine wrote "my blog", Tiffany wrote "something else", and my great post then Kristine wrote "this guide".
So we now have a direct mapping, and we can see who wrote what guide. Instead of just looking at these integers because in less you have a very very tiny data set you're not going to know by looking at a 1, or 2, or a 5000 who is actually the user behind one of these guides. So this is where the power of the relational query really comes into play. Now, this is the most simple way of doing this. Inner joins are so common that there is actually even a more streamlined way of doing this. If I copy it.
You don't even to say inner join. You can just say join, and you're going to get the same result set. So if I run this query, you can see everything works just like before.
So join is essentially an alias of inner join and so you don't even have to worry about explicitly describing inner join. That's something that's important to note, and it's something that when you're writing your queries usually you're not going to have to put inner join.
Now let's write one more and this one is going to give us some similar results. But as you may notice we have a bunch of data here that really is not necessary. We have a guides id. We also have a user's id, and we have it twice because it's bringing it back and this is contained in both columns.
That's not really necessary especially if you imagine running this query and placing them inside of a CSV report or even in a web application. You may not need these ids so let's write a new query. With this one instead of * like I did right there. Now we're going to make a few changes. We're going to only ask for four items. So we're going to say guides title, guides revenue, users name, and then users email.
I'm going to make one little change. If you remember back when we were talking about aliases, and I told you that you were going to alias tables and when you would want to do that it really came from when you wanted to perform Joines because if you notice on these queries, we had to write guides out and users out multiple times. Well here what we can do is let's use these aliases, so I'm going to say "g." and then "u." here
And I'm also going to put these all on their own line just to make it a little bit more readable.
Once you start to pull in a ton of parameters, it's considered a best practice to put them all on their own line. So now that we have that now, we can say from guides and pass g for the table alias and then say join users and pass in u. On g.guides_users_id. and then u.users_id.
Now if I run this.
This gets run and you can see that it's cleaned up our set. Now we just have a title, revenue, name, and e-mail address. This looks pretty nice and I think this is a better way of doing it. Obviously, this was a report going out to a client or a boss then you could alias each one of these column names as well, and then that way the CSV report will have it. I'm not going to worry about that part right now.
Before we end, I did want to show you one other thing, and that is that we're not limited to only performing queries like this. We also can call other functions and other processes on top of it. So let's say that we want to order this result set by and then we can call one of the columns. So I can say order by g.guides revenue and descending.
Now if I select this and run it.
Now we have a sorted order set in our resultset So now it doesn't just give us our mapping and our joining. In addition to that, it also shows us all of our items all of the records coming in on a sorted basis. And one last thing to leave you with is notice like we've already talked about that the users can be duplicated.
So whenever you are setting the first table, so in this case, we're saying we want everything to start with the guides. When we're doing that, and we're layering the users on top of it. That is going to be what allows the duplicates to occur here and in many cases, this is exactly the behavior you want. But we're also going to talk about other variations in how that order can affect it as we go through the section
Code
USE devcamp_sql_course_schema; SELECT * FROM guides INNER JOIN users ON guides.guides_users_id = users.users_id; SELECT * FROM guides JOIN users ON guides.guides_users_id = users.users_id; SELECT g.guides_title, g.guides_revenue, u.users_name, u.users_email FROM guides g JOIN users u ON g.guides_users_id = u.users_id ORDER BY g.guides_revenue DESC;