This specific topic is the one that you’ll effectively work with on a daily basis. Almost every single SQL query you write will deal with keys, joins, and aliases. For those of you who will perform some data wrangling in R/Python, I’ll show you how you can join tables in those as well.
I’ll be using the titanic dataset for this section. Let’s get it.
Table of Contents:
Keys
Alias
Joins
Exercises & Interview Questions
1 - Keys
1.1 What are SQL keys
Keys are specific types of SQL commands that allow users to identify a record or row within the database with an index or unique identifier.
Keys help ensure no duplicate rows exist within a table and improve query performance due to having fewer scans over the index structures when using keys.
1.2 Types of Keys
Primary Keys: are used to uniquely identify each row or record within a table, which ensures that no two records have the same value for that particular column(s). Let’s say you are looking at a table for BowTied. There is only 1 BowTiedBull, only 1 BowTiedRaptor, only 1 etc…
Secondary/Candidate Keys: There is no real thing as a Secondary key. When someone refers to a column being a secondary key, they usually mean this is a column you will want to use when merging two tables together. An example of a Secondary Key would be the Date column. Although useless on it’s own, it’s great for merging 2 time series tables together.
Foreign Keys: a foreign key is a way to link two tables together. The purpose of this key is to prevent you from writing erroneous entries that should never have been allowed in the first place. In other words, you are allowed allowed to write entries that the foreign key is pointing to.
Here is a great diagram to help visualize this:
![When looking at the artists table, we can see the id is the primary key, and albums is the foreign key. When looking at the artists table, we can see the id is the primary key, and albums is the foreign key.](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2099f974-1f3f-4ce0-a8a2-8c9f7f9bf7dd_1258x446.png)
1.3 Primary Key When Creating Tables
MYSQL: You can use PRIMARY KEY (*column name*) to tell MYSQL that a column is a primary key, when you make a table. For example, if I wanted to create a table called Britanic, from my earlier post.
I’d do this:
CREATE TABLE Brittanic
(Name TEXT,
pclass TEXT,
age NUMERIC,
sex TEXT,
survived INT,
PRIMARY KEY (Name));
This command would make a new SQL table called Brittanic, where Name would be the primary key, in a MySQL database.
PostGres/T-SQL: In the modern SQL databases, we can just say PRIMARY KEY, right after the column name. Obviously, we will also want to add a constraint that says it cannot be Null, by saying NOT NULL PRIMARY KEY. To do the same example above, I’d do this:
CREATE TABLE Brittanic
(Name TEXT NOT NULL PRIMARY KEY,
pclass TEXT,
age NUMERIC,
sex TEXT,
survived INT);
The key icon now shows up next to Name to indicate that this is a primary key. In the real world, it’s possible to run into people who have the same first and last name. Do not use Name as a primary key.
1.4 Foreign Key When Creating Tables.
We can just say FOREIGN KEY(*column name*) REFERNCES *TableName*(*column name*), and just point to the tables and columns respectively.
Here’s an example:
CREATE TABLE Brittanic
(Name TEXT,
pclass TEXT,
age NUMERIC,
sex TEXT,
survived INT,
FOREIGN KEY(Name) REFERENCES Titanic (Name));
Note: For this Foreign Key command to execute successfully, you’ll need to link it to a primary key in another table.
1.5 Seeing the Foreign Key in action
Recall, a foreign key is a way to link two tables together. The purpose of this key is to prevent you from writing erroneous entries that should never have been allowed in the first place.
Let’s take a look side by side at the Brittanic table when it doesn’t have a FOREIGN KEY, and when it does. Click here if you forgot how to insert values.
WITHOUT FOREIGN KEY:
WITH FOREIGN KEY:
When I try to execute the above code, after linking my name column in Britannic to the Name column in Titanic, I get the following error:
Note: In some cases, you’ll need to enable foreign keys first, just execute: PRAGMA foreign_keys = ON;
If you are able to understand keys, and how they work, congrats, you are now able to understand one of the most important parts of a SQL table. Here is a great video that sums up the concept:
2 - Alias
Welcome to the easiest topic in SQL. Alias basically lets us make an Alias for a column that we are working with, or on a Table that we are working with. Let’s say we have a table called BowTied_Raptor_SQL_Part7. Typing that out over and over again is a nightmare, so you can either just hit the space key, and leave an alias, or use the keyword AS to do this.
Here is are 2 examples of making an alias called BTR.
SELECT *
FROM BowTied_Raptor_SQL_Part7 AS BTR
SELECT *
FROM BowTied_Raptor_SQL_Part7 BTR
We can even make Aliases on our columns. Columns can be a bit tricky, because sometimes the name itself will have a space in it. To deal with that, we just use square brackets [ ].
Here’s 2 examples of making an alias called B T for a column called BowTied.
SELECT BowTied AS [B T]
FROM my_table;
SELECT BowTied [B T]
FROM my_table;
In the real world, most people generally take the second version, because who the f wants to keep using the keyword AS over and over again…
3 - Joins
Now, let’s combine the above two topics, and actually use them. When we are working with tables, there are 4 methods of joining the tables:
OUTER JOIN
JOIN (Inner)
LEFT JOIN
RIGHT JOIN
You can look at the visual below to see the difference:
The join that you use is dependent on the problem you are trying to solve. Here is an example of perming on inner join on two tables. The code of this generally looks like:
FROM my_table
JOIN my_other_table
ON my_table.column1=my_other_table.column2
Let’s do an example below:
We’ll say table 1 is called BowTied, and we are using a column called Raptor. We’ll say table 2 is called BTD, and we are using a column called Raptah.
SELECT *
FROM BowTied bd
JOIN BTD bt
ON bd.Raptah=bt.Raptah
If you want to do a different join, you can just swap out JOIN with any of the other 3 options:
OUTER JOIN
LEFT JOIN
RIGHT JOIN
3.1 Merging in R/Python
Python:
To work with tables, we use Pandas (library). If you forgot how libraries work, click here. we would use the pd.merge() function in order to merge two tables together.
The how column indicates which type of join you want to do. The on column says which columns are to be used for joining.
R:
We will use the merge() function in order to merge two tables together. We use the by parameter to indicate which columns are supposed to be used for the merging. We also say all.x, or all.y =True or not to indicate if we still want to keep all of the left/right columns, both, or none. Basically this chooses the type of joins we do.
4 - Exercises/Interview Questions
4.1 - Exercises
Here’s some exercises for ya:
Use the Titanic Table, and make the PClass a Primary Key
Make a new Table called Britannic, and make the PClass a foreign key that links to the Titanic’s PClass
Try to insert a row in the Britannic, where PClass = ‘what up’
Now, insert a row in the Britannic, where Name & PClass = (Birnbaum, Mr Jakob & 1st)
Now do an inner join on the Titanic, and Britannic using Name.
4.2 - Interview Questions
Here’s some Interview questions for ya:
Is a SIN/SSN number a good example of a Primary key? If not, what do you recommend instead?
Ans: No. Not everyone has a SIN/SSN number. People do not want these numbers to be easily visible in a database, and definitely not for merging tables together. A good example would be a unique number/text that is different for every single row for our table.
I have a table called xyz. It has a row called series id. I want to link this series id to a table called abc, so that you cannot have any values in series id that do not match those from abc. How do I achieve this?
Ans: In your table xyz, you will make a FOREIGN KEY that links series_id to whatever column you were after, in table abc.
What’s a JOIN operator in SQL? and mention some of the types
Ans: A JOIN operator in SQL lets you join two tables together. The 4 main types are: INNER, OUTER, LEFT, and RIGHT.
Click here to continue to SQL 8: Data Manipulation in SQL. Basically, it’s all about data skills. You’ll use the content a lot daily, however it won’t be asked much for interviews.
recursive join ha