I’d like to share my current favorite SQL “combo,” if you will. The database I use on a daily basis is designed such that there are several main tables that hold IDs and metrics followed by several tables that act as libraries and hold the values that are paired with each ID. Here is a quick look.
If you’re familiar with SQL you’ve probably used the IN operator. IN can be helpful when looking to find data that meets certain criteria. For example, let’s say I wanted to find the number of installs for a client in the U.S. and United Kingdom. Already knowing the CountryIds for the U.S. and United Kingdom, I could write something like:
SELECT SUM(VolumeOfInstalls) FROM ClientData WHERE CountryId IN (248 , 250);
While this is helpful and gets me what I want, the trouble here is that I have to know the CountryIds for the U.S. and United Kingdom before I can write this query. Given the number of breakdowns living in this table alone, it seems impossible to memorize even just a handful of IDs. But do I really need to memorize that? No. This is where a subquery comes into play.
Instead of memorizing or even looking up the CountryIds for the U.S. and United Kingdom, I can write another query that does that for me and then insert that query into the IN. For example:
SELECT SUM(VolumeOfInstalls) FROM ClientData WHERE CountryId IN (SELECT DISTINCT CountryId FROM Tbl_Country WHERE Country IN ('United States' , 'United Kingdom'));
Because the second query returns IDs just as the previous list, it returns the same data. The difference is using a subquery here saves me the time having to lookup IDs and instead, focus on the data I need. This is of course a fairly basic example but the potential for expansion and complexity is definitely here.
If you’re looking for a video that walks through a bit more, the Khan Academy has a great video that can be found here.
Welp, this is the end for my first post. Please let me know if this is helpful or if I can expand on anything here as I’m open to any and all feedback. Thanks!