SQL Case and Temp Tables

What Does the CASE Statement Do?

Case Statements

The SQL CASE Statement is a conditional expression that allows me to create a new temporary column in my result set based on If/Then logic. A CASEstatement is added to the SELECT statement; it begins with CASE and ends with END AS new_field_name. Don't forget the comma after the last item in your SELECT list! This is a common error for me, so I want to point it out.

It's helpful to include an ELSE clause to catch any possible values that fall outside of your conditional clauses because otherwise, if those values exist, they will be returned as NULL.

So What Can I Do with a Case Statement?

  • This allows me to Zoom Out or change the granularity of my data; you may also hear this referred to as reducing the noise in your data.
  • CASE Statements allow me to create a categorical value from a continuous one by binning my data or creating buckets.
  • Using CASE Statements comes in very handy when I want to create new fields from existing data in my dataset using conditionals, but I don't have database permissions to create a new table. Temporary tables are useful in this situation, as well. I'll dig into those next.

Now What?

Create Categories to Classify Values

For example, if I have a field called high_temp containing integers that represent the high temperature recorded for a particular day, I can add a CASE statement to my query that will create a new field called weather classifying values in high_temp as one of three categorical variables, cold, warm, or hot, based on temperature ranges that I set as conditionals.

I can create categorical variables from a numeric one. I can even add a GROUP BY and a COUNT(*) to find out which category in the weather field has the highest count.

USE us_weather_db;
SELECT high_temp,
       COUNT(*) AS days,
       CASE 
           WHEN high_temp > 90 THEN 'hot'
           WHEN high_temp BETWEEN 70 AND 90 THEN 'warm'
           WHEN high_temp < 70 THEN 'cold'
           ELSE 'idk'
           END AS weather
FROM us_weather
GROUP BY high_temp
ORDER BY days DESC;

My query can have conditionals that use values from more than one column or field to create a new column. By throwing in a COUNT(*) AS number_customers, I have a useful result set for making comparisons.

SELECT COUNT(*) AS number_customers,
       CASE 
           WHEN internet_service_type_id = 1 AND device_protection = 'Yes' 
                 THEN 'DSL_Protected'
            WHEN internet_service_type_id = 1 AND device_protection = 'No' 
                 THEN 'DSL_Unprotected'
            WHEN internet_service_type_id = 2 AND device_protection = 'Yes' 
                 THEN 'Fiber_Optic_Protected'
            WHEN internet_service_type_id = 2 AND device_protection = 'No' 
                 THEN 'Fiber_Optic_Unprotected'
            WHEN internet_service_type_id = 3 
                 THEN 'No_Internet_Service'
            ELSE internet_service_type_id
            END AS internet_types
FROM customers
GROUP BY internet_types
ORDER BY number_customers;

-> 1065 DSL_Protected
   1356 DSL_Unprotected
   1357 Fiber_Optic_Protected
   1526 No_Internet_Service
   1739 Fiber_Optic_Unprotected

Normalize Values in Your Dataset

Sometimes my dataset may be messy (hahaha, sometimes), and I will need to normalize the values. CASE can come in handy here, as well. Notice that my CASE Statement syntax is slightly different here; since I'm only referencing a single column in my logic, and I'm not only testing for equality, I can write my statement like you see below.

SELECT title,
       salary,
       gender,
       CASE gender
           WHEN 'M' THEN 'male'
           WHEN '0' THEN 'male'
           WHEN 'F' THEN 'female'
           WHEN '1' THEN 'female'
           ELSE 'neither'
           END AS gender_norm
FROM us_employees;

Create a Boolean Field

Sometimes it's easier to work with a boolean value, and CASE can be helpful here, as well. Here I create a new column called is_female containing the value 1 if female and 0 if male

USE titanic_db;
SELECT sex,
       CASE sex 
           WHEN 'female' THEN 1
           ELSE 0
           END AS is_female
FROM passengers; 

-> male     0
   female   1
   female   1
   female   1
   male     0

I can also do this using the IF(condition, val_if_true, val_if_false) function in my query.

SELECT sex,
       IF(sex = 'female', True, False) AS is_female
FROM passengers;

-> male     0
   female   1
   female   1
   female   1
   male     0

Create a Dummy Variable

I can add a condition in my SELECT statement that will return 1 if the condition evaluates to True and a 1 if the condition evaluates to False.

SELECT sex, 
       sex = 'female' AS is_female
FROM passengers;

-> male     0
   female   1
   female   1
   female   1
   male     0

What Are Temporary Tables?

Temporary Tables exist only until you end your database session; if you lose your connection or close your database session, these tables will disappear.

So What?

I can create a temporary table when I want to query a subset of a dataset. I can modify a temporary table even when I don't have the needed permissions to modify data in a database.

Now What?

Create Temporary Tables

-- Create a temporary table using my database access.

USE bayes_123;

-- I want a table that brings over only the current employees in the Customer Service department.

CREATE TEMPORARY TABLE dept_employees AS
SELECT emp_no, 
       CONCAT(first_name, ' ', last_name) AS full_name, 
       dept_no, 
       dept_name,
       salary
FROM employees.employees
JOIN employees.salaries USING(emp_no)
JOIN employees.dept_emp USING(emp_no)
JOIN employees.departments USING(dept_no)
WHERE employees.departments.dept_name = 'Customer Service'
    AND employees.dept_emp.to_date > CURDATE()
    AND employees.salaries.to_date > CURDATE();
-- Once I have created my dept_employees table, I can query this subset of the employees dataset.

SELECT ROUND(AVG(salary),2) AS avg_salary
FROM dept_employees;

Modify Temporary Tables

After you create a temporary table, you can make changes.

USE bayes_123;

-- Create temporary table 

CREATE TEMPORARY TABLE temp_sakila AS
SELECT *
FROM sakila.payment;

-- Check out my data.

SELECT *
FROM temp_sakila;

-- Check out information about temp table structure.

DESCRIBE temp_sakila;

-- Transform the amount column from float to integer data type.

ALTER TABLE temp_sakila MODIFY amount INT UNSIGNED;

-- I can add new columns to a temp table.

ALTER TABLE temp_sakila ADD new_col_1 INT UNSIGNED;
ALTER TABLE temp_sakila ADD new_col_2 VARCHAR(25);

-- Insert records into a tamp table.

INSERT INTO temp_sakila (new_col_1, new_col_2) VALUES (val_1, val_2);

-- Update existing data in the temp table.

UPDATE temp_sakila SET new_col_1 = amount * 100;

-- I can update my temptable using a `WHERE` clause if I don't want to update all values in a field or fields.

UPDATE temp_sakila SET new_col_1 = some_val, new_col_2 = other_val WHERE condition;

-- I can drop columns from my temp table.

ALTER TABLE temp_sakila DROP COLUMN new_col_1;

-- You can remove a temporary table before ending a session or terminating a connection.

DROP TEMPORARY TABLE table_name;