Data and SQL

Shivam Choudhary
7 min readJun 9, 2021

Structured Query Language / sometime See-Qwell

Transforming data in SQL

Data analysts usually need to convert data from one format to another to complete an analysis. But what if you are using SQL rather than a spreadsheet? Just like spreadsheets, SQL uses standard rules to convert one type of data to another. If you are wondering why data transformation is an important skill to have as a data analyst, think of it like being a driver who is able to change a flat tire. Being able to convert data to the right format speeds you along in your analysis. You don’t have to wait for someone else to convert the data for you.

In this reading, you will go over the conversions that can be done using the CAST function. There are also more specialized functions like COERCION to work with big numbers, and UNIX_DATE to work with dates.

UNIX_DATE returns the number of days that have passed since January 1, 1970 and is used to compare and work with dates across multiple time zones. You will likely use CAST most often.

Common conversions

The following table summarizes some of the more common conversions made with the CAST function. Refer to Conversion Rules in Standard SQL for a full list of functions and associated rules.

Starting with CAST function can convert to:

Numeric (number) → Integer, Numeric (number) , Big number, Floating integer, String

String → Boolean, Integer, Numeric (number), Big number, Floating integer, String, Bytes , Date, Date time, Time, Timestamp

Date → String — Date — Date time — Timestamp

The CAST function (syntax and examples)

CAST is an American National Standards Institute (ANSI) function used in lots of programming languages, including Big Query. This section provides the Big Query syntax and examples of converting the data types in the first column of the previous table. The syntax for the CAST function is as follows:

Where expression is the data to be converted and typename is the data type to be returned.

Converting a number to a string

The following CAST statement returns a string from a numeric identified by the variable MyCount in the table called MyTable.

In the above SQL statement, the following occurs:

  • SELECT indicates that you will be selecting data from a table
  • CAST indicates that you will be converting the data you select to a different data type
  • AS comes before and identifies the data type which you are casting to
  • STRING indicates that you are converting the data to a string
  • FROM indicates which table you are selecting the data from

Converting a string to a number

The following CAST statement returns an integer from a string identified by the variable MyVarcharCol in the table called MyTable. (An integer is any whole number.)

In the above SQL statement, the following occurs:

  • SELECT indicates that you will be selecting data from a table
  • CAST indicates that you will be converting the data you select to a different data type
  • AS comes before and identifies the data type which you are casting to
  • INT indicates that you are converting the data to an integer
  • FROM indicates which table you are selecting the data from

Converting a date to a string

The following CAST statement returns a string from a date identified by the variable MyDate in the table called MyTable.

In the above SQL statement, the following occurs:

  • SELECT indicates that you will be selecting data from a table
  • CAST indicates that you will be converting the data you select to a different data type
  • AS comes before and identifies the data type which you are casting to
  • STRING indicates that you are converting the data to a string
  • FROM indicates which table you are selecting the data from

Converting a date to a datetime

Datetime values have the format of YYYY-MM-DD hh: mm: ss format, so date and time are retained together. The following CAST statement returns a datetime value from a date.

In the above SQL statement, the following occurs:

  • SELECT indicates that you will be selecting data from a table
  • CAST indicates that you will be converting the data you select to a different data type
  • AS comes before and identifies the data type which you are casting to
  • DATETIME indicates that you are converting the data to a datetime value
  • FROM indicates which table you are selecting the data from

The SAFE_CAST function

Using the CAST function in a query that fails returns an error in Big Query. To avoid errors in the event of a failed query, use the SAFE_CAST function instead. The SAFE_CAST function returns a value of Null instead of an error when a query fails.

The syntax for SAFE_CAST is the same as for CAST. Simply substitute the function directly in your queries. The following SAFE_CAST statement returns a string from a date.

Manipulating strings in SQL

Knowing how to convert and manipulate your data for an accurate analysis is an important part of a data analyst’s job. You will learn about different SQL functions and their usage, especially regarding string combinations.

A string is a set of characters that helps to declare the texts in programming languages such as SQL. SQL string functions are used to obtain various information about the characters, or in this case, manipulate them. One such function, CONCAT, is commonly used. Review the table below to learn more about the CONCAT function and its variations.

Function Usage Example CONCATA function that adds strings together to create new text strings that can be used as unique keys CONCAT (‘Google’, ‘.com’);CONCAT_WSA function that adds two or more strings together with a separator CONCAT_WS (‘ . ’, ‘www’, ‘google’, ‘com’) *The separator (being the period) gets input before and after Google when you run the SQL function CONCAT with +Adds two or more strings together using the + operator ‘Google’ + ‘.com’

CONCAT at work

When adding two strings together such as ‘Data’ and ‘analysis’, it will be input like this:

  • SELECT CONCAT (‘Data’, ‘analysis’);

The result will be:

  • Data analysis

Sometimes, depending on the strings, you will need to add a space character, so your function should actually be:

  • SELECT CONCAT (‘Data’, ‘ ’, ‘analysis’);

And the result will be:

  • Data analysis

The same rule applies when combining three strings together. For example,

  • SELECT CONCAT (‘Data’,’ ‘, ‘analysis’, ‘ ‘, ‘is’, ‘ ‘, ‘awesome!’);

And the result will be

  • Data analysis is awesome!

Practice makes perfect

W3 Schools is an excellent resource for interactive SQL learning, and the following links will guide you through transforming your data using SQL:

  • SQL functions: This is a comprehensive list of functions to get you started. Click on each function, where you will learn about the definition, usage, examples, and even be able to create and run your own query for practice. Try it out for yourself!
  • SQL Keywords: This is a helpful SQL keywords reference to bookmark as you increase your knowledge of SQL. This list of keywords are reserved words that you will use as your need to perform different operations in the database grows.
  • While this reading went through the basics of each of these functions, there is still more to learn, and you can even combine your own strings.
  1. Practice using CONCAT
  2. Practice using CONCAT WS
  3. Practice using CONCAT with +

Pro tip: The functions presented in the resources above may be applied in slightly different ways depending on the database that you are using (e.g. MySQL versus SQL Server). But, the general description provided for each function will prepare you to customize how you use these functions as needed.

More information

Browse these resources for more information about data conversion using other SQL dialects (instead of Big Query):

--

--