Sometimes it's necessary to execute queries involving our actual tables along with some temporary or derived data.
One common type of temporary data that might be used in this sort of case is an external list of transactions (maybe in CSV format), which we would import into our database so that we could join them (JOIN clause) with our actual tables in order to find any missing transactions; or perhaps just to clean and transform the data before it’s finally loaded in the target table.
One common type of derived data would be that resulting from the joining of several tables or from complex calculations (which are sometimes very expensive to reprocess). We could use this kind of intermediate result set for filtering or further joining operations.
A good strategy is to create temporary tables for storing this type of intermediate data for further processing within our session. The key advantage of using temporary tables is that they are visible only to the current session and they are dropped automatically when the session is closed.
In this article, I want to explain how MySQL works with temporary tables; both explicitly as part of the CREATE TEMPORARY TABLE statement and implicitly as part of the execution plan for complex SELECT statements.
Creating temporary tables explicitly
Let’s suppose that your application requires executing a complex calculation on a set of tables and reusing this result in your business logic.
If you execute the calculation every time you need to use the result, you might degrade the performance of your solution and use unnecessary resources. So, the idea is to store the result in an intermediate structure. You could use another table, or maybe a materialized view, but the derived data is just used on the business logic associated with your session. To manage the data structure easily, it would be better to use a temporary table, because it’s automatically destroyed when the session/connection is closed.
You can use the CREATE TEMPORARY TABLE statement to create temporary tables, as shown in the listing 01. Remember, they’re visible only within the current session, and they’re dropped automatically when the session is closed. Therefore, two different sessions can use the same temporary table name in their business logic without conflicting with each other. Of course, in the same session, two temporary tables cannot share the same name.
mysql> CREATE TEMPORARY TABLE temp_table (id int primary key auto_increment, payload int) ENGINE=MyISAM;
It’s noteworthy that we can specify the engine for temporary tables, as shown in listing 01. If we don’t specify the engine, then the table is created using the default storage engine. For example, the default engine is InnoDB in MySQL 5.6.
For example, we could specify the MEMORY engine so that the data is stored in memory, as shown in the listing 02.
mysql> CREATE TEMPORARY TABLE temp_table_myisam (x int) ENGINE=MyISAM;
If we execute SHOW TABLES statements, we see no temporary table listed.
If we log out of the MySQL session and then log in again and try to execute a SELECT command on the temporary table, we will get an error, because the table doesn’t exist.
Let’s find out where the temporary tables are actually located in the filesystem, by querying to MySQL about the location, as shown in listing 03.
mysql> SHOW GLOBAL VARIABLES like 'tmpdir';
Then, we can list the /tmp directory in the filesystem, as shown in listing 04.
$ ls -l /tmp
We can see that temporary tables are created in the /tmp directory and they have unusual names (random names starting with # character) in order to avoid collision between the same table name in different sessions. We can check out the frm, myd and myi files as shown in the listing 04.
By default, all temporary tables are removed by MySQL when the connection is closed. However, if you develop an application that uses connection pools (commonly in Web applications) or persistent connections (commonly in desktop applications), then there is no guarantee that the temporary tables will be removed automatically when your business logic is executed, because the database connection may be still physically open.
Therefore, a best practice is to always remove temporary tables manually when the business logic doesn’t use them. This practice will avoid conflicts when serving different requests that use a temporary table and the application uses persistent connections because the temporary tables live while the connection is still open.
You can delete a temporary table as shown in listing 05.
mysql> DROP TEMPORARY TABLE temp_table;
Creating temporary tables implicitly
There are scenarios where MySQL creates internal temporary tables while processing queries.
How can we find out if MySQL is creating internal temporary tables?
Simple: Just check the execution plan of the query for the "Using temporary" sentence.
Notice that the "Using temporary" sentence doesn’t mean the temporary table has been created on disk. It only indicates that the query expects to create a temporary table. Implicit temporary tables are still being created in memory using the MEMORY engine while the MyISAM engine is being used for on-disk tables.
An implicit temporary table that has been created using the MEMORY engine (by default) is automatically converted into a MyISAM table when:
- TEXT or BLOB fields are present (the MEMORY engine doesn’t support these types)
- The size of the resulting implicit temporary table is greater than the lower value of tmp_table_size and max_heap_table_size. In other words, the maximum size for in-memory temporary tables is determined from whichever of the values of tmp_table_size and max_heap_table_sizeis smaller. Notice that:
- tmp_table_size is the largest a table can be in memory when it is created automatically by a query. But this can’t be larger than max_heap_table_size anyway.
- max_heap_table_size is the largest size a table can be in the MEMORY storage engine, whether that table is a temp table or non-temp table.
- It's common to set these two variables to the same value.
- When we increase the limits of tmp_table_size and max_heap_table_size, we allow larger temporary tables to live in memory.
- One strategy is to allow most of the temporary tables to live in memory to improve the performance (let’s say 90% of the temporary tables) and move only the very large tables onto disk (the remaining 10%)
- A column with more than 512 bytes is used with either a GROUP BY or UNION or ORDER BY
In this article, I've explained the concepts, common use cases, and benefits related of temporary tables in MySQL. You’ve learned how MySQL internally manages temporary tables.
Now you can start using this solution strategy in your own database applications.