MySQL Querying A Partitioned Table

MySQL Querying A Partitioned Table

Querying partitions in MySQL is a nice straightforward process. However, if you use a general range in the fields, it often still ‘touches’ all of the partitions until it identifies which it should use for the query. So I like to be super specific about the partitions I want it to search.

Below is the table structure we will use for the guide:

CREATE TABLE ran.order_data (
    username VARCHAR(20) NOT NULL,
    order_date DATETIME NOT NULL,
    PRIMARY KEY(username, order_date)
)
PARTITION BY RANGE( YEAR(order_date) )(
    PARTITION p2015 VALUES LESS THAN (2016),
    PARTITION p2016 VALUES LESS THAN (2017),
    PARTITION p2017 VALUES LESS THAN (2018))

We can select data using the below syntax for a single partition:

SELECT * FROM order_data PARTITION (p2017);

Or the below for multiple partitions:

SELECT * FROM order_data PARTITION (p2017, p2016)

But that is not all that useful if you have lots of partitions. For example, if you have a partition per month & year; you will likely have 12 partitions a year. So if you query the last 3 years, you would have to comma separate 36 values.

So, we can get around this using the below kind of query. Let’s walk through it:

  1. T1 is a group_concat from the information schema table. This provides a comma separated list, rather than many rows of output.
  2. We load that comma sepated list into a variable called ‘items’ to use later
  3. We then create CTE T2, where we are preparing the query we want to execute, which injects the @items variable into the query
  4. We then dump the output of the query into a variable called ‘query’
  5. Finally, we prepare the statement for execution using our variable & then execute it
WITH t1 as (SELECT GROUP_CONCAT(PARTITION_NAME) FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = 'ran' AND TABLE_NAME = 'order_data' and PARTITION_DESCRIPTION > 2016)

SELECT * INTO @items FROM t1;

WITH t2 as (select concat('select * from order_data PARTITION(',@items,')'))

SELECT * INTO @query FROM t2;

PREPARE stmt FROM @query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

This may look like a messy way to solve the problem, but it works quite well & ensures you are restricting the partitions you are querying as much as possible.

But what about partition pruning?

The above approach ensures you are only scanning the partitions you absolutely require for your query.

However, MYSQL does also have an automated approach to pruning partitions, whereby it will assess the conditions of your WHERE statement and select only the required partitions.

Let’s first look at where partition pruning does not work. Here, we have a table where we have derived YEAR_MONTH from the order_date field to get our partition.

CREATE TABLE ran.multi_field (
    username VARCHAR(20) NOT NULL,
    order_date DATETIME NOT NULL,
    PRIMARY KEY(username, order_date)
)
PARTITION BY RANGE( extract(year_month from `order_date`) )(
    PARTITION p2021 VALUES LESS THAN (202201),
    PARTITION p202201 VALUES LESS THAN (202202),
    PARTITION p202202 VALUES LESS THAN (202203))

You can see how it’ll work by running an EXPLAIN command. On the below, it searches ALL the partitions, rather than only those required.

EXPLAIN select * from ran.order_data where order_date > '2021/12/01'

Even running the below, where we specify the criteria in the same way the partition has been created, still searches all partitions.

EXPLAIN select * from ran.multi_field where extract(year_month from `order_date`) > 202202

So that is no good. But, running the below, returns a single searched partition:

WITH t1 as (SELECT GROUP_CONCAT(PARTITION_NAME) FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = 'ran' AND TABLE_NAME = 'multi_field' and PARTITION_DESCRIPTION > 202202)

SELECT * INTO @items FROM t1;

WITH t2 as (select concat('explain select * from multi_field PARTITION(',@items,')'))

SELECT * INTO @query FROM t2;

PREPARE stmt FROM @query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

So, automated pruning does not work in the above instance. In the below, it does. Here, we just take the YEAR or order_date to form the partition. This is because MySQL will take into account YEAR(), TO_DAYS() or TO_SECONDS() only for date based partitions.

CREATE TABLE ran.order_data (
    username VARCHAR(20) NOT NULL,
    order_date DATETIME NOT NULL,
    PRIMARY KEY(username, order_date)
)
PARTITION BY RANGE( YEAR(order_date) )(
    PARTITION p2015 VALUES LESS THAN (2016),
    PARTITION p2016 VALUES LESS THAN (2017),
    PARTITION p2017 VALUES LESS THAN (2018))

When we run the below explain, we will see it only serches one partition ‘p2017’.

explain select * from ran.order_data where order_date > '2017-06-10'

For absolute certainty around which approach you should use, you can use the EXPLAIN query function & assess whether automatic partition pruning will take effect.

Total
0
Shares
Previous Article
Partitioning Overview in MYSQL

Partitioning Overview in MYSQL

Next Article
How to create tonnes of dummy data in MySQL

How to create tonnes of dummy data in MySQL

Related Posts