Just like an index in the back of a text book; a table index allows us to find the information we are looking for, quickly. For the purposes of this post, let’s assume we have this table structure:
CREATE TABLE ran.order_data ( first_name VARCHAR(20) NOT NULL, last_name VARCHAR(20) NOT NULL, company_name VARCHAR(20) NOT NULL, lots of other fields.... )
We really want to index our table on high cardinality fields (lots of unique values in the field), but for the purposes of this demonstration, we’ll use the above data structure.
We can index on a single field, in this case last_name. This will speed up the lookup of the last_name field, as it’ll mimic going to the back pages of your textbook, to find out where to look.
ALTER TABLE ran.order_data ADD INDEX (last_name);
We can add multiple index fields on our table too. This is where we must remember a particular rule. We must always query our index fields in the order that they are defined. Let’s use an example. Here, we cannot use the forename ONLY in our where statement & still take advantage of indexing. This is because, it’s sorted by the left most index & so there is no benefit in querying the second indexed field in isolation.
So we can index our table based on the two fields, like below. We can then query WHERE last_name = ‘x’ and it will use an index to look up. We can also use WHERE last_name = ‘x’ and first_name = ‘y’ and it will use an index.
However, if we run WHERE first_name = ‘y’ in isolation (without querying the last_name column), it will not use an index for the lookup. You can verify this by using the EXPLAIN clause in your query to see which indexes have been used.
ALTER TABLE ran.order_data ADD INDEX (last_name, first_name);
You can optionally name your indexes on your table, which will help you identify the index used when using the EXPLAIN function – it’ll provide a ‘friendly’ name. This name can be used to edit or delete the index in future too.
INDEX 'MyIndex'('first_name', 'last_name')
We must be mindful about how many columns we index. Indexing every field may seem tempting, but an index:
- Takes up storage space
- Has quite an overhead – it’s updated with every row added, updated or removed
- The query optimizer considers the index in every query, so more indexes = more work
Overall, it’s important to remember that an indexed field should be relatively high cardinality and we should use enough indexes to achieve what we need but not so many that we start to degrade performance.