Sunday, March 02, 2008

Always EXPLAIN your queries

One of the first tips if you're working on improving your database speed is to always run EXPLAIN on your queries. This lets you see the plan of execution that MySQL will take, and evaluate whether and how your indexes will be used.

Usually I only bother to take my own advice for complicated queries involving GROUP BY, subselects, etc. Tonight, though, I was working on optimizing some of our pages by moving data into summary tables, and a query took a lot longer than I expected:


mysql> select SQL_NO_CACHE s.title, ar.name from SongBuzz sb JOIN Song s ON sb.song_id = s.id JOIN Album al ON s.album_id = al.id JOIN Artist ar ON al.artist_id = ar.id ORDER BY sb.recs_1w desc limit 3;
+----------------------+------------------+
| title | name |
+----------------------+------------------+
| Deep Sea Green | Mandyleigh Storm |
| Suffer For Fashion | of Montreal |
| Just Enough | Dan Tharp |
+----------------------+------------------+
3 rows in set (1.17 sec)


All of the foreign key fields in the JOIN were indexed, as was sb.recs_1w. There's no reason this query should take 1.2 seconds when the entire data fits easily in RAM. So I ran EXPLAIN:


mysql> EXPLAIN select SQL_NO_CACHE s.title, ar.name from SongBuzz sb JOIN Song s ON sb.song_id = s.id JOIN Album al ON s.album_id = al.id JOIN Artist ar ON al.artist_id = ar.id ORDER BY sb.recs_1w desc limit 3;
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------------+-------+---------------------------------+
| 1 | SIMPLE | ar | ALL | PRIMARY | NULL | NULL | NULL | [redacted big number] | Using temporary; Using filesort |
| 1 | SIMPLE | al | ref | PRIMARY,artist_id | artist_id | 4 | amiest_production.ar.id | 1 | Using index |
| 1 | SIMPLE | s | ref | PRIMARY,album_id | album_id | 4 | amiest_production.al.id | 4 | |
| 1 | SIMPLE | sb | eq_ref | PRIMARY | PRIMARY | 4 | amiest_production.s.id | 1 | |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------------+-------+---------------------------------+
4 rows in set (0.02 sec)


From this query plan, it was clear that MySQL was first reading the artist table, then joining to find all the albums, then joining to find all the songs, then joining against the SongBuzz summary table. After getting all these rows, it was sorting the whole thing in a temporary table. The more logical plan would be to sort the SongBuzz table by recs_1w, since that column is already indexed, grab the top 3 rows, and then join through to find the song and artist names. Using the STRAIGHT_JOIN flag to the SELECT statement, we can force this:


mysql> explain select SQL_NO_CACHE STRAIGHT_JOIN s.title, ar.name from SongBuzz sb JOIN Song s ON sb.song_id = s.id JOIN Album al ON s.album_id = al.id JOIN Artist ar ON al.artist_id = ar.id ORDER BY sb.recs_1w desc limit 10;
+----+-------------+-------+--------+-------------------+---------+---------+--------------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------+---------+---------+--------------------------------+--------+-------------+
| 1 | SIMPLE | sb | index | PRIMARY | recs_1w | 2 | NULL | [redacted big number] | Using index |
| 1 | SIMPLE | s | eq_ref | PRIMARY,album_id | PRIMARY | 4 | amiest_production.sb.song_id | 1 | |
| 1 | SIMPLE | al | eq_ref | PRIMARY,artist_id | PRIMARY | 4 | amiest_production.s.album_id | 1 | |
| 1 | SIMPLE | ar | eq_ref | PRIMARY | PRIMARY | 4 | amiest_production.al.artist_id | 1 | |
+----+-------------+-------+--------+-------------------+---------+---------+--------------------------------+--------+-------------+


and the resulting query executes somewhere around 60x faster:


mysql> select SQL_NO_CACHE STRAIGHT_JOIN s.title, ar.name from SongBuzz sb JOIN Song s ON sb.song_id = s.id JOIN Album al ON s.album_id = al.id JOIN Artist ar ON al.artist_id = ar.id ORDER BY sb.recs_1w desc limit 3;
+----------------------+------------------+
| title | name |
+----------------------+------------------+
| Deep Sea Green | Mandyleigh Storm |
| Suffer For Fashion | of Montreal |
| Just Enough | Dan Tharp |
+----------------------+------------------+
3 rows in set (0.02 sec)

1 comment:

Mandyleigh Storm said...

I don't have a clue what you were talking about, but it's nice to see my name there :) Hopefully I wasn't the cause of the problem :)

Love always,

Mandyleigh XXX