The other day, a friend was complaining of slow queries and was looking for some help fixing them. The first step we took was to take the "problem page" and enable debug output that printed each query that was run along with the number of milliseconds each took. If you don't have a facility for easily doing this (at the very least on a staging or development server) then you are essentially wandering in the dark for optimization.
Looking at the debug output, we found something surprising. The slowest query on the page was the seemingly innocuous:
SELECT `User`.`id` FROM `users` AS `User` WHERE `opensocial_id` = 219771253 LIMIT 1;
My friend was sure there was an index on
User.opensocial_id
yet the debug output showed that it took more than a second. With an index, this should take less than a millisecond. I asked him to run an EXPLAIN
on it, and we saw:
mysql> EXPLAIN SELECT `User`.`id` FROM `users` AS `User` WHERE `opensocial_id` = 2778153 LIMIT 1;
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | User | ALL | opensocial_id | NULL | NULL | NULL | 272877 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)
The entry in the
possible_keys
column confirmed that there indeed was an index on opensocial_id
. However, NULL
in the key
column meant that mysql was deciding the opensocial_id
key was either unusable or didn't have good enough selectivity to choose over a full table scan. The high value in the rows
column along with Using where
under Extra
indicated that mysql was reading through every row in the table, running a comparison, and outputting those rows for which the condition matched.So, the next question was why mysql was deciding not to use the index for this query. Doing a
SHOW CREATE TABLE User
revealed that the opensocial_id
column was of type varchar(30)
. Because the column was a varchar
, mysql couldn't use an index on it to do a comparison against a numeric query. You can see the reason forthis:
mysql> select "0234" = 234\G
*************************** 1. row ***************************
"0234" = 234: 1
mysql> select "234.0" = 234\G
*************************** 1. row ***************************
"234.0" = 234: 1
1 row in set (0.00 sec)
For any given numeric value in the query, there are several varchar representations that are "equal" to it (shown as value 1 above). As such, there is no way for mysql to do an index lookup on a varchar column to satisfy an equality condition against a numeric constant.
One easy solution would be to change the
varchar(30)
to a numeric type. Unfortunately, it turns out that opensocial user identifiers are not simple 64-bit numeric values -- depending on the opensocial platform provider they are sometimes zero-padded, sometimes not, etc. So, using a bigint unsigned
was not an option.What was the solution, then? We simply added quotes to the query:
SELECT `User`.`id` from `users` as `User` WHERE `opensocial_id` = '23424234' LIMIT 1;
This made the equality comparison between two varchars, and the index was used. The query now runs consistently in less than a few milliseconds.
In summary,
EXPLAIN
everything! Even the queries that look incredibly simple can be incredibly slow if they're not using the index you think they are.
No comments:
Post a Comment