I just pushed a change that enables CDN delivery of most of our static content (except for the music itself... for now). In theory it should make browsing the site a bit quicker, especially for those users not in the northeastern US.
If you happen to read this and notice any difference, comment below!
Monday, April 21, 2008
Saturday, April 12, 2008
EXPLAIN everything, part 2
I previously wrote a post urging people trying to optimize their database usage to run the "EXPLAIN" command on everything. This is easy advice to ignore in the case of simple queries, but even the simple queries can be big performance problems if you miss an important detail.
The other day, a friend was complaining of slow queries and was looking for some help fixing them. The first step we took was to take the "problem page" and enable debug output that printed each query that was run along with the number of milliseconds each took. If you don't have a facility for easily doing this (at the very least on a staging or development server) then you are essentially wandering in the dark for optimization.
Looking at the debug output, we found something surprising. The slowest query on the page was the seemingly innocuous:
My friend was sure there was an index on
The entry in the
So, the next question was why mysql was deciding not to use the index for this query. Doing a
this:
For any given numeric value in the query, there are several varchar representations that are "equal" to it (shown as value 1 above). As such, there is no way for mysql to do an index lookup on a varchar column to satisfy an equality condition against a numeric constant.
One easy solution would be to change the
What was the solution, then? We simply added quotes to the query:
This made the equality comparison between two varchars, and the index was used. The query now runs consistently in less than a few milliseconds.
In summary,
The other day, a friend was complaining of slow queries and was looking for some help fixing them. The first step we took was to take the "problem page" and enable debug output that printed each query that was run along with the number of milliseconds each took. If you don't have a facility for easily doing this (at the very least on a staging or development server) then you are essentially wandering in the dark for optimization.
Looking at the debug output, we found something surprising. The slowest query on the page was the seemingly innocuous:
SELECT `User`.`id` FROM `users` AS `User` WHERE `opensocial_id` = 219771253 LIMIT 1;
My friend was sure there was an index on
User.opensocial_id
yet the debug output showed that it took more than a second. With an index, this should take less than a millisecond. I asked him to run an EXPLAIN
on it, and we saw:
mysql> EXPLAIN SELECT `User`.`id` FROM `users` AS `User` WHERE `opensocial_id` = 2778153 LIMIT 1;
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | User | ALL | opensocial_id | NULL | NULL | NULL | 272877 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)
The entry in the
possible_keys
column confirmed that there indeed was an index on opensocial_id
. However, NULL
in the key
column meant that mysql was deciding the opensocial_id
key was either unusable or didn't have good enough selectivity to choose over a full table scan. The high value in the rows
column along with Using where
under Extra
indicated that mysql was reading through every row in the table, running a comparison, and outputting those rows for which the condition matched.So, the next question was why mysql was deciding not to use the index for this query. Doing a
SHOW CREATE TABLE User
revealed that the opensocial_id
column was of type varchar(30)
. Because the column was a varchar
, mysql couldn't use an index on it to do a comparison against a numeric query. You can see the reason forthis:
mysql> select "0234" = 234\G
*************************** 1. row ***************************
"0234" = 234: 1
mysql> select "234.0" = 234\G
*************************** 1. row ***************************
"234.0" = 234: 1
1 row in set (0.00 sec)
For any given numeric value in the query, there are several varchar representations that are "equal" to it (shown as value 1 above). As such, there is no way for mysql to do an index lookup on a varchar column to satisfy an equality condition against a numeric constant.
One easy solution would be to change the
varchar(30)
to a numeric type. Unfortunately, it turns out that opensocial user identifiers are not simple 64-bit numeric values -- depending on the opensocial platform provider they are sometimes zero-padded, sometimes not, etc. So, using a bigint unsigned
was not an option.What was the solution, then? We simply added quotes to the query:
SELECT `User`.`id` from `users` as `User` WHERE `opensocial_id` = '23424234' LIMIT 1;
This made the equality comparison between two varchars, and the index was used. The query now runs consistently in less than a few milliseconds.
In summary,
EXPLAIN
everything! Even the queries that look incredibly simple can be incredibly slow if they're not using the index you think they are.
Wednesday, April 09, 2008
PHP Pop Quiz
What would you expect the following code to do?
I think most rational people would expect the program to exit with the message "goodbye world."
It turns out that PHP is not so rational. Because
The solution, of course, is to use the odd
The problem with that method, though, is that
Is there a reasonable solution to this?
$foo = 0;
if ($foo == "DB") {
die("hello world");
}
die("goodbye world");
?>
I think most rational people would expect the program to exit with the message "goodbye world."
It turns out that PHP is not so rational. Because
$foo
is a numeric value, it's compared using a numeric comparator with "DB"
. This typecast coerces "DB"
into an integer value 0, so the equality matches.The solution, of course, is to use the odd
===
operator which is unique to PHP. This means something along the lines of "I really mean it. No, seriously!" and disables the coercion of the dissimilar types.The problem with that method, though, is that
"0" === 0
evaluates to false.Is there a reasonable solution to this?
Subscribe to:
Posts (Atom)