This week I encountered an issue that I hadn’t seen in a while. The ORM in a CMS project that I work on automatically joins to many subclass tables, causing normal queries to load an object to be joined to a dozen tables or so. Then when you combine that to join to another of the same type of object, you can easily be joining to twenty or more tables.
We noticed a query in the slow query log that was running a fairly simple query, with straight-forward criteria in the where clause that was well-indexed with good cardinality. All of the joins were done on primary/foreign key combinations that were properly indexed. But this simple query was consistently taking over 1.5 seconds to run.
So, of course, I ran the query through "DESCRIBE <query>"
in MySQL and saw that if I
changed the order that the tables were joined, MySQL altered the query plan slightly. The
other option that resulted in the more efficient query plan was providing optimizer
(index) hints, which also resulted in a different order of joining the tables. I even
considered whether or not this might be one of those strange situations where doing a
straight join might be needed. But I didn’t want to rely on any of these hacks as the way
to “fix” this problem, for two reasons:
- they are mostly non-deterministic (re-ordering the table joins isn’t guaranteed to continue providing similar results in production or even over time as the dataset changes and the optimizer adapts)
- we don’t really have control over the order of the joins since these queries are built by an ORM
Then it dawned on me: even when I was running "DESCRIBE <query>"
in MySQL, it was taking
over 1.5 seconds, but when I ran "DESCRIBE <tweaked query>"
it was only taking 0.04
seconds. This meant that it wasn’t actually running the query that took so long; it was
the time MySQL took to figure out the query plan! The fix? The MySQL
optimizer_search_depth
parameter (more info
here)
Essentially, this setting tells MySQL how hard it should try to find the absolutely most
efficient query plan for a given query. The more joins you have, the harder MySQL tries to
find, and the more possibilities that it has to filter through. And, to top it all off,
the default is a horrid 62! By changing this value to zero, you can tell MySQL “you
decide how hard to try”, which seems to mean: min($numberOfTables, 7)
according to one
source. For us, changing this value to zero resulted in the query running in 0.04 seconds
or less. Essentially, MySQL will try a few query plans (instead of a ton) and then pick
the best of those, and since all your queries should be using indexes anyway (your queries
all are well indexed, right?) even if it’s not the absolute best query plan, it’s still
much better than actually searching through every possible query plan!
For more info, see Peter’s great blog on the MySQL Performance Blog: http://www.mysqlperformanceblog.com/2012/04/20/joining-many-tables-in-mysql-optimizer_search_depth/