Post meta queries bring down databases, cripple traffic scaling, and dramatically inflate page load times. Thankfully there’s an almost trivial solution.
Searching for Meta
The problem comes when you need to query for post meta. This is a data design failure. Examples include:
- A meta key for posts that should appear on the homepage
- Marking a post as belonging to another post to be shown in a list
- Bundling posts together into threads
- Setting a type, such as the type of fabric of a post of type t-shirt
- Setting an items colour
These are all examples of post meta that we want to search for. We might want to show only the red items, or the t-shirts made of cotton. These sound reasonable, but the query to find those posts with that meta can have monstrous performance costs.
If post meta was built to be searched, we’d be using it to store categories and tags.
Isn’t this just good/bad practice pedantry?
No, sites have been brought down by this, and it’s the reason a number of popular plugins don’t scale to high traffic. Install Query Monitor on a slow site, and you’ll see these queries. Page builders and meta box frameworks are particularly prone to enabling these kinds of queries.
Working at WordPress.com VIP, we see sites with meta queries handle poorly, and aren’t stable with medium/high traffic. Even in local environments, refactoring meta queries dramatically improves page load times. I’ve seen busy sites fall over after introducing simple meta queries.
This isn’t best practice or poor indentation. Meta queries cost real time and money. Adding more servers and faster CPUs won’t fix the problem because it’s a fundamentally expensive thing to do.
Can We Cache this?
Yes and no…
Caching will improve the best possible outcome, but you still have to run the query at least once. Coupled with timing issues your problem won’t be as big but you’ll still have a problem.
All of this is irrelevant though if your queries are too slow to finish, it’ll never complete in time to be cached
Can I Adjust The Database?
Yes and no..
You can add an Index to the post meta table, but this isn’t a true fix because:
- the index will only store the beginning of the meta value, this works well for simple meta values such as a number, but for anything else it will be just as slow
- the more you store the less efficient the index, storing the entire meta value negates the point of the index
- It’s still slow and gives a less than stellar speed bump
- You have to create the index yourself, requiring technical skill many don’t have ( some SQL foo )
A meta index may help if your meta value is a short numeric ID, but the gains are minimal
The Easy Fix
The problem is that these meta values should be terms in a custom taxonomy. Taxonomies are built to organise things and provide a way of filtering down to a specific set of posts. Post meta should only be used for information that isn’t going to be searched of filtered for.
Lets take our previous examples and look at how a custom taxonomy would be used:
- Indicating that a post should appear on the homepage
The best way to do this is to have a custom taxonomy that controls where the post is shown, that has the homepage checked by default. A lot of people use post meta, and then go one step further by using post meta to say what they don’t want on the homepage. Those queries are orders of magnitude worse than a simple post meta query, and hideously expensive. - Marking a post as belonging to another post to be shown in a list, or bundling posts together into threads
A thread taxonomy where the name of each term is the Post ID of the original post, or a named thread. This taxonomy could be a hidden taxonomy with no UI that works behind the scenes - Setting a type, such as the type of fabric of a post of type
t-shirt
A Fabric material taxonomy - Setting an items colour
A colour taxonomy with terms for each colour
But What’s The Catch?
The upfront cost of converting your data from post meta to terms is the main downside. To do this I would use a WP CLI command to convert the data. Keep the post meta data around so that you can compare the two queries.
A taxonomy query is both simpler to write, significantly faster to run, and comes with free benefits, such as:
- A database layout that’s optimised for these kinds of queries
- The queries are much easier to type
- A free UI in the admin interface with menu options
- Term meta
- Easy REST API support
- A free set of archives, I have a talk taxonomy and you can see all my WordPress talks here, or all my WP CLI talks
- A set of theme templates,
taxonomy.php
,taxonomy-{taxname}.php
,taxonomy-{taxname}{termID}.php
, etc
Taxonomies simplify things while providing plenty of benefits.
Super interesting post, thank you!
I must admit, up until now I used to only consider taxonomies whenever the data I wanted to save was very similar to categories or tags. For everything else, I used post meta. I’ll try to change that!
I have a few questions, though:
1) What do you do for post meta that is usually saved as an array, with multiple values? This little plugin of mine for example stores 2 different colours for each post. Would you suggest creating 2 different taxonomies?
2) It’s relatively easy to create custom meta boxes in the admin, allowing one to create user-friendly input fields for custom meta data. To take the example above, I add a colour picker to the post editor. What do you do when data has to be saved as a taxonomy term?
Thanks 😀
1) I must admit, exact hex or RGB value colours don’t lend themselves to terms, you’d end up with hundreds of terms that have a single post, so it’s a case of precision and use case.
The same is true of prices, you wouldn’t search for $4.98, but you might want to show all products between $1 and $5. I’d do this by adding a price range taxonomy, defining my terms as buckets, then putting products in each bucket when they’re saved/updated.
The same strategy might work for colours, but you’d need some rules about how to define those buckets.
But if on the other hand you mean an example of ‘red & blue’ or ‘green & orange’, I’d have a non-hierarchical taxonomy and use a tax_query to filter.
TLDR: Store it as a post meta for precision when displaying on the page or doing calculations, but store a more general version as a term that you can filter on for searching and browsing
2) While I’m not sure that the data a colour picker generates is best stored in a term, creating a metabox should be just as easy as post meta but using `wp_set_post_terms` etc instead.
Admittedly people have had a lot more time and incentive to build meta box frameworks, but one of the more popular and well tested/scalable frameworks Field Manager by Alley Interactive supports this. When defining a field, tell it to use a term datasource object. Coincidentally there’s a Colourpicker field.
Our friend Alex King (RIP) had some similar advice ~5 years ago: http://alexking.org/blog/2011/08/29/wordpress-post-meta-taxonomies
Good stuff, and thank you for sharing based on your experience.
Nice post, I’m going to add it to the further reading section 🙂
Indexes on meta_value won’t really help with WP_Query, because internally WordPress does CAST(), which needs to read and convert all values, before filtering any of them out, which means the index can’t really be used, unless you hack into the generated queries and try to strip away the CAST(), in which case you may have other problems when working with numeric values.
Also surprised you haven’t mentioned Elasticsearch on Sphinx. That would go under a “Not so easy fix” heading, but once it’s all set up, it works like a charm, especially with 10up’s ElasticPress plugin.
So from reading both yours and Alex’s articles, would it be right to say that it’s fine to use custom fields (post meta) for elements on a page which won’t be searched for – ie Phone Number, Gallery, link to external website, Google Map etc? But then to use custom taxonomies for elements which will be queried – ie to display all posts which are “Self Catering or B&Bs”, “Dogs Welcome/Not Welcome” or “Has Wifi”?
We recently released a plugin to automatically fix this issue for location data, which is a major offender of post meta abuse:
https://wordpress.org/plugins/wp-geometa/
Hi Tom,
This is the kind of posts I LOVE to read, just perfect for people like me = not developers, but able to understand when explained clearly 🙂 and I also like the fact that the “you shouldn’t” is followed by the “why” … thanks a lot!
Are you on your gravatar with Finnish’ Moominpappa? If yes, I must friend you somewhere on social networks absolutely 😉
Cheers!
Thanks 🙂 It’s from the Moomin Café in Hong Kong, they sit you down with giant moomins while you eat
Hong Kong!! Good to know 🙂 Thx
Great post Tom. Both eye opening and frustrating in equal measure, as the more performant solution is right under our noses!
Great timing of this post – it made me realise I should be using a taxonomy instead of custom field for a project I’m working on at this very moment, and inspired me to optimize another query to be way less complex. Thank you! 🙂
I noticed this, but I didn’t know how to say it. It’s so hard to communicate how this stuff affects businesses, without sounding myopic and pedantic. Thanks for writing this, it’s really well explained.
To ease the friction of converting meta values to terms maybe this snippet can be helpful:
https://github.com/deeptiboddapati/convertmetatotaxaterms
Pingback: How to vet a plugin for add on development - Austin WordPress Meetup
Pingback: Custom Fields vs. Custom Taxonomies, when to (not) use? - Meta Box
Isn’t part of the problem with saying “use taxonomies for search” that they still need meta-data to be useful for searching more than a title because these are general purpose objects?
On pricing, I’m particularly worried that categories might be a lower-effort fit, but certainly not a best-case (for the end-user) fit.
As an alternative, you could instead of using a direct meta query, simply wrap these expensive queries in a pre-primed cache for the most common search. By separately priming the cache (perhaps a curl call to a REST endpoint, or admin-ajax), you can relatively swiftly get the result you like.
Alternatives would be some method (I’ve seen lots of them), connecting to a copy of the database in a master->slave setup (I don’t particularly like the terminology, but it’s whats in the mysql manual). This would also allow you to spread the load (search is always going to be read-only).
I’ll concede logging searches then becomes another discussion, but by thinking in blocks of what you want to do, you become less restrictive about how you use WordPress, and open yourself to more options.
No because so are post meta key value pairs. You can still make a key value pair that’s searchable, except the key is the taxonomy, and the value is the term. It’s a matter of imagination, and deciding the trade off. A lot of people are unaware they’re making a trade off with performance and scale as the cost they’re paying.
As for caching, the problem there is it assumes the server can finish the query to begin with. Cache stampedes and race conditions also factor in. As for database load, adding slave databases could get expensive very quickly, especially at scale, but it’s still fundamentally a slow query even in the most optimistic of situations ( a fast server with a single visitor )
I’m also unsure if you mean the same thing when you say search. I mean it as in a database finding posts given X Y and Z, rather than me typing a search term into a search box, or using the ‘s’ query variable in WP_Query. If you need to log post queries or user searches, there are dedicated systems for that such as elastic search + Kibana to name one
This post will save my client time and money – real money. One problem is using dynamic.ooo “view” element in elementor which is just a grid basically they dont allow you to orderby a taxonomy. 🙁
I’m late to this … still, with modern MySQL / MariaDB it’s possible to index wp_postmeta to overcome many of these problems. Here’s a plugin to do that. https://wordpress.org/plugins/index-wp-mysql-for-speed/
Hierarchical taxonomy queries (like in WooCommerce) are still a big problem.
At most indexes mitigate the problem, they don’t solve it. Indexes can help with edge cases, but you can’t keep adding indexes until all the problems go away. If that were the case there’d be no reason for indexes in the first place, indexes have a cost. It’s also not an excuse to just keep using post meta for absolutely everything.
There is a tradeoff here and most people are unaware of it with major performance implications. You can still write expensive multi-dimensional queries using tax_query, that doesn’t mean taxonomies have no value from a performance standpoint.