Elasticsearch Wrapper for WP_Query
A drop-in replacement for WP_Query to leverage Elasticsearch for complex queries.
Warning!
This plugin is currently in beta development, and as such, no part of it is guaranteed. It works (the unit tests prove that), but we won't be concerned about backwards compatibility until the first release. If you choose to use this, please pay close attention to the commit log to make sure we don't break anything you've implemented.
Instructions for use
This is actually more of a library than it is a plugin. With that, it is plugin-agnostic with regards to how you're connecting to Elasticsearch. It therefore generates Elasticsearch DSL, but does not actually connect to an Elasticsearch server to execute these queries. It also does no indexing of data, it doesn't add a mapping, etc. If you need an Elasticsearch WordPress plugin, we also offer a free and open-source option called SearchPress.
Once you have your Elasticsearch plugin setup and you have your data indexed, you need to tell this library how to use it. If the implementation you're using has an included adapter, you can load it like so:
es_wp_query_load_adapter( 'adapter-name' );
If your Elasticsearch implementation doesn't have an included adapter, you need to create a class called ES_WP_Query
which extends ES_WP_Query_Wrapper
. That class should, at the least, have a method query_es()
which executes the query on the Elasticsearch server. Here's an example:
class ES_WP_Query extends ES_WP_Query_Wrapper {
protected function query_es( $es_args ) {
return wp_remote_post( 'http://localhost:9200/wordpress/post/_search', array( 'body' => json_encode( $es_args ) ) );
}
}
See the included adapters for examples and inspiration.
Once you have an adapter setup, there are two ways you can use this library.
The first, and preferred, way to use this library is to instantiate ES_WP_Query
instead of WP_Query
. For instance:
$q = new ES_WP_Query( array( 'post_type' => 'event', 'posts_per_page' => 20 ) );
This will guarantee that your query will be run using Elasticsearch (assuming that the request can and should use Elasticsearch) and you should have no conflicts with themes or plugins. The resulting object ($q
in this example) works just like WP_Query outside of how it gets the posts.
The second way to use this library is to add 'es' => true
to your WP_Query arguments. Here's an example:
$q = new WP_Query( array( 'post_type' => 'event', 'posts_per_page' => 20, 'es' => true ) );
In one regard, this is a safer way to use this library, because it will fall back on good 'ole WP_Query
if the library ever goes missing. However, because it depends on the normal processing of WP_Query, it's possible for a plugin or theme to create conflicts, where that plugin or theme is trying to modify WP_Query through one of its provided filters (see below for additional details). In that regard, this can be a very unsafe way to use this library.
Regardless of which way you use the library, everything else about the object should work as per usual.
Differences with WP_Query and Unsupported Features
Meta Queries
- Regexp comparisons are not supported. The regular expression syntax is slightly different in Elasticsearch vs. PHP, so even if we tried to support them, it would result in a lot of unexpected behaviors. Furthermore, regular expressions are very resource-intensive in Elasticsearch, so you're probably better off just using WP_Query for these queries regardless.
- If you try to use a regexp query, ES_WP_Query will throw a
_doing_it_wrong()
notice.
- If you try to use a regexp query, ES_WP_Query will throw a
- LIKE comparisons are incongruous with MySQL. In ES_WP_Query, LIKE-comparison meta queries will run a
match
query against the analyzed meta values. This will behave similar to a keyword search and will generally be more useful than a LIKE query in MySQL. However, there are notably differences with the MySQL implementation and ES_WP_Query will very likely produce different search results, so don't expect it to be a drop-in replacement.
A note about WP_Query filters
Since this library removes MySQL from most of the equation, the typical WP_Query filters (posts_where
, posts_join
, etc.) become irrelevant or -- in some extreme situations -- conflicting.
The gist of what happens whn you use WP_Query( 'es=true' )
is that on pre_get_posts
, the query vars are sent to a new instance of ES_WP_Query
. The query vars are then replaced with a simple post__in
query using the IDs which Elasticsearch found. Because the generated SQL query is far simpler than the query vars would suggest, a plugin or theme might try to manipualte the SQL and break it.
Action/Filter | Using ES_WP_Query |
ES_WP_Query Equivalent |
Using WP_Query with 'es' => true |
---|---|---|---|
pre_get_posts |
No issues | es_pre_get_posts |
Potential conflicts |
posts_search |
N/A | es_posts_search |
Should be N/A |
posts_search_orderby |
N/A | es_posts_search_orderby |
Should be N/A |
posts_where |
N/A | es_query_filter |
Potential conflicts |
posts_join |
N/A | Potential conflicts | |
comment_feed_join |
N/A | Potential conflicts | |
comment_feed_where |
N/A | Potential conflicts | |
comment_feed_groupby |
N/A | Potential conflicts | |
comment_feed_orderby |
N/A | Potential conflicts | |
comment_feed_limits |
N/A | Potential conflicts | |
posts_where_paged |
N/A | es_posts_filter_paged , es_posts_query_paged |
Potential conflicts |
posts_groupby |
N/A | Potential conflicts | |
posts_join_paged |
N/A | Potential conflicts | |
posts_orderby |
N/A | es_posts_sort |
Potential conflicts |
posts_distinct |
N/A | Potential conflicts | |
post_limits |
N/A | es_posts_size , es_posts_from |
Potential conflicts |
posts_fields |
N/A | es_posts_fields |
No issues |
posts_clauses |
N/A | es_posts_clauses |
Potential conflicts |
posts_selection |
N/A | es_posts_selection |
Potential conflicts |
posts_where_request |
N/A | es_posts_filter_request , es_posts_query_request |
Potential conflicts |
posts_groupby_request |
N/A | Potential conflicts | |
posts_join_request |
N/A | Potential conflicts | |
posts_orderby_request |
N/A | es_posts_sort_request |
Potential conflicts |
posts_distinct_request |
N/A | Potential conflicts | |
posts_fields_request |
N/A | es_posts_fields_request |
No issues |
post_limits_request |
N/A | es_posts_size_request , es_posts_from_request |
Potential conflicts |
posts_clauses_request |
N/A | es_posts_clauses_request |
Potential conflicts |
posts_request |
N/A | es_posts_request |
Potential conflicts |
split_the_query |
N/A | Potential conflicts | |
posts_request_ids |
N/A | Potential conflicts | |
posts_results |
N/A | es_posts_results |
No issues |
comment_feed_join |
N/A | Potential conflicts | |
comment_feed_where |
N/A | Potential conflicts | |
comment_feed_groupby |
N/A | Potential conflicts | |
comment_feed_orderby |
N/A | Potential conflicts | |
comment_feed_limits |
N/A | Potential conflicts | |
the_preview |
N/A | es_the_preview |
Potential conflicts |
the_posts |
N/A | es_the_posts |
No issues |
found_posts_query |
N/A | Potential conflicts | |
found_posts |
N/A | es_found_posts |
Potential conflicts |
wp_search_stopwords |
N/A | N/A | |
get_meta_sql |
N/A | get_meta_dsl |
N/A |
date_query_valid_columns |
No issues | No issues | |
get_date_sql |
N/A | get_date_dsl |
N/A |
Note that in the "Using WP_Query
with 'es' => true
" column, "no issues" and "N/A" are not guaranteed. For instance, in almost every filter, the WP_Query
object is passed by reference. If a plugin or theme modified that object, it could create a conflict. The "no issues" and "N/A" notes assume that filters are being used as intended. Lastly, everything is dependant on pre_get_posts
. If a plugin or theme were to hook in at a priority > 1000, it could render everything a potential conflict.
Contributing
Any help on this plugin is welcome and appreciated!
Bugs
If you find a bug, check the current issues and if your bug isn't listed, file a new one. If you'd like to also fix the bug you found, please indicate that in the issue before working on it (just in case we have other plans which might affect that bug, we don't want you to waste any time).
Feature Requests
The scope of this plugin is very tight; it should cover as much of WP_Query as possible, and nothing more. If you think this is missing something within that scope, or you think some part of it can be improved, we'd love to hear about it!
Unit Tests
Unit tests are included using phpunit. In order to run the tests, you need to add an adapter for your Elasticsearch implementation.
- You need to create a file called
es.php
and add it to thetests/
directory. es.php
can simply load one of the included adapters which is setup for testing. Otherwise, you'll need to do some additional setup.- If you're not using one of the provided adapters:
es.php
needs to contain or include a function namedes_wp_query_index_test_data()
. This function gets called whenever data is added, to give you an opportunity to index it. You should force Elasticsearch to refresh after indexing, to ensure that the data is immediately searchable.- NOTE: Even with refreshing, I've noticed that probably <0.1% of the time, a test may fail for no reason, and I think this is related. If a test sporadically and unexpectedly fails for you, you should re-run it to double-check.
es.php
must also contain or include a classES_WP_Query
which extendsES_WP_Query_Wrapper
. At a minimum, this class should contain aprotected function query_es( $es_args )
which queries your Elasticsearch server.- This file can also contain anything else you need to get everything working properly, e.g. adjustments to the field map.
- See the included adapters, especially
travis.php
, for examples.