• Our site uses a custom taxonomy to group pages by content area (e.g. regional and international news). The following query takes about 0.4 s and filters by this custom taxonomy and categories:

    
    SELECT wp_posts.ID
    FROM wp_posts
    LEFT JOIN wp_term_relationships
    ON (wp_posts.ID = wp_term_relationships.object_id)
    LEFT JOIN wp_term_relationships AS tt1
    ON (wp_posts.ID = tt1.object_id)
    WHERE 1=1
    AND wp_posts.ID NOT IN (955051,955206,955241,955142,954997,955261,955205,945949,955216,954926,954629,954970,954201,955028,955155,955026,955095,948822,11078,954545,335679,105100)
    AND ( wp_term_relationships.term_taxonomy_id IN (1,329,461,469,613,1111,1113,1114,1777,1778,1782,1783,1784,1785,1786,1795,1796,1797,1798,1800,1801,2172,2173,2345,2651,2652,2653,2654,2655,2656,2657,2658,2659,2660,2661,2663,2664,2665,2666,2667,2668,2702,2709,2720,2727,2735,2743,2939,3127,6271,6272,8301,11830,20241,23403,55642,78874,95617,101764,101765,103121,162524,162704,172907,178170,196469,226186,232671,238748,257957,257958,310668,316215,336628,336630,336631,388033,421629)
    AND tt1.term_taxonomy_id IN (461295,461296,461297,461301) )
    AND wp_posts.post_type IN ('post', 'page')
    AND ((wp_posts.post_status = 'publish'))
    GROUP BY wp_posts.ID
    ORDER BY wp_posts.post_date DESC
    LIMIT 0, 7
    

    Due to the size of our tables this query is quite expensive. Every post uses our custom taxonomy and multiple categories. Ideally this query could be done without processing huge amounts of data.

    Explain result:

    
    1	SIMPLE	tt1	
        NULL
    	range	PRIMARY,term_taxonomy_id	term_taxonomy_id	16	
        NULL
    	47731	100.00	Using where; Using index; Using temporary; Using f...	
    1	SIMPLE	wp_posts	
        NULL
    	eq_ref	PRIMARY,post_name,post_parent,type_status_date,pos...	PRIMARY	8	web.tt1.object_id	1	41.31	Using where	
    1	SIMPLE	wp_term_relationships	
        NULL
    	ref	PRIMARY,term_taxonomy_id	PRIMARY	8	web.tt1.object_id	6	50.00	Using where; Using index
    

    In this case the returned post will be just a few days old but it gets more complicated in case of pagination.

    Any ideas what we could improve here?

Viewing 5 replies - 1 through 5 (of 5 total)
  • Plugin Author OllieJones

    (@olliejones)

    Thanks for the interesting example! Looking at this, I have some comments.

    — This query is generated by the [WP_Tax_Query class](https://developer.www.ads-software.com/reference/classes/wp_tax_query/); the SQL code generation it uses is quite stylized. So the shape of this query–those long IN (123, 456, 789) clauses–comes from waaaay upstream in the logic.
    — I get a different EXPLAIN than you. But my tables have a bit of fake data in them, and yours have large quantities of real data.
    — The length of those IN-lists looks like it’s somehow triggering full-index-scans.
    — WordPress’s indexing of the taxonomy tables is pretty close to optimal.

    I wonder if you would run a monitor in the plugin (Tools / Index MySQL / Monitors) and do whatever generates that query a few times, then upload the monitor. The plugin’s monitor feature does EXPLAIN on each different shape of query, so we get a lot of good info.

    (Adding indexes is easy. Adding useful indexes? Not so much. That’s why we go to the trouble of helping you collect information about query performance — we need actual examples.)

    • This reply was modified 2 years, 11 months ago by OllieJones.
    • This reply was modified 2 years, 11 months ago by OllieJones.
    Thread Starter Christoph Bratschi

    (@cbratschi)

    I will collect some data later today on the production system. We are monitoring all queries in our theme and report everything above 10 s. If there are any problems we know them. These queries are > 90 % of all slow queries and this happens a few times per day. The backend runs 20 parallel PHP instances and push notifications can cause some spikes (we have lots of push subscribers). On the other end LiteSpeed Cache delivers stale data if possible to reduce such situations but they can still occur.

    The taxonomy query matches a lot of posts: in case of the categories I would guess > 90 % and the region about 70 %. Before the region data was stored in post_meta fields. Switching to taxonomies only doubled the performance, we expected more. There are enough performance guides available but they often only apply to average instances. If you run into special conditions it’s much harder to optimize the whole environment. I was also looking into clustered databases (Percona XtraDB cluster) which sustained higher loads but queries perform equally if not a little bit slower due to the cluster management. Also tried NDB but long texts are not compatible.

    Thread Starter Christoph Bratschi

    (@cbratschi)

    Well, I uploaded the saved monitor.

    Plugin Author OllieJones

    (@olliejones)

    Thanks. I’ll follow up by email.

    Plugin Author OllieJones

    (@olliejones)

    Please don’t hesitate to open another support thread if I can help you further.

Viewing 5 replies - 1 through 5 (of 5 total)
  • The topic ‘Taxonomy Optimization’ is closed to new replies.