• Resolved Tony G

    (@starbuck)


    I have not installed your plugin yet, but I plan to try it soon. I have many sites on a VPS and some of them share a MySQL DB server. I would like to avoid the burden of an index database in one of these site DB servers by putting the index data in a different server. Not only would this off-load the save and query of the index data, but it also reduces the size of the site backup significantly. With my current search solution the WP database is huge, mostly with index data.

    Is there a way now or soon where we can index site content and store the index in a different database? For example: server1, sitedb1 has all wp_tables. I’m thinking the load on that server can be reduced if the index is in server2, searchdb1, with all index tables.

    That would create an issue if you’re using wp_query. I wouldn’t mind creating an empty site with a full WP database/tables just to serve as a container for your index data. But I don’t know if we can specify the database with WP_Query … if we can override the database info that is in wp-config.php.

    Thanks!

Viewing 5 replies - 1 through 5 (of 5 total)
  • Plugin Author Epsiloncool

    (@epsiloncool)

    Hi @starbuck

    Yes, the WPFTS uses WP_Query() and it’s the feature that sufficiently simplifies the job for programmers who are looking for a way to improve the search without sufficient modifications of the original code.

    The plugin is developed to work in conjunction with the wp_posts table which means: if you want to use this plugin on another database, you should have a copy of the wp_posts table on the same server where the search index is.

    The same thing, if you want to use taxonomies in wp_query requests, you also need to have taxonomy tables on the “index” server.

    The WP_Query (and whole MySQL, actually) can not work with two different MySQL servers at the same time. The MySQL query always sends a request to one specific MySQL server and there are no possibilities (as far as I know) to specify different MySQL server tables in that MySQL query.

    It sounds not acceptable, right.

    But if someone asked me “how can we have only one huge VPS server to make a search and a bunch of small VPS that can use the search but not have search index on them?”. I would answer that:
    You can set up the complete WP instance with WPFTS on them and a COPY of posts from your “leaf servers” and create a simple API on this server to use it as a “search engine”. It’s implementable and we even had such a project.

    The API should be able to receive new data from “leaf servers”, index those data and make a search on request from the API.

    But you can implement the same scheme using modern C++based (means “very fast”) engines like Elastic Search. It is only a bit complex but will give you much more flexibility.

    Thanks for the question btw!

    Thread Starter Tony G

    (@starbuck)

    I thank you for a fast and complete response. I the two of us think in a similar way : there is always a solution if you really want it, but it might not be pretty. ??

    I was hoping that there might be an option that I did not know on WP_Query($args), like ‘DB_HOST’=>’mydb2.domain.tld’, ‘DB_NAME’=>’wpfts’…

    Let’s be clear : You certainly need WP_Query to build your database because you must read from wp_posts. However, your schema does not need to be anything like the wp_* tables, and I don’t think it is. Are you really using WP_Query to read and write your index tables?

    If yes, then we agree that we can’t use more than one database in a single WP instance. That’s unfortunate.

    If no, if you are not using WP_Query and your index is not a common WP structure, then you might be doing this:

    $mysqli = new mysqli(DB_HOST, DB_USER, DB_PASSWORD, "wpfts_db1"); 
    $result = $mysqli->query("SELECT postid FROM wpfts_IndexFoo ...");

    Other search tools that use other databases (like Elastic Search) need to do that. A search engine for WP doesn’t “need” to store the index data in the WP DB, and it doesn’t need to use the same structures. If you do that, I’m sure it’s just a convenience. I would be surprised if you store indexes in WP CPTs, for example, because I think that would be unnecessarily bulky.

    So to be clear, do you query your index data with WP_Query?

    Thanks again!

    Moderator Dion Hulse

    (@dd32)

    Meta Developer

    Hi @starbuck,

    If you want to use multiple databases, you can use a WordPress plugin such as HyperDB to allow WordPress to connect to multiple database locations. There’s also others such as LudicrousDB which do the same thing.

    It’s mostly used for database replication, for example a primary DB which is where all writes go to, and a secondary server which is replicated from the primary for all read queries. You could also have it setup that the primary gets Read & Write queries, and the secondary just gets Reads, effectively doubling your read query capabilities.

    You could also setup rules to specify where certain tables are, so you could shift just the index tables over to a different database and WordPress wouldn’t know any different BUT if those tables are part of a MySQL join all the tables related to it also have to be in that database.

    I don’t know how this plugin works, other than that it creates extra tables, so I’m hesitant to suggest it can work how you’re envisioning it to, but that simplifying it to being able to have multiple servers acting as a read database might spread the load enough to remove your concerns of performance.
    I guess you could also set up a secondary which is 100% replicated from the primary, and just have all queries which match a search syntax hit the secondary..

    Plugin Author Epsiloncool

    (@epsiloncool)

    Hi @starbuck

    For sure, we are not using WP_Query() to access WPFTS index tables. Our index tables are separate, no WP core tables are used for this.

    But as you know, the whole WP_Query() code has the goal to create one big and good MySQL query to execute it and return post IDs or full posts (depends on parameters). So the result of WP_Query() is post_ids or posts[].

    How does WPFTS work? From version 1.37.x it creates the temporary table and then populates it with (post ids+relevance) pairs that are related to searched string and then WPFTS modifies WP_Query-constructed MySQL query to use this temporary table with LEFT JOIN to filter out those posts which are not related to search string… I hope you understand that <–

    What I think is we normally could store WPFTS index tables somewhere on a different server. But temporary table definitely has to be stored on the WP website’s database to be LEFT JOINed with WP_Query’s SQL.

    You gave me an interesting idea, maybe it’s not so bad and we should implement this scheme. However, you are the FIRST person in 5 years who asked for this.

    May you configure your backup utility to not backup wpftsi_* tables? Will it solve the issue in a simpler way?
    I feel that the “cool feature” to store index separately will not have any fans at all.

    Plugin Author Epsiloncool

    (@epsiloncool)

    This is an example of generated MySQL query

    SELECT SQL_CALC_FOUND_ROWS 
        wp_posts.ID, 
        wpftsi_t.relev 
    FROM wp_posts 
    inner join ( 
        select 
           fi.tid, 
           t_end.relev from <code>wpftsi_index</code> fi 
        straight_join ( 
            select 
                tbase.index_id, 
                sum((trel.<code>pow</code> + (trel.<code>res</code> / LOG(tbase.n + 1)) / 4.791122314940371) * (case tbase.token when "post_title" then 0.9 when "post_content" then 0.5 when "attachments_content" then 0.5 when "cats" then 0.5 when "meta_data" then 0.4 when "post_excerpt" then 0.5 when "tags" then 0.5 else 1 end) / (2)) relev 
            from <code>wpftsi_tp</code> trel 
            straight_join <code>wpftsi_docs</code> tbase 
                 on (trel.did = tbase.id) and (trel.q_id = "348") 
            group by tbase.index_id 
            order by NULL 
        ) t_end 
            on t_end.index_id = fi.id 
        order by null 
        ) wpftsi_t 
        on wpftsi_t.tid = wp_posts.ID 
    WHERE 
        1=1 and (((1))) and 
        (wpftsi_t.relev > 0) AND 
        wp_posts.post_type IN ('buddydrive-file', 'attachment', 'post', 'custom_posttype') AND ((wp_posts.post_status = 'publish' OR wp_posts.post_status = 'inherit')) AND (wp_posts.post_mime_type = 'application/pdf' OR wp_posts.post_mime_type = 'application/pdf' OR wp_posts.post_mime_type = '') ORDER BY (wpftsi_t.relev) DESC, wp_posts.post_date DESC LIMIT 0, 10
    

    As you can see, 3 tables from WPFTS are used directly in the query:
    wpftsi_docs and wpftsi_index contains links between search index parts and wp_posts
    and wpftsi_tp is a temporary table with post IDs found in the WPFTS search index.

    And it looks like 2 or those tables should be placed both at the remote server and wp server. _tp table can be placed on the wp server only and we can populate it remotely from the index server.

    Yes, _docs and _index tables are small (they have the same count of rows as wp_posts does, and only certain amount of columns) so we can duplicate them with no problem.

Viewing 5 replies - 1 through 5 (of 5 total)
  • The topic ‘Can we save index in a different database?’ is closed to new replies.