• Resolved lisota

    (@lisota)


    I like the concept of this cache, and decided to try it on our production servers last night during somewhat modest traffic.

    APCu was caching correctly, at least according to the APCu status page showing hits/misses and memory usage. We were getting about 3500 requests/sec on the APCu cache on each of our two web servers.

    CPU usage on our two web servers did go up by about 25-30%, which I suppose makes sense given how this works.

    However, the load on our MySQL database skyrocketed and we ran into hard limits: IOPS maxed out, database CPU usage was high, and we ultimately exhausted our PHP-FPM pool as it spawned more and more processes.

    I guess I can provision more IOPS on our DB server and try again, but any guidance here? The improved performance over memcached or redis might come at the expense of having to trick out the database server to support this style of caching.

Viewing 6 replies - 1 through 6 (of 6 total)
  • Plugin Author Daniel Bachhuber

    (@danielbachhuber)

    Hi @lisota,

    The expected behavior is higher than average DB usage at the beginning, then for writes to taper off as the cache becomes fully populated in the DB.

    However, if your cache usage isn’t optimized and WordPress is writing to the cache more often than it should, then you’ll see continued high usage of the DB. Furthermore, any calls to wp_cache_flush() will cause the entire cache to be dropped and repopulated, so you’ll want to make sure to audit your codebase for this.
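
The effect of a flush on database writes can be illustrated with a toy model. This is a minimal sketch, not LCache's actual implementation: `TwoTierCache`, `simulate`, and the `flush_every` parameter are hypothetical names, and plain dictionaries stand in for APCu (L1) and the database-backed cache (L2).

```python
class TwoTierCache:
    """Toy model: a local in-memory L1 in front of a database-backed L2."""

    def __init__(self):
        self.l1 = {}        # stands in for APCu on one web server
        self.l2 = {}        # stands in for the cache stored in the database
        self.l2_writes = 0  # database writes caused by cache population

    def get(self, key, compute):
        if key in self.l1:
            return self.l1[key]
        if key in self.l2:
            # L1 miss, L2 hit: promote the value into L1 without a DB write.
            self.l1[key] = self.l2[key]
            return self.l1[key]
        # Miss in both tiers: compute the value and write it to both.
        value = compute(key)
        self.l2[key] = value
        self.l2_writes += 1
        self.l1[key] = value
        return value

    def flush(self):
        # Rough analogue of a full cache flush: both tiers are emptied.
        self.l1.clear()
        self.l2.clear()


def simulate(rounds, keys, flush_every=None):
    """Return the number of L2 (database) writes in each round of requests."""
    cache = TwoTierCache()
    writes_per_round = []
    for r in range(rounds):
        before = cache.l2_writes
        if flush_every and r > 0 and r % flush_every == 0:
            cache.flush()
        for k in range(keys):
            cache.get(k, lambda key: key * 2)
        writes_per_round.append(cache.l2_writes - before)
    return writes_per_round


steady = simulate(rounds=5, keys=100)                  # writes taper off
flushed = simulate(rounds=5, keys=100, flush_every=2)  # writes keep recurring
print(steady)   # [100, 0, 0, 0, 0]
print(flushed)  # [100, 0, 100, 0, 100]
```

In the model, an unflushed cache writes to the database only during the first round, while a periodic flush brings back the full write load every time it fires.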

    If you were to install WP Redis instead, you could inspect the read/write ratio to Redis to ensure all cache keys were being used appropriately.
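
One way to inspect such a ratio is to parse the output of Redis's `INFO commandstats`, which reports per-command call counts in lines of the form `cmdstat_<name>:calls=...`. The following is a rough sketch under assumptions: the sample text is made up, and the split of commands into `READ_COMMANDS` and `WRITE_COMMANDS` is an illustrative guess rather than a list of what WP Redis actually issues.

```python
# Assumed grouping of commands; adjust to what your object cache really uses.
READ_COMMANDS = {"get", "mget", "hget", "hmget", "exists", "hexists"}
WRITE_COMMANDS = {"set", "setex", "hset", "del", "hdel", "incrby", "decrby", "hincrby"}


def parse_commandstats(text):
    """Map command name -> number of calls from `INFO commandstats` text."""
    calls = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("cmdstat_"):
            continue  # skip the section header and blank lines
        name, _, fields = line.partition(":")
        command = name[len("cmdstat_"):].lower()
        for field in fields.split(","):
            key, _, value = field.partition("=")
            if key == "calls":
                calls[command] = int(value)
    return calls


def read_write_ratio(calls):
    """Return (reads, writes, reads per write) for the assumed command groups."""
    reads = sum(n for c, n in calls.items() if c in READ_COMMANDS)
    writes = sum(n for c, n in calls.items() if c in WRITE_COMMANDS)
    ratio = reads / writes if writes else float("inf")
    return reads, writes, ratio


# Invented sample output, for illustration only.
SAMPLE = """\
# Commandstats
cmdstat_get:calls=90000,usec=180000,usec_per_call=2.00
cmdstat_set:calls=9000,usec=27000,usec_per_call=3.00
cmdstat_del:calls=1000,usec=2000,usec_per_call=2.00
cmdstat_ping:calls=50,usec=50,usec_per_call=1.00
"""

reads, writes, ratio = read_write_ratio(parse_commandstats(SAMPLE))
print(reads, writes, ratio)  # 90000 10000 9.0
```

A ratio that stays low after the cache has warmed up would suggest that something is writing to, or invalidating, the cache more often than it should.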

    tl;dr: it’s relatively easy for plugins or themes to abuse the object cache, and cause problems across the stack.

    Also, it’s worth noting Pantheon isn’t running WP LCache in production because we’ve discovered a problem with APCu’s memory allocation algorithm. If your configuration gets to a point where APCu is >95% full and heavily sharded, then APCu can cause your PHP processes to lock up and your web node to stop serving requests. This is a problem specific to APCu, and not LCache, so you could see it manifest with any APCu-based object cache drop-in.

  • Thread Starter lisota

    (@lisota)

    Thanks. I’ll watch for updates regarding APCu and take a look at our cache writes and reads.

    For users of cloud databases like AWS RDS or Azure MySQL, this cache may need some guidance around database IOPS. You have to explicitly provision more IOPS on those platforms, and because this cache uses the database as an “L2” caching mechanism, it has the potential to dramatically increase the database’s IOPS requirements.
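
As a back-of-the-envelope illustration of that sizing concern, the sketch below estimates database write operations from cache traffic. The function name and all of its parameters are hypothetical, the write percentages are invented, and the assumption that one cache write costs a fixed number of I/O operations is a simplification. Only the two web servers and roughly 3500 cache requests/sec per server come from the first post.

```python
def estimate_db_write_ops(servers, cache_requests_per_sec, write_percent, ios_per_write=1):
    """Rough estimate of database I/O operations per second caused by cache writes.

    write_percent is the share of cache requests that end up as a write to the
    database-backed L2 (an assumed figure; measure it on your own site).
    ios_per_write is an assumed cost per write, ignoring batching and caching
    inside the database itself.
    """
    total_requests = servers * cache_requests_per_sec
    writes_per_sec = total_requests * write_percent / 100
    return writes_per_sec * ios_per_write


# Warm cache, few writes (assumed 1% of requests).
warm = estimate_db_write_ops(servers=2, cache_requests_per_sec=3500, write_percent=1)

# Churning cache, e.g. frequent flushes or poorly cached keys (assumed 20%).
churn = estimate_db_write_ops(servers=2, cache_requests_per_sec=3500, write_percent=20)

print(warm, churn)  # 70.0 1400.0
```

The estimate covers writes only. L1 misses that are served from the database add reads on top, so real requirements could be higher.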

    Hey @danielbachhuber, you mentioned “a problem with APCu’s memory allocation algorithm”. Is this something the APCu team is aware of? Is there an issue thread somewhere tracking it? I can’t find one on their GitHub.

    Also, what would you recommend as a better alternative to APCu? From everything I have been reading, now doesn’t seem to be the time to use APCu.

  • Plugin Author Daniel Bachhuber

    (@danielbachhuber)

    you mentioned “a problem with APCu’s memory allocation algorithm”. Is this something the APCu team is aware of? Is there an issue thread somewhere tracking it? I can’t find one on their GitHub.

    I’m not sure whether the APCu team is aware of the issue. We, sadly, haven’t opened a public issue about it, but I’ll bug David Strauss about doing so. He has the most qualified perspective on the issue.

    At a high level, the memory allocation problem manifests itself when APCu is heavily fragmented and close to full (> 99% utilization). It reproduces consistently if you can mock the environment correctly. The symptom of the issue is that PHP processes begin to lock up when APCu can no longer write new cache entries. One process locking causes all of the other processes to lock because APCu becomes inaccessible.
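
To watch for that condition, one could summarize utilization and fragmentation from shared-memory statistics. The sketch below is an assumption-heavy illustration: the input dictionary is modeled loosely on what PHP's `apcu_sma_info()` reports (segment count, segment size, and lists of free blocks), the fragmentation metric and its 5 MiB cutoff are one possible definition, and the 0.5 fragmentation limit is invented. The 0.95 utilization limit echoes the ">95% full" figure from earlier in the thread.

```python
def summarize_shared_memory(info, small_block_bytes=5 * 1024 * 1024):
    """Summarize utilization and fragmentation of a shared-memory cache.

    info is a dict with "num_seg", "seg_size", and "block_lists", where
    block_lists holds one list of free blocks ({"size", "offset"}) per segment.
    """
    total = info["num_seg"] * info["seg_size"]
    free_blocks = [block for segment in info["block_lists"] for block in segment]
    free = sum(block["size"] for block in free_blocks)
    # Assumed metric: share of free memory that sits in small free blocks.
    small_free = sum(b["size"] for b in free_blocks if b["size"] < small_block_bytes)
    return {
        "utilization": (total - free) / total,
        "fragmentation": small_free / free if free else 0.0,
        "free_blocks": len(free_blocks),
        "largest_free_block": max((b["size"] for b in free_blocks), default=0),
    }


def is_at_risk(summary, utilization_limit=0.95, fragmentation_limit=0.5):
    """Flag a cache that is both nearly full and heavily fragmented."""
    return (summary["utilization"] > utilization_limit
            and summary["fragmentation"] > fragmentation_limit)


# Invented example: one 128 MiB segment whose only free space is
# 1000 scattered 4 KiB blocks.
sample = {
    "num_seg": 1,
    "seg_size": 128 * 1024 * 1024,
    "block_lists": [[{"size": 4096, "offset": i * 65536} for i in range(1000)]],
}

summary = summarize_shared_memory(sample)
print(summary["free_blocks"], is_at_risk(summary))  # 1000 True
```

A check like this only reports the state of the cache. It does not avoid the lock-up described above, so it would have to be paired with an action such as clearing the cache or enlarging the segment.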

    Here’s a script you can use to reproduce the issue in a test environment: https://gist.github.com/danielbachhuber/e0b39ae91a6a2b7ab79232df145f8b5a

    Also, what would you recommend as a better alternative to APCu? From everything I have been reading, now doesn’t seem to be the time to use APCu.

    Pantheon started working on a SQLite-based L1 cache, which presented its own set of problems. From there, they started working on a PHP extension for RocksDB. I’m unsure of the current state of that.

    Hi @danielbachhuber,
    Thank you very much. I appreciate your awesome reply, and the gist.

    At the moment I’m looking for a way to do data/object caching for WordPress. I’m not looking for something like Memcached/Redis, but for something more along the lines of APCu (for a single machine).

    Since APCu is having all these issues, I’m not going to try it out.

    On a side note, do you think Object Data caching could be implemented using something like tmpfs?

  • Plugin Author Daniel Bachhuber

    (@danielbachhuber)

    At the moment I’m looking for a way to do data/object caching for WordPress. I’m not looking for something like Memcached/Redis, but for something more along the lines of APCu (for a single machine).

    Since APCu is having all these issues, I’m not going to try it out.

    On a side note, do you think Object Data caching could be implemented using something like tmpfs?

    It could, although I don’t think where the data is stored is the hard part. At the end of the day, everything is stored in memory or on disk. The hard part is everything else associated with the implementation.
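
To make the point concrete, here is a minimal sketch of a file-backed cache that could live on a tmpfs mount. It is not an existing plugin or drop-in: `FileCache` and its methods are hypothetical, and the demo uses a temporary directory rather than a real tmpfs path. Storing the bytes takes a few lines. The comments mark some of the harder parts.

```python
import hashlib
import os
import pickle
import tempfile
import time


class FileCache:
    """Toy key/value cache that keeps one file per key in a directory."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Hash the key so that any string maps to a safe file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest)

    def set(self, key, value, ttl=None):
        expires = time.time() + ttl if ttl is not None else None
        # Write to a temporary file, then rename it into place, so readers
        # never see a half-written entry.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as handle:
            pickle.dump((expires, value), handle)
        os.replace(tmp_path, self._path(key))

    def get(self, key, default=None):
        try:
            with open(self._path(key), "rb") as handle:
                # pickle is only acceptable for data this process wrote itself.
                expires, value = pickle.load(handle)
        except FileNotFoundError:
            return default
        if expires is not None and expires <= time.time():
            # Hard part: another process may have refreshed the entry between
            # our read and this delete. The sketch ignores that race.
            self.delete(key)
            return default
        return value

    def delete(self, key):
        try:
            os.remove(self._path(key))
        except FileNotFoundError:
            pass


cache = FileCache(tempfile.mkdtemp())
cache.set("site_title", {"name": "Example"})
cache.set("stale", "old value", ttl=-1)  # already expired
print(cache.get("site_title"))  # {'name': 'Example'}
print(cache.get("stale"))       # None
```

Everything this sketch leaves out (size limits and eviction, invalidation across processes and servers, cache groups, and behavior under heavy concurrency) is the "everything else" that makes an object cache backend hard.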

  • The topic ‘Heavy IOPS and database use’ is closed to new replies.