When search engines swarm new posts

We saw an interesting problem today. One of our customers’ Web sites uses WordPress with WP Super Cache to (dramatically) improve its performance. Every time the customer posts new content, though, the site is immediately swarmed by search engines, feeds, robots, and other non-humans retrieving the new post. There are a lot of unnecessary duplicate requests, but even excluding the duplicates there are hundreds of requests arriving almost simultaneously.

Unfortunately, WP Super Cache is configured by default not to serve cached results to any request that contains an “equals sign” in the query string, and the plugin that notifies other sites of new content adds query-string parameters (and therefore equals signs) to the links it sends out.

So rather than being served straight from the cache, every one of these requests ran through the WordPress PHP scripts, driving up script usage and causing “503 Service Unavailable” errors on that Web site for up to two minutes. (Other Web sites on the same Web server were not affected; we have protection against that.)

The requests for the new post all looked something like this:

GET /new-post-name/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+Blog+name
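To make the problem concrete, here is a minimal Python sketch of the rule described above. It is an illustration only, not WP Super Cache's actual code and not the fix from our earlier post; the names `bypasses_cache`, `strip_tracking_params`, and `TRACKING_PREFIXES` are hypothetical. It shows that the request above skips the cache purely because of its tracking parameters, and that without them it is a plain request for the post.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumption for illustration: parameters starting with "utm_" only carry
# campaign-tracking data and do not change the page that is returned.
TRACKING_PREFIXES = ("utm_",)


def bypasses_cache(request_target: str) -> bool:
    """Mimic the default rule described above: any '=' in the query
    string means the request is not served from the cache."""
    return "=" in urlsplit(request_target).query


def strip_tracking_params(request_target: str) -> str:
    """Return the request target with tracking parameters removed."""
    parts = urlsplit(request_target)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(TRACKING_PREFIXES)
    ]
    query = urlencode(kept)
    return parts.path + ("?" + query if query else "")


request = ("/new-post-name/?utm_source=feedburner"
           "&utm_medium=feed&utm_campaign=Feed%3A+Blog+name")

print(bypasses_cache(request))                          # True
print(strip_tracking_params(request))                   # /new-post-name/
print(bypasses_cache(strip_tracking_params(request)))   # False
```

Every one of the hundreds of near-simultaneous requests carries the same tracking parameters, so each one falls through to PHP even though they all ask for the same page.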

Fortunately, there’s an extremely easy solution, which we discussed in an earlier blog post. If you’re running WordPress with WP Super Cache, be sure to implement it. The fix also helps in other situations, and can spare your Web site unnecessary CPU usage.