Brief scheduled maintenance on web05 & web07 servers May 22, 2013

Between 10:00 PM and 10:59 PM Pacific time on Wednesday, May 22, 2013, the “web05” and “web07” servers will be restarted. This will cause an eight-minute interruption of service for each server at some point during that hour.


Stability improvements for a server memory problem

A couple of days ago, one of our Web servers became unstable for an unknown reason and needed to be restarted. This is rare: on average, this happens less than once every five years of uptime per server, so we took it very seriously and launched an investigation.

What we found was that the owner of one of the sites on that server made a mistake that allowed attackers to run their own scripts. That’s all too common, unfortunately, but usually only the single site is affected by this kind of thing. What was surprising in this case was that the script used a previously unknown method of causing problems for other sites running on the server.

As a result of this investigation, we’ve made several changes to our systems to ensure the problem won’t recur. The rest of this post has a detailed technical description of the problem in case it’s useful for others.


web07 server restart on February 1, 2012 (resolved)

Our “web07” server needed restarting at 11:36 AM Pacific time on February 1, 2012, because it had been intermittently unable to run some PHP scripts for 22 minutes.

The restart resolved the immediate problem, and a follow-up post explains what happened and the changes we made to prevent it from happening again.