Bender load problem this morning (resolved)

Starting just after 9 AM (Pacific time) today, the “bender” server experienced very high load for about 40 minutes. The load appeared to come from a combination of heavy database, e-mail, and Web server activity: a sort of “perfect storm” of unusual load.

We work very hard to run all of our servers at a reasonable level, with excess capacity to spare. Even though the load was unusual, we don’t consider this type of limitation acceptable. We are reviewing the server’s configuration files to see if we can make changes to avoid this sort of problem in the future.

Apache Web server logging extra “500” errors (fixed)

Our Web hosting customers who use FastCGI have been seeing extra “500 internal server” errors in their logs and statistics since September 12.

The good news is that this is just a logging bug caused by a recent Apache Web server update. Visitors to your site are seeing exactly what they always saw, and there isn’t any problem besides the incorrect logging.


Mom server temporarily unavailable (resolved)

Customers on the “mom” server experienced a seven-minute interruption in Web site and e-mail service between 4:26 and 4:33 AM Pacific time this morning (September 15).

Customers on other servers were not affected.


Brief scheduled maintenance Friday, September 11 (completed)

Between 10:00 PM and 11:59 PM Pacific time this Friday, September 11, all our servers will be restarted. As a result, Web site service and the ability to read incoming e-mail will be unavailable for approximately five minutes at some point during this maintenance “window”.


Network problem earlier today (resolved)

Some of our customers may have noticed “high packet loss” today from about noon to 12:25 PM (Pacific time). This could make it seem like Web sites hosted on our servers were loading slowly, or even timing out.

Our upstream provider has resolved the problem, and we are working with them to make sure it doesn’t recur.

Flexo server temporarily unavailable (resolved)

Customers on the “flexo” server experienced a four-minute interruption in Web site service between 9:48 and 9:52 AM Pacific time this morning (August 12).

E-mail was not affected, and customers on other servers were not affected.

The problem happened when the Apache Web server did not respond to a “graceful reload” command when we installed a “mod_security” update to block certain attacks against the WordPress blog software.

We are looking into the root cause of this incident and will take steps to prevent it from recurring. We don’t consider any kind of service interruption acceptable, and we sincerely apologize for the problem.

Denial of service attack update

As we mentioned in an earlier post, someone attacked our network earlier this morning. Although we blocked the attack, we’ve also been working to identify who attacked our network and why. We now know the answer, and we are almost positive that the problem won’t recur.


Denial of service attack (resolved)

At 2:16 AM Pacific time this morning, we began experiencing a “distributed denial of service” attack aimed at our “flexo” Web server.

The attack used more than 2 Gbps of network bandwidth from several thousand different IP addresses. This is an extremely high volume of traffic, enough to saturate even our network connections.

The problem caused most of our servers to become unreachable (or very slow) from the Internet.

We restored service to all servers except the flexo Web server at 2:59 AM (by getting our network providers to block all packets for certain IP addresses). We restored service to the flexo server at 3:29 AM (by getting them to identify and block specific characteristics of the attack).

All services are now operating normally, and all delayed incoming mail has been delivered.

We take reliability seriously. Unfortunately, this is by far the largest attack we’ve seen on our network in ten years. We sincerely regret and apologize for the impact this had on our customers.

Brief scheduled maintenance Saturday, May 2 (completed)

At approximately 11:00 PM Pacific time this Saturday, May 2, the “bender”, “calculon”, “lrrr” and “hypnotoad” servers will be restarted. As a result, Web site and e-mail service for customers on those servers will be unavailable for approximately five minutes.


Farnsworth hosting server restarted

The “farnsworth” server was restarted at 11:45 PM Pacific time tonight, causing a brief two-minute interruption in Web and e-mail service for customers on that server. Incoming mail was queued and delivered after the interruption.
