Protection against viruses that steal FTP passwords

Recently, several customers have told us that pages on their Web sites have been modified without their knowledge. Upon investigation, the customers found their computers had been infected with a virus that steals saved FTP passwords, such as the “Gumblar” or Trojan.PWS.Tupai.A virus.

We’ve taken a step to protect you against this problem (described below), but it’s wise to protect yourself, too.


Problem affecting two servers (resolved)

We posted earlier about a problem affecting the “elzar” Web server. While we were investigating its cause, the same thing happened on another Web server, “calculon”, causing a separate outage for customers on that server from 2:34 PM to 2:43 PM Pacific time this afternoon.

During this period, Web sites on that server were unavailable and incoming e-mail was delayed. (The Web server was slow for about six minutes after it was restarted, too.)

On both servers, high disk and memory usage caused the load to skyrocket to the point where the servers effectively stopped responding.

The good news is that we have narrowed down the cause, so it shouldn’t happen again. A bug in one of our maintenance programs that runs on each server was almost certainly responsible. The bug has been fixed.

We sincerely apologize for this issue, and regret the inconvenience it caused for customers hosted on these servers. Other servers were not affected.

Elzar server temporarily unavailable (resolved)

The “elzar” Web server experienced high load between 5:40 and 6:00 AM Pacific time this morning, April 15. This resulted in slow Web sites and some interruption of service. (Some e-mail activity was delayed, but no e-mail was lost.)

We sincerely apologize for this problem. We consider this type of failure to be unacceptable, and are looking into the cause of the problem so that we can take the appropriate steps to prevent it from happening again.

Zapp server added to brief scheduled maintenance (completed)

As we’ve already posted, some of our Web servers will be restarted tonight at 11 PM Pacific time.

We’re adding the “zapp” Web server to that list so we can replace a RAID array disk that caused a problem on that server earlier today.

Update: The maintenance was completed with less than five minutes of “downtime”.

Zapp server temporarily unavailable (resolved)

The “zapp” Web server was unavailable between 3:43 and 3:53 AM Pacific time this morning, April 4. This resulted in an interruption of service for Web sites on that server. (Some e-mail activity was delayed, but no e-mail was lost.)

We sincerely apologize for this problem. We consider this type of failure to be unacceptable, and are looking into the cause of the problem so that we can take the appropriate steps to prevent it from happening again.

Brief scheduled maintenance Friday, April 3 (completed)

At approximately 11:00 PM Pacific time on Friday, April 3, the “flexo”, “mom” and “elzar” servers will be restarted. As a result, Web site and e-mail service for some customers will be unavailable for approximately five minutes.

No e-mail will be lost, of course; incoming mail will just be slightly delayed.

We apologize for any inconvenience this may cause. This maintenance is necessary to install an updated “kernel” on our servers, as described in an earlier post.

Update: We’re also going to include the “zapp” server in this maintenance to replace a disk in the RAID array.

Update 2: The maintenance was completed with less than five minutes of “downtime”.

Now offering $25 Google AdWords credit (expired 2009-12-31)

We are pleased to announce that all of our customers are now eligible for a $25 credit to start advertising with Google AdWords™.

AdWords ads run alongside or above Google™ search results, so you can reach new customers right at the moment when they are searching for keywords related to the products and services you offer.

This offer expired December 31, 2009.

Avoiding a Linux kernel 2.6.26 cgroup bug

We recently had a server that twice “crashed” and needed to be restarted manually. We’ve identified the cause of that problem (an apparent bug in Linux kernel version 2.6.26) and made some changes to ensure that it doesn’t affect our customers again.

However, we didn’t find any information about this problem when searching the Internet, so we’re describing the details here in the hope that it helps someone else.


Flexo server temporarily unavailable (resolved)

The “flexo” Web server was unavailable between 9:54 and 10:02 PM Pacific time tonight, March 28. This resulted in an interruption of service for Web sites on that server. (Some e-mail activity was delayed, but no e-mail was lost.)

We sincerely apologize for this problem. We consider this type of failure to be unacceptable, and are looking into the cause of the problem so that we can take the appropriate steps to prevent it from happening again.

Update: The problem happened a second time on March 31 from 6:22 to 6:31 AM. However, the second incident gave our engineers enough details to determine the cause (which we’ve reported in a subsequent blog post), and we have made a technical change that will prevent it from happening again.

favicon.ico files and WordPress

We host some pretty high-volume WordPress sites, and one of the questions that occasionally comes up is “How can I make WordPress faster?” That’s really just another way of saying “What part of my WordPress site is slow?”, which translates to “What requests are using a lot of CPU time?”

This question is surprisingly difficult to answer, particularly because we encourage customers who run busy WordPress sites to use FastCGI and caching. A single FastCGI process can handle lots of different PHP requests, so it’s hard to break down which individual request used what amount of server resources.

To solve this problem, we recently patched our version of PHP to optionally log the CPU time used by each request, even under FastCGI, so we could see what was really happening (patch available here).
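To illustrate the general idea (this is not our PHP patch, just a minimal Python sketch of the same technique), a long-running worker can read its own CPU clock before and after each request and log the difference. The names `timed_request`, `serve` and `demo_handler` are hypothetical, and the sketch assumes the worker handles one request at a time, as a single FastCGI process does.

```python
import time
from typing import Callable, Iterable, List, Tuple


def timed_request(handler: Callable[[str], str], path: str) -> Tuple[str, float]:
    """Run handler(path) and return (response, CPU seconds used by this process)."""
    start = time.process_time()      # CPU time (user + system) of this process so far
    response = handler(path)
    cpu_seconds = time.process_time() - start
    return response, cpu_seconds


def serve(paths: Iterable[str], handler: Callable[[str], str]) -> List[Tuple[str, float]]:
    """Handle requests one after another in the same process, logging CPU time per request."""
    log = []
    for path in paths:
        _, cpu_seconds = timed_request(handler, path)
        log.append((path, cpu_seconds))
    return log


def demo_handler(path: str) -> str:
    """Hypothetical stand-in for a real request handler; it just burns a little CPU."""
    total = sum(i * i for i in range(50_000))
    return f"{path}:{total}"


if __name__ == "__main__":
    for path, cpu_seconds in serve(["/", "/favicon.ico"], demo_handler):
        print(f"{path} used {cpu_seconds:.6f} CPU seconds")
```

Because the measurement is taken inside the worker, it still attributes CPU time to individual requests even though one process serves many of them.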

What we found was unexpected. On some busy WordPress sites, 20–30% of the CPU time was being used to handle requests for “favicon.ico”. What the deuce?!
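A percentage like that comes from adding up the logged CPU time per requested path and dividing by the total. Here is a rough Python sketch of that arithmetic; the function name `cpu_share_by_path` and the `(path, cpu_seconds)` entry format are assumptions for illustration, and the sample numbers are made up rather than taken from our logs.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def cpu_share_by_path(entries: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Return each path's fraction of the total CPU time across all log entries."""
    totals: Dict[str, float] = defaultdict(float)
    for path, cpu_seconds in entries:
        totals[path] += cpu_seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}                    # nothing measurable was logged
    return {path: used / grand_total for path, used in totals.items()}


if __name__ == "__main__":
    # Made-up sample entries, for illustration only.
    sample = [
        ("/", 0.60),
        ("/favicon.ico", 0.25),
        ("/favicon.ico", 0.05),
        ("/feed", 0.10),
    ]
    shares = cpu_share_by_path(sample)
    for path, share in sorted(shares.items(), key=lambda item: -item[1]):
        print(f"{path}: {share:.0%} of CPU time")
```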
