Network problem September 29 (resolved)

In an apparent continuation of last night’s incident, many sites we host were intermittently unavailable between 12:01 PM and 1:20 PM Pacific time today (September 29, 2011). This also caused slow mail delivery and reduced spam filtering effectiveness until around 2:00 PM (no mail was lost, of course).

All systems are operating normally as of 2:15 PM.


Network problem September 28 (resolved)

A problem at our old data center (the one we’re moving sites from this month) caused some sites to be intermittently unavailable between 10:22 and 10:47 PM Pacific time on September 28, 2011.


High load on the “elzar” server (resolved)

The “elzar” Web hosting server experienced very high load between 9:07 and 9:14 AM Pacific time this morning (September 27, 2011), causing sites on that server to load slowly during those seven minutes. Other servers were not affected.

This was caused by a distributed denial of service (“DDoS”) attack against a site on that server. We manually blocked the attackers to resolve it, and we’re continuing to monitor the server closely to make sure the attack doesn’t recur.

2011 server upgrades

Over the next four weeks, we’ll be migrating customer Web sites to upgraded servers. The servers have updated software (and upgraded hardware in some cases), and are also located in a data center with increased power reliability.

For most customers, these changes will be completely unnoticeable. However, a very small number of customers might notice software differences or experience up to five minutes total of “downtime” at some point. We recommend reading through this entire post for details.

Read the rest of this entry »

September 5, 2011 Labor Day holiday hours

Our business offices will be closed on Monday, September 5 to observe the US Labor Day legal holiday. As always, we’ll provide same-day support for time-sensitive issues via our ticket and e-mail systems. However, questions that aren’t time-sensitive (including most billing matters) may not be answered until Tuesday, and telephone support (via callbacks) will be available only for urgent issues.

The perils of quick tweeting

We’re making a determined effort to post Twitter status updates very quickly if our monitoring systems detect any kind of problem.

Earlier, we tweeted “We’re investigating a possible outage on the mail.tigertech.net server” because one of the multiple external monitoring systems we use alerted us that it was unable to connect to our mail server cluster.

Upon investigation, it turned out to be a false alarm. There was a problem with the monitoring system; there was nothing wrong with the mail servers at all. Unfortunately, there’s probably no way to prevent occasional false alarms like this; we’d rather get the information out quickly, and by definition that means posting preliminary information before we’ve had a chance to fully investigate what’s happening.

Behind-the-scenes POP and IMAP mail upgrades

Over the next month or so, we’ll be upgrading the POP and IMAP software we use for e-mail mailboxes. We don’t expect customers to notice any change (except possibly improved speed) or experience any service interruption at all; we’re mentioning it just for completeness.


Some WordPress themes (and other software) vulnerable to “TimThumb” bug

A popular piece of software called “TimThumb” (aka “timthumb.php”) was recently found to have a security bug that allows “hackers” to take over Web sites that use it (more info here).

Some popular custom WordPress themes include TimThumb as part of their features, making those themes vulnerable to this problem. (Just so it’s clear, TimThumb isn’t specific to WordPress, but that’s probably where it’s most commonly used.)
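
If it isn’t obvious whether a site’s files include a copy of TimThumb, one way to check is to search the site’s directory tree for the file by name. The following is only an illustrative sketch in Python, not a tool we provide: the function name `find_timthumb_copies` is hypothetical, and it assumes the file still has its original “timthumb.php” name (a renamed copy would not be found).

```python
import os


def find_timthumb_copies(root):
    """Return a sorted list of paths under `root` whose file name is
    "timthumb.php" (compared case-insensitively).

    Assumption: the script has not been renamed. This only locates
    copies by file name; it does not check which version they are.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower() == "timthumb.php":
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling `find_timthumb_copies` with the top-level directory of a site returns the paths of any matching files; an empty list means no file with that name was found, which by itself does not prove the site is unaffected.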

If you use WordPress and your Dashboard tells you to update your theme, you should do so right away (in fact, you should always update an outdated theme or plugin right away).

However, we’ve also added security rules to our servers to protect our Web hosting customers who haven’t yet upgraded. Other people may find the rules useful if they use mod_security on Apache Web servers. The rest of this post contains more technical details.

Read the rest of this entry »

Outage at primary data center (resolved)

Between 6:00 AM and 6:29 AM Pacific time on August 7, 2011, all services were unavailable due to a power failure at our primary data center.

The problem was resolved for most servers by 6:29 AM, and for all servers except the “amy” server by 6:53 AM. The “amy” server needed extra manual intervention, and was working by 7:55 AM. All services are now operating normally.

Any e-mail that arrived during the outage was queued at our secondary data center and delivered as soon as the outage ended.

We sincerely apologize for this problem. We know you count on us for reliability, and we don’t consider this acceptable, especially since the data center has had previous power problems this year. However, this incident had a different root cause. It wasn’t a utility power failure that the redundant UPS systems failed to handle. Instead, a circuit breaker incorrectly “tripped”, which prevented the power output of the UPS systems from reaching the server cabinets.

Update 4:15 PM: We have received an incident report from the data center indicating that they are working to replace the affected part of the UPS system to prevent further problems.

Brief scheduled maintenance on pazuzu server (completed)

At approximately 11:00 PM Pacific time on July 26, 2011, the “pazuzu” Web server will be restarted.

As a result, for customers on the “pazuzu” server (only), Web site service and the ability to read incoming e-mail will be unavailable for approximately five minutes. Customers on other servers will not be affected.
