status | Tiger Technologies Blog

Service outage May 6, 2011 (resolved)

Posted May 6th, 2011 in System Status. Tags: all servers, status.

May 6, 4:43 AM Pacific time: An outage at our primary data center caused a complete service interruption for all customers.

Update 5:08 AM: All services have been restored and are working normally.

Brief scheduled maintenance on “fry” and “bender” servers (completed)

Posted April 29th, 2011 in System Status. Tags: maintenance, status.

The “fry” and “bender” Web servers will be restarted between 11:00 and 11:15 PM Pacific time tonight (Friday, April 29, 2011). This will cause a five-minute interruption of Web and e-mail service for customers on those servers.

Other servers will not be affected, and incoming mail will only be delayed, not lost.

Problem with “fry” server (resolved)

Posted April 26th, 2011 in System Status. Tags: status.

8:52 PM Pacific time: We’re investigating a problem with the “fry” hosting server that’s requiring us to restart it; further details in a few minutes.

Update 9:42 PM Pacific time: The “fry” server was restarted, but a technician will be doing some maintenance on the server for approximately an hour. This will require a reboot, meaning the server will be unavailable for approximately 5 – 10 minutes. Web service will be unavailable during that time. E-mail service on that server will also be unavailable: delivery of new incoming mail will be suspended during that time and will resume when the server comes back. No e-mail will be lost.

All other servers are unaffected.

Update 10:50 PM Pacific time: The “fry” web server will be rebooted in about 10 minutes, at approximately 11:00 PM Pacific time.

Update 11:10 PM Pacific time: The “fry” web server was successfully rebooted as planned. There may be more maintenance on the server this weekend; watch our blog or follow us on Twitter for updates.

Network issues April 10, 2011

Posted April 10th, 2011 in System Status. Tags: all servers, status.

Our primary data center experienced network routing problems between 2:06 PM and 2:49 PM Pacific time today (April 10, 2011).

During this time, packets from some (but not all) places on the Internet were delivered unreliably, causing connection problems. The data center technicians have resolved the issue, and all services are now working normally.

We don’t consider this normal or acceptable, and we sincerely apologize for the inconvenience this caused. (We do not yet have a full explanation from the data center about the root cause, but have requested one so that we can be sure it won’t recur.)

Brief MySQL load problems (resolved)

Posted April 7th, 2011 in System Status. Tags: mysql, status.

We had a couple of instances of MySQL queries overloading the bender server today. The first one happened at about 3:41 AM (Pacific time) and the second one happened at about 7:48 AM. Each occurrence lasted about 20 minutes. The problem each time was that a database was running extremely inefficient queries. Each time we fixed the problem by creating indexes so that the queries could then run in a fraction of the time previously required.

We apologize for any inconvenience caused by this problem. Visitors to your Web site (on the bender server) might have seen reduced performance (or, in rare cases, 503 errors). E-mail was not affected. We don’t consider this type of problem to be acceptable. These problems should not recur since the indexes have been created.
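For readers curious why adding an index helps so much, here is a minimal sketch of the general idea. It is not the actual customer database or query: the `orders` table, the `customer_id` column, and the index name are hypothetical, and it uses Python's built-in SQLite module as a stand-in for MySQL so that it runs anywhere. In MySQL, the comparable tools would be `EXPLAIN SELECT ...` and `CREATE INDEX`.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


# Hypothetical table standing in for a customer's database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10000)],
)

sql = "SELECT total FROM orders WHERE customer_id = ?"

# Without an index, the database must examine every row in the table.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filtered column, it can jump straight to matching rows.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

# The query returns the same rows either way; only the work required changes.
matching_rows = conn.execute(sql, (42,)).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
```

The plan printed before the index should describe a full table scan, and the plan after should mention the new index. On a large table under heavy traffic, that difference is what separates a query that overloads a server from one that finishes almost instantly.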

Brief scheduled maintenance on elzar server (completed)

Posted February 25th, 2011 in System Status. Tags: elzar, maintenance, status.

The “elzar” Web server will be restarted at 10 PM Pacific time tonight (February 25). This will cause a five-minute interruption of Web and e-mail service for customers on that server.

Other servers will not be affected, and incoming mail will only be delayed, not lost.

This restart is necessary to fix a memory problem. We apologize for the inconvenience.

Update 10:03 PM: The maintenance was completed with less than 3 minutes of downtime.

Network issues January 3, 2011 (resolved, updated)

Posted January 3rd, 2011 in System Status. Tags: network, status.

Between 3:29 PM Pacific time and 3:33 PM Pacific time, our monitoring systems detected that most Internet users could not connect to our primary data center. E-mail delivery was properly queued up and delayed during this period.

We will follow up with the data center team, but the problem appears to have been resolved, and all services are operating normally. We’re continuing to monitor it closely, and we sincerely apologize for the inconvenience this caused our customers.

Updated: connectivity was lost for four minutes because the data center was fighting off a severe DoS attack.

AOL e-mail outage December 21 (resolved)

Posted December 21st, 2010 in System Status, Tales From the Support Team. Tags: AOL, email, status.

AOL.com had an outage lasting about 3 hours last night (from 11:24 PM Pacific time December 20 to 2:28 AM Pacific time December 21). This problem — a failure of AOL’s DNS servers — affected many people sending e-mail to AOL, and wasn’t related to our service (see this report and this one).

However, if you sent mail to an aol.com address during this time, your messages probably “bounced” with an error saying “Host or domain name not found. Name service error for name=aol.com”. If so, you should try sending the message again, and it will work normally. As always, we’ll continue to monitor AOL deliveries closely.

Network issues December 12, 2010 (resolved)

Posted December 10th, 2010 in System Status. Tags: status.

Between 2:35 PM Pacific time and 3:03 PM Pacific time, our monitoring systems detected that connections to our primary data center from some locations on the Internet were slow or failing due to problems at an Internet “backbone”. Connections from other locations were unaffected.

We’re waiting for a full report from the data center team, but the problem appears to have been resolved, and all services are operating normally. We’re continuing to monitor it closely, and we sincerely apologize for the inconvenience this caused our customers.

Service outage Nov. 23, 2010 (resolved, updated)

Posted November 23rd, 2010 in System Status. Tags: all servers, status.

Our primary data center had another power interruption this morning at 7:28 AM (Pacific time). All of our servers lost power and then had it restored, which rebooted them. All customer Web sites were unavailable during this time. Incoming e-mail would have simply been delayed during the downtime, not lost. When the servers came back online, e-mail may have seemed sluggish to some customers for a while, but this should also be fixed now.

This incident follows another power incident the previous Saturday night. We are working with the data center to get more details, including an estimate of when they will have replaced any faulty equipment. We will update this post as more information becomes available.

Update Nov. 29: The final data center report is that on the night of November 20, lightning strikes damaged both of the redundant UPS systems, interrupting data center power for a few seconds. The UPS manufacturer scheduled replacements for November 23, but another PG&E utility power interruption lasting a few seconds occurred that morning before the replacement work was finished. The UPS manufacturer has since replaced all damaged parts, restoring full redundancy. In addition, the UPS manufacturer has overhauled each unit, replacing and upgrading other parts to increase robustness. We take this very seriously (it’s at the core of what we do), and we will continue to work with the data center to ensure that their infrastructure meets our high standards.