As a result, Web site service and the ability to read incoming e-mail for some customers will be unavailable for approximately five minutes at some point during this maintenance “window”.
At approximately 11:00 PM Pacific time this Saturday, May 2, the “bender”, “calculon”, “lrrr” and “hypnotoad” servers will be restarted. As a result, Web site and e-mail service for customers on those servers will be unavailable for approximately five minutes.
At approximately 11:00 PM Pacific time on Saturday, January 31, all of our Web hosting servers (except the “hypnotoad” and “mom” servers) will be restarted. As a result, Web site and e-mail service for some customers will be unavailable for approximately five minutes.
No e-mail will be lost, of course; incoming mail will just be delayed for a few minutes.
We apologize for any inconvenience this may cause. This maintenance is necessary to install an updated “kernel” on our servers, as described in an earlier maintenance announcement.
Update: the maintenance was successfully completed on all servers with less than 5 minutes of “downtime”.
This morning at 12:11 AM (Pacific time), one of the cabinets at our data center tripped a circuit breaker, causing all of the servers in that cabinet to lose power. Power was restored at 12:18 AM.
Customer Web sites and e-mail on the bender, calculon, lrrr, and zapp Web servers were unavailable during this seven-minute period. The ability to send and receive e-mail was also interrupted (no mail was lost, of course).
We are investigating the root cause of this problem to prevent it from happening again.
This afternoon at 3:49 PM (Pacific time), one of the cabinets at our data center tripped a circuit breaker, causing all of the servers in that cabinet to lose power. Power was restored nine minutes later.
Customer Web sites on the calculon, lrrr, and zapp Web servers were unavailable during this time. The ability to send and receive e-mail was also interrupted (no mail was lost, of course). Other servers were not affected.
We pay close attention to the power load in each cabinet to avoid this sort of problem. The previously measured peak load of that cabinet had been 12 amps. Since the circuit allows 15 amps, this issue surprised us (we’ve been using the same setup in the same data center for seven years and this has never happened before). It appears that several servers experiencing unusually high CPU loads at the same time pushed power usage beyond what we previously considered possible.
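The headroom arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration, not a description of the monitoring actually used: the function name `circuit_headroom` and the `safe_fraction` threshold of 0.8 (a common rule of thumb for sustained load on a breaker) are assumptions, while the 15 amp rating and the 12 amp measured peak are the figures quoted above.

```python
def circuit_headroom(rating_amps, peak_amps, safe_fraction=0.8):
    """Summarize how close a measured peak load is to a circuit's rating.

    safe_fraction is an assumed planning threshold, not a figure from the post.
    Returns (headroom in amps, utilization as a fraction, within threshold?).
    """
    if rating_amps <= 0:
        raise ValueError("rating_amps must be positive")
    headroom = rating_amps - peak_amps
    utilization = peak_amps / rating_amps
    within_threshold = peak_amps <= rating_amps * safe_fraction
    return headroom, utilization, within_threshold


# Figures from the post: a 15 amp circuit with a previously measured 12 amp peak.
headroom, utilization, _ = circuit_headroom(15, 12)
print(f"Headroom: {headroom} A, utilization: {utilization:.0%}")
```

With those figures the cabinet had 3 amps of headroom, so any spike of more than 3 amps above the previously measured peak would exceed the circuit's rating.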
We will take immediate steps to make sure the problem doesn’t happen again, and we sincerely apologize to customers who were affected by this incident.
Update 7:26 PM: We have removed a server from the cabinet in question, lowering the power use.
Update 10:38 PM: We have removed a second server from the cabinet, ensuring that power use is well below any level that could cause further trouble. The problem will not recur.
The “farnsworth” and “lrrr” Web hosting servers will be restarted at approximately 11:00 PM Pacific time on Saturday, October 13, and customer Web sites on those two servers will be unavailable for approximately five minutes. (See “Which server is my account on?” if you aren’t sure.) E-mail service, and customers on all other Web servers, will not be affected at all.
The restart is necessary so we can increase the memory (RAM) on those two servers to 4 GB, as we described here. After this, all our hosting servers will have 4 GB of memory.
Compared with many other reasonably priced hosting companies, we keep the load on our servers pretty low to start with, so 4 GB is “overkill” probably 99.99% of the time, but we want to cover the other 0.01%. (Our unofficial motto should probably be something like “Tiger Technologies: We’re paranoid so you don’t have to be.”)
We apologize for any inconvenience this may cause.
Update: Note that the maintenance time has been changed from 10:00 PM to 11:00 PM.
Due to a failure of the power distribution unit (essentially a fancy power strip) in one of the cabinets at our data center, the following services became unavailable at 05:52 AM Pacific time:
(Other Web servers are not affected.) A data center technician is replacing the power unit in that cabinet and all systems should be back online within 15 minutes; we’ll update this post when that happens.
Update: The faulty hardware has been completely replaced. All servers are back online and functioning normally, and all queued e-mail has been delivered and is available for retrieval. The total outage for these servers was from 05:52 AM to 06:15 AM (Pacific time).
In addition, the FTP service on the “zapp” server was not fully working after it was restarted, so FTP publishing on that server was unavailable until shortly after 7:00 AM. This has been corrected (and the underlying problem that could cause incorrect startup was fixed).
We sincerely apologize to customers affected by this outage. This kind of issue has happened to us only once before in the last seven years (and that was with a different brand of power unit). Since the replacement power unit is brand new, we don’t expect the problem to recur.