<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tiger Technologies Blog &#187; System Status</title>
	<atom:link href="http://blog.tigertech.net/category/system-status/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.tigertech.net</link>
	<description>Behind the scenes at tigertech.net</description>
	<lastBuildDate>Tue, 31 Aug 2010 22:24:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>High load on some servers (resolved)</title>
		<link>http://blog.tigertech.net/posts/issue-2010-08-25-resolved/</link>
		<comments>http://blog.tigertech.net/posts/issue-2010-08-25-resolved/#comments</comments>
		<pubDate>Thu, 26 Aug 2010 03:41:44 +0000</pubDate>
		<dc:creator>Robert Mathews</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[Tech Corner]]></category>
		<category><![CDATA[amy]]></category>
		<category><![CDATA[flexo]]></category>
		<category><![CDATA[leela]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1374</guid>
		<description><![CDATA[Three of our Web hosting servers (amy, flexo, and leela) experienced high load earlier today that caused some customers to see &#8220;503 errors&#8221; on their Web sites for a few minutes. This was caused by an upgrade to the eAccelerator PHP caching system that removed all the cached files at once, which doesn&#8217;t normally happen. [...]]]></description>
			<content:encoded><![CDATA[<p>Three of our Web hosting servers (<a href="/posts/which-server/">amy</a>, <a href="/posts/which-server/">flexo</a>, and <a href="/posts/which-server/">leela</a>) experienced high load earlier today that caused some customers to see &#8220;503 errors&#8221; on their Web sites for a few minutes.</p>
<p>This was caused by an upgrade to the eAccelerator PHP caching system that removed all the cached files at once, which doesn&#8217;t normally happen.</p>
<p>The problem has been permanently resolved and will not recur.</p>
<p><span id="more-1374"></span></p>
<p>The technical explanation is that the sudden burst of disk writes from newly created eAccelerator cache files made the Linux kernel decide that disk &#8220;buffer&#8221; memory was so full that all disk writes needed to happen &#8220;synchronously&#8221;.</p>
<p>That caused the MySQL database to start writing temporary &#8220;filesort&#8221; data to the actual RAID array on the server, instead of just storing those files in memory (as Linux usually does for files that exist for less than a few seconds before being deleted). Some of our servers handle hundreds of MySQL queries a second, and the extra disk writing load overwhelmed the &#8220;/tmp&#8221; filesystem, slowing down MySQL dramatically.</p>
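<p>As a rough sketch of the mechanism described above (not the kernel&#8217;s actual code), assume the limit works like the Linux &#8220;vm.dirty_ratio&#8221; setting: a percentage of the memory the kernel may use to buffer writes. Once unwritten data reaches that limit, processes that write have to wait for the disk. The pool sizes, the ratio, and the function names below are hypothetical, chosen only for illustration.</p>
<pre>
```python
# Hypothetical figures for illustration only; not measurements from any server.
MIB = 1024 * 1024


def dirty_limit_bytes(dirtyable_bytes: int, dirty_ratio_percent: int) -> int:
    """Bytes of unwritten ("dirty") data allowed before writers must wait."""
    return dirtyable_bytes * dirty_ratio_percent // 100


def writes_become_synchronous(dirty_bytes: int, limit_bytes: int) -> bool:
    """True once the backlog of unwritten data has reached the limit."""
    return dirty_bytes >= limit_bytes


if __name__ == "__main__":
    ratio = 20                # assumed vm.dirty_ratio value
    burst = 300 * MIB         # assumed size of a sudden burst of cache-file writes
    pools = (
        ("small buffer pool", 896 * MIB),   # assumed
        ("large buffer pool", 8192 * MIB),  # assumed
    )
    for name, pool in pools:
        limit = dirty_limit_bytes(pool, ratio)
        blocked = writes_become_synchronous(burst, limit)
        print(name, "limit:", limit // MIB, "MiB; burst forces waiting:", blocked)
```
</pre>
<p>Under these assumed numbers, the same burst crosses the limit for the small pool but not for the large one, which is why a larger buffer pool makes this kind of stall less likely.</p>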
<p>We&#8217;ve made three changes to prevent this from happening again:</p>
<ul>
<li>We&#8217;ve modified our Debian eAccelerator package to not remove all the cached files at once during a future upgrade.</li>
<li>We&#8217;ve changed where MySQL stores temporary files. It now uses &#8220;/dev/shm&#8221; shared memory instead of &#8220;/tmp&#8221;. (Ironically, &#8220;/tmp&#8221; used to be shared memory on our servers, but we had to change it to a real disk-based filesystem because it would fill up with large amounts of data if the server wasn&#8217;t restarted for months. That past experience gives us some assurance that this MySQL change won&#8217;t cause problems, though &#8212; and in fact, we&#8217;ve been testing this change on a small number of servers for some time anyway as a general performance improvement.)</li>
<li>On our servers that support it, we&#8217;re now using &#8220;AMD64/Intel 64&#8221; kernels that allow much larger disk memory buffers before the kernel switches to synchronous disk writes, avoiding the problem a different way. Some servers are already using the improved kernel (sadly, not these three servers), and all of our 64-bit-capable servers will be using it after the scheduled maintenance this coming Saturday.</li>
</ul>
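<p>To picture the second change in the list above, here is a minimal sketch that assumes the relevant MySQL setting is the &#8220;tmpdir&#8221; option in the &#8220;[mysqld]&#8221; section of a &#8220;my.cnf&#8221; file. The sample file contents, the function name, and the &#8220;/tmp&#8221; fallback are assumptions for illustration; the real configuration is not shown in this post.</p>
<pre>
```python
import configparser

# Hypothetical my.cnf contents; the real configuration file is not shown here.
SAMPLE_MY_CNF = """
[mysqld]
tmpdir = /dev/shm
"""


def mysql_tmpdir(cnf_text: str, default: str = "/tmp") -> str:
    """Return the tmpdir set in the [mysqld] section, or the default if unset."""
    # allow_no_value accepts bare options such as "skip-networking".
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(cnf_text)
    return parser.get("mysqld", "tmpdir", fallback=default)


if __name__ == "__main__":
    print("MySQL temporary files go to:", mysql_tmpdir(SAMPLE_MY_CNF))
```
</pre>
<p>A real &#8220;my.cnf&#8221; can contain syntax that Python&#8217;s &#8220;configparser&#8221; does not understand, so treat this as a quick check rather than a full parser.</p>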
<p>We sincerely apologize for this incident. Don&#8217;t hesitate to let us know if you have any questions.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/issue-2010-08-25-resolved/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brief scheduled maintenance Saturday, August 28 (completed)</title>
		<link>http://blog.tigertech.net/posts/maintenance-2010-08-25/</link>
		<comments>http://blog.tigertech.net/posts/maintenance-2010-08-25/#comments</comments>
		<pubDate>Wed, 25 Aug 2010 20:02:50 +0000</pubDate>
		<dc:creator>Robert Mathews</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[all servers]]></category>
		<category><![CDATA[maintenance]]></category>
		<category><![CDATA[server updates]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1371</guid>
		<description><![CDATA[Between 10:00 PM and 11:59 PM Pacific time this Saturday, August 28, all our hosting servers will be restarted. As a result, Web site service and the ability to read incoming e-mail will be unavailable for approximately five minutes at some point during this maintenance “window”. No e-mail will be lost, of course; incoming mail [...]]]></description>
			<content:encoded><![CDATA[<p>Between 10:00 PM and 11:59 PM Pacific time this Saturday, August 28, all our hosting servers will be restarted. As a result, Web site service and the ability to read incoming e-mail will be unavailable for approximately five minutes at some point during this maintenance “window”.</p>
<p><span id="more-1371"></span></p>
<p>No e-mail will be lost, of course; incoming mail on those servers will just be slightly delayed.</p>
<p>We apologize for the inconvenience this causes. This maintenance is necessary to install an updated “kernel” on all of our servers for security reasons.</p>
<p><em>Update: The maintenance was completed by 11:10 PM with no more than three minutes of &#8220;downtime&#8221; per server.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/maintenance-2010-08-25/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Comcast network problems August 12 (resolved)</title>
		<link>http://blog.tigertech.net/posts/routing-comcast-2010-08-12/</link>
		<comments>http://blog.tigertech.net/posts/routing-comcast-2010-08-12/#comments</comments>
		<pubDate>Fri, 13 Aug 2010 05:47:31 +0000</pubDate>
		<dc:creator>Robert Mathews</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1313</guid>
		<description><![CDATA[Our monitoring systems are showing that some people who reach our servers via an “Internet backbone” company called Global Crossing, including some Comcast cable customers, have been intermittently unable to connect over the last hour or so. This isn&#8217;t an outage on our end; these visitors are also unable to reach other sites that Comcast [...]]]></description>
			<content:encoded><![CDATA[<p>Our monitoring systems are showing that some people who reach our servers via an “Internet backbone” company called Global Crossing, including some Comcast cable customers, have been intermittently unable to connect over the last hour or so.</p>
<p>This isn&#8217;t an outage on our end; these visitors are also unable to reach other sites that Comcast routes through Global Crossing and that have nothing to do with us, such as <a href="http://www.globalcrossing.com/">www.globalcrossing.com</a>. It&#8217;s something Comcast and Global Crossing need to address.</p>
<p>We&#8217;ll continue to monitor this issue closely and post an update when we&#8217;re confident that it&#8217;s been resolved.</p>
<p>By the way, if you ever find that you&#8217;re unable to connect to our servers (or anyone else&#8217;s), a very useful site is <a href="http://checksite.us/">CheckSite.us</a>. It shows you whether the destination servers are down, or whether the problem is just a local routing problem that isn&#8217;t affecting most other people.</p>
<p><em>Update 9 AM PDT August 13: According to our monitoring systems, Comcast resolved this shortly after our post, and the problem has not recurred in the ten hours since then.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/routing-comcast-2010-08-12/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brief scheduled maintenance Monday, August 2 on some servers (completed)</title>
		<link>http://blog.tigertech.net/posts/2010-08-02-maintenance/</link>
		<comments>http://blog.tigertech.net/posts/2010-08-02-maintenance/#comments</comments>
		<pubDate>Mon, 02 Aug 2010 22:45:17 +0000</pubDate>
		<dc:creator>Robert Mathews</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[bender]]></category>
		<category><![CDATA[elzar]]></category>
		<category><![CDATA[farnsworth]]></category>
		<category><![CDATA[lrrr]]></category>
		<category><![CDATA[maintenance]]></category>
		<category><![CDATA[mom]]></category>
		<category><![CDATA[seymour]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1273</guid>
		<description><![CDATA[Between 11:00 PM and 11:59 PM Pacific time tonight (Monday August 2), several of our hosting servers will be restarted: bender, elzar, farnsworth, lrrr, mom, and seymour. As a result, Web site service and the ability to read incoming e-mail for some customers will be unavailable for approximately five minutes at some point during this [...]]]></description>
			<content:encoded><![CDATA[<p>Between 11:00 PM and 11:59 PM Pacific time tonight (Monday August 2), several of our hosting servers will be restarted: <a href="/posts/which-server/">bender</a>, <a href="/posts/which-server/">elzar</a>, <a href="/posts/which-server/">farnsworth</a>, <a href="/posts/which-server/">lrrr</a>, <a href="/posts/which-server/">mom</a>, and <a href="/posts/which-server/">seymour</a>.</p>
<p>As a result, Web site service and the ability to read incoming e-mail for some customers will be unavailable for approximately five minutes at some point during this maintenance &#8220;window&#8221;.</p>
<p><span id="more-1273"></span></p>
<p>No e-mail will be lost, of course; incoming mail on those servers will just be slightly delayed. Customers using other servers will not be affected.</p>
<p>We apologize for the inconvenience this causes and for the short notice. Restarting these servers now is necessary to prevent a potential issue that could cause problems with these servers if a RAID disk needs replacing in the future. (We hope to provide technical details in a future post. The problem &#8212; an issue with the <a href="http://en.wikipedia.org/wiki/GNU_GRUB">GRUB bootloader</a> &#8212; is interesting and the solution would be useful to others.)</p>
<p><em>Update: The maintenance was completed with less than 5 minutes of &#8220;downtime&#8221; per server.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/2010-08-02-maintenance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brief maintenance on calculon server (completed)</title>
		<link>http://blog.tigertech.net/posts/brief-maintenance-on-calculon-server-2010-07-05/</link>
		<comments>http://blog.tigertech.net/posts/brief-maintenance-on-calculon-server-2010-07-05/#comments</comments>
		<pubDate>Mon, 05 Jul 2010 22:50:05 +0000</pubDate>
		<dc:creator>Robert Mathews</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[calculon]]></category>
		<category><![CDATA[maintenance]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1267</guid>
		<description><![CDATA[The “calculon” Web server will be restarted at 9 PM Pacific time tonight (July 5). This will cause a five-minute interruption of Web and e-mail service for customers on that server. Other servers will not be affected, and incoming mail will only be delayed, not lost. We apologize for the problem and for the short [...]]]></description>
			<content:encoded><![CDATA[<p>The “<a href="/posts/which-server/">calculon</a>” Web server will be restarted at 9 PM Pacific time tonight (July 5). This will cause a five-minute interruption of Web and e-mail service for customers on that server.</p>
<p>Other servers will not be affected, and incoming mail will only be delayed, not lost.</p>
<p><span id="more-1267"></span></p>
<p>We apologize for the problem and for the short notice: the restart is necessary to <a href="http://support.tigertech.net/raid-restart">replace a potentially failing disk in the RAID array</a>.</p>
<p>(This is the second disk on that server that has needed replacing this year. We track these things closely, and we are monitoring it to make sure there isn&#8217;t a problem beyond unfortunate coincidence.)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/brief-maintenance-on-calculon-server-2010-07-05/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brief scheduled maintenance Saturday, May 22 (completed)</title>
		<link>http://blog.tigertech.net/posts/maintenance-2010052/</link>
		<comments>http://blog.tigertech.net/posts/maintenance-2010052/#comments</comments>
		<pubDate>Wed, 19 May 2010 19:06:36 +0000</pubDate>
		<dc:creator>Robert Mathews</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[all servers]]></category>
		<category><![CDATA[maintenance]]></category>
		<category><![CDATA[server updates]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1178</guid>
		<description><![CDATA[Between 10:00 PM and 11:59 PM Pacific time this Saturday, May 22, all our hosting servers will be restarted. As a result, Web site service and the ability to read incoming e-mail will be unavailable for approximately five minutes at some point during this maintenance “window”. No e-mail will be lost, of course; incoming mail [...]]]></description>
			<content:encoded><![CDATA[<p>Between 10:00 PM and 11:59 PM Pacific time this Saturday, May 22, all our hosting servers will be restarted. As a result, Web site service and the ability to read incoming e-mail will be unavailable for approximately five minutes at some point during this maintenance “window”.</p>
<p><span id="more-1178"></span></p>
<p>No e-mail will be lost, of course; incoming mail on those servers will just be slightly delayed.</p>
<p>We apologize for the inconvenience this causes. This maintenance is necessary to install an updated “kernel” on all of our servers for security reasons, and we&#8217;re taking that opportunity to install additional memory on some servers, too.</p>
<p><em>Update: All server updates were completed as expected.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/maintenance-2010052/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Network issues (resolved)</title>
		<link>http://blog.tigertech.net/posts/network-issues-2010-04-09/</link>
		<comments>http://blog.tigertech.net/posts/network-issues-2010-04-09/#comments</comments>
		<pubDate>Fri, 09 Apr 2010 18:43:37 +0000</pubDate>
		<dc:creator>Robert Mathews</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1110</guid>
		<description><![CDATA[We&#8217;re receiving reports of network connectivity problems from a couple of customers using the &#8220;Global Crossing&#8221; Internet backbone to reach our primary data center, although most customers are unaffected. We&#8217;re investigating this issue. Update 12:35 PM: Our upstream provider reports that an 8-minute network interruption for some connections, beginning at 11:11 AM Pacific time, [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re receiving reports of network connectivity problems from a couple of customers using the &#8220;Global Crossing&#8221; Internet backbone to reach our primary data center, although most customers are unaffected. We&#8217;re investigating this issue.</p>
<p><em>Update 12:35 PM: Our upstream provider reports that an 8-minute network interruption for some connections, beginning at 11:11 AM Pacific time, was caused by a router failure at Global Crossing. The problem has been resolved.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/network-issues-2010-04-09/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Network slowness for some customers (resolved)</title>
		<link>http://blog.tigertech.net/posts/network-slowness-for-some-customers-resolved/</link>
		<comments>http://blog.tigertech.net/posts/network-slowness-for-some-customers-resolved/#comments</comments>
		<pubDate>Fri, 12 Mar 2010 08:08:52 +0000</pubDate>
		<dc:creator>Robert Mathews</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[all servers]]></category>
		<category><![CDATA[network]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1097</guid>
		<description><![CDATA[Between 7:00 and 7:45 PM Pacific time Thursday night (March 11), we received two reports of slow or nonexistent network connections to sites on our servers. Our automated monitoring systems didn&#8217;t detect any general problems, so the majority of customers were certainly unaffected &#8212; but we suspect that one of the &#8220;Internet backbones&#8221; between the [...]]]></description>
			<content:encoded><![CDATA[<p>Between 7:00 and 7:45 PM Pacific time Thursday night (March 11), we received two reports of slow or nonexistent network connections to sites on our servers.</p>
<p>Our automated monitoring systems didn&#8217;t detect any general problems, so the majority of customers were certainly unaffected &#8212; but we suspect that one of the &#8220;Internet backbones&#8221; between the affected customers and our data center had high packet loss during that period.</p>
<p>Both customers reported that the problem resolved itself by 7:45, and we haven&#8217;t received similar reports since, so there does not appear to be an ongoing problem. We&#8217;ll continue to monitor it closely.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/network-slowness-for-some-customers-resolved/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brief maintenance on Calculon server (completed)</title>
		<link>http://blog.tigertech.net/posts/brief-maintenance-on-calculon-server/</link>
		<comments>http://blog.tigertech.net/posts/brief-maintenance-on-calculon-server/#comments</comments>
		<pubDate>Sat, 20 Feb 2010 03:47:58 +0000</pubDate>
		<dc:creator>Robert Mathews</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[calculon]]></category>
		<category><![CDATA[maintenance]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1062</guid>
		<description><![CDATA[The “calculon” Web server will be restarted at 11 PM Pacific time tonight (February 19). This will cause a five-minute interruption of Web and e-mail service for customers on that server. Other servers will not be affected, and incoming mail will only be delayed, not lost. We apologize for the problem and for the short [...]]]></description>
			<content:encoded><![CDATA[<p>The “<a href="http://blog.tigertech.net/posts/which-server/">calculon</a>” Web server will be restarted at 11 PM Pacific time tonight (February 19). This will cause a five-minute interruption of Web and e-mail service for customers on that server.</p>
<p>Other servers will not be affected, and incoming mail will only be delayed, not lost.</p>
<p>We apologize for the problem and for the short notice: the restart is necessary to replace a disk in the RAID array.</p>
<p><em>Update 11:03 PM Pacific time: The restart was completed with less than 3 minutes of &#8220;downtime&#8221;.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/brief-maintenance-on-calculon-server/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bender server load problem 2010-02-18 (resolved)</title>
		<link>http://blog.tigertech.net/posts/bender-load-20100218/</link>
		<comments>http://blog.tigertech.net/posts/bender-load-20100218/#comments</comments>
		<pubDate>Thu, 18 Feb 2010 19:28:29 +0000</pubDate>
		<dc:creator>Ken</dc:creator>
				<category><![CDATA[System Status]]></category>
		<category><![CDATA[bender]]></category>
		<category><![CDATA[status]]></category>

		<guid isPermaLink="false">http://blog.tigertech.net/?p=1053</guid>
		<description><![CDATA[The “bender” Web server experienced intermittently high load between about 7:40 and 10:15 AM Pacific time this morning, February 18. This resulted in slow or even inaccessible Web sites on that server. (Some e-mail was also delayed before being properly delivered.) Other servers were not affected. This server had similar high load symptoms (but much [...]]]></description>
			<content:encoded><![CDATA[<p>The “<a href="http://blog.tigertech.net/posts/which-server/">bender</a>” Web server experienced intermittently high load between about 7:40 and 10:15 AM Pacific time this morning, February 18. This resulted in slow or even inaccessible Web sites on that server. (Some e-mail was also delayed before being properly delivered.) Other servers were not affected.</p>
<p>This server had similar high load symptoms (but much more briefly) earlier this week. We took some steps to reduce the load then, but it appears those weren&#8217;t sufficient. We&#8217;re now taking much stronger action to ensure that this does not happen again.</p>
<p>We sincerely apologize to customers affected by this problem. We don&#8217;t consider it normal or acceptable, and we will make sure this isn&#8217;t a recurring issue.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.tigertech.net/posts/bender-load-20100218/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

