PUCK Outage Information

So, we often reboot machines with little to no consequence. We reboot our phones, cars, laptops, desktops and even servers. That uneventful routine isn’t what happened to me on Monday.

So, many years ago I moved my machine out of my home and decided it would be a good idea to pool resources with several other people I was either hosting for or sharing space with. Being a technology person, I had a T1 at my home from 1997-2010. Friends and others would share resources with me, and I returned the favor in kind.

I have used a variety of technology over the years, from the FreeBSD jail support in 4.8 (with a patch) up through the FreeBSD 7-8 series. Due to personal preference and my desire to spend less time compiling things (plus the fact that I disagree with FreeBSD packaging and development, and have had problems with modern hardware support…), I undertook building a replacement host in 2011.

FreeBSD jails can be quite elegant. You can run multiple servers on one piece of physical hardware and share the pool of disk space, CPU and memory, all without being limited to a fixed CPU count or memory footprint per virtual machine as you are with VMware and other systems. Having used VMware in some form since my original 1.0.x license that expired in 1999, I wanted to provide a reasonable service to those I shared with.

I moved the system to Linux, and the closest thing I could find at the time that wasn’t going to limit CPU/memory/disk usage was Linux-VServer.org. This required a small kernel package and was distributed as part of the Fedora base OS without trouble. There were a few limitations to management, but I was willing to live with them at the time, and I proceeded to move ~7 machines over to the new hardware. Sometimes I would stand up something for a friend and then tear it down, but on Monday there were a total of 8. (One I have left down until the owner contacts me.)

So during the Monday reboot, the goal was to upgrade the firmware on the motherboard’s IPMI interface (SuperMicro X9SCA-F) as well as various firmware on the SAS controller.

What happened next was something that would consume me for the next 48 hours.

Upon rebooting the system, the virtual machines would not start properly. I tried upgrading and downgrading the related packages, and rebuilding with the latest kernels and modules… Each time I rebooted the machine I waited through a very long BIOS and SAS boot-up and initialization process (it takes ~45 seconds for the mpt2sas driver to probe my 4 disks). When I typed “shutdown -r now”, the IPMI interface would show the system actually powered off instead of rebooting. When you are sleep deprived and feeling a small bit of pressure, these small things feel worse.

At some point, approaching 24 hours into the process, the decision was made to just move all the systems into VirtualBox. You can judge and whatnot, but it was easy. It was free, and I found documentation online about using qemu-nbd to mount the new disk images and rsync/move the files from the ~1.8TB /home partition that held puck.nether.net and the other hosts.
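The qemu-nbd approach looks roughly like this. This is a sketch, with hypothetical image and mount paths; the key caveat is that the VM must stay powered off while its image is attached, or the filesystem inside it can end up corrupted:

```shell
# Attach a VirtualBox VDI image as a network block device (requires root).
modprobe nbd max_part=8

# /srv/vms/guest.vdi is a hypothetical image path.
qemu-nbd --connect=/dev/nbd0 /srv/vms/guest.vdi

# The guest's partitions appear as /dev/nbd0p1, /dev/nbd0p2, ...
mount /dev/nbd0p1 /mnt/guest

# Copy the data in, preserving hard links, ACLs and extended attributes.
rsync -aHAX /home/puck/ /mnt/guest/home/puck/

# Always unmount and disconnect before booting the VM.
umount /mnt/guest
qemu-nbd --disconnect /dev/nbd0
```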

Well, in theory. When I built the system, it was the height of the hard drive shortage. I was also “cheap” and just got 4x1TB 7200RPM SATA disks. The chassis is 2U and only has 8 bays. It turns out interesting things happen that slow you down, such as the I/O performance of the RAID 1+0 setup not being what you would like. As usual, linear reads can run fast, but the piles of small random files that people collect on their systems take a long time to stat() as part of the rsync process. The disk cache never seems like enough, and most filesystems don’t perform well under this load.

After trying to rsync the data over via qemu-nbd, it turned out this was corrupting the filesystem inside the new VM’s VDI file. One system took 3-4 tries to recover properly, and I finally had to destroy the image and redo everything. Trying to run 7 parallel rsyncs as well? That will produce some really high numbers in iostat -x … you will see read/write wait times approaching 10+ seconds. I’ve seen some mean numbers this week, and those felt like they were slowing me down. It turns out doing them one at a time may have worked out better, but I was hoping the OS disk cache would cope better than it did… Also, when you see these long iowait times, it’s enough to cause a guest OS in VirtualBox (at least) to time out the emulated disk and reset the virtual disk controller(!). This was not expected.
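Serializing the copies is what I’d try next time. A sketch, with hypothetical guest names and mount points:

```shell
# Run the copies one at a time instead of seven in parallel,
# so the spindles see mostly sequential I/O rather than a seek storm.
for vm in vm1 vm2 vm3; do          # hypothetical guest names
    rsync -aHAX "/home/$vm/" "/mnt/$vm/"
done

# In another terminal, watch per-device wait times (the await columns,
# in milliseconds), refreshing every 5 seconds:
iostat -x 5
```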

After many hours in the process I decided to take a nap Tuesday morning and got in about 3 hours of sleep. Tuesday night, I got more as I waited for the syncs to happen. Sometimes it’s just OK to leave something down and broken for a bit longer. Nobody was “really” screaming about things, but I felt obligated to fix it ASAP.

Of course, once I started to get the machines turned up, there were the inevitable problems. Mailman bounced a lot of mail because sendmail’s smrsh refused to execute its wrapper, though user email worked OK. The load average on the new VM went very high during mail processing, and it would periodically reject messages.
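For the record, the usual fix for the smrsh bounces is to link Mailman’s mail wrapper into smrsh’s allowed-programs directory. Paths vary by distribution; these are common Fedora/Red Hat locations:

```shell
# smrsh only executes programs found in its allowed directory
# (/etc/smrsh on many Linux distributions), so expose the wrapper there.
ln -s /usr/lib/mailman/mail/mailman /etc/smrsh/mailman
```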

There’s a lot more that could be included, but I wanted to highlight a few last things… Having more spindles is good. Having friends who will look at something when you are sleep deprived is good. Perhaps using a VM isn’t as evil as I had originally thought, but it still isn’t my first choice. Taking a nap and leaving things broken? Good.

Having a wife that is understanding and didn’t shoot me? Very good. I don’t think she often realizes how much she is appreciated, but she is more than I will share in public here.

Hope everyone is having a better week.. I promise to not upgrade anything else for the next 15 minutes.
