Last week, I finally managed to bring our Eucalyptus installation to a stable state again, after some two months of fiddling with it.
Again is the key word here, because we already had it running quite well last semester. I built it in room E132 as a part-time job back then, with some quick node deployment mechanisms thrown in as a bonus. It ran there until the summer. In September we decided to upgrade the capacity by moving to rooms E220 and G3, where new workstations with Xeon processors had recently been installed.
As I found out, reconfiguring Eucalyptus for a separate CLC (Cloud Controller) and CC (Cluster Controller) is not so easy, at least not without a complete reinstallation. I had to solve several issues with leftover items in the database, as the master admin didn’t warn me before he unplugged the components. The last of those prevented the Storage Controller from creating and attaching volumes to instances. I didn’t want to reinstall the CLC completely, because we’d lose user accounts and machine images that way.
And why did it take two months? Normally, one can install a small IaaS cloud in two days. But that is when you are working alone and have access to all the components. That was, however, not the case here. The CC now runs on the main server for the classrooms, which also runs NIS, NFS and other services for the students. The admin didn’t like the idea of giving me full access to such an important component, and thus we had to do most of the debugging by e-mail.
We also had to rewire the network to fully support Eucalyptus’ VLANs, which took two weeks of waiting for some optical transceivers. They say that the cloud spares you the time you have to wait for components when you want to start a new server, but they don’t mention the time you spend waiting for the components of the cloud itself :-).
Now we’re working on getting the Cloud Gunther to a working state again, so that we can resume running computing tasks on it, but more on that later.