Category Archives: Technology

Error 500 – Internal Server Error

My blog was down for a large part of today. I spent a while troubleshooting on my own and found several suggestions on the web. Among them were making sure that PHP 5 was being called in my .htaccess file and adding a php.ini file to set the memory limit. I found a couple of posts on the WordPress forums as well, all pointing to the same things. I also found some posts suggesting that I disable all my plug-ins, which I did by removing their folders from my plug-ins directory. Still no luck.

I ended up getting frustrated and called my hosting company, 1and1.com. After a few minutes on hold I got through to a rep, and she started to run me through everything I had already tried. She then looked in my .htaccess files to verify that I had indeed done what I said. She came back and asked to put me on hold. After a few minutes she came back on and told me that everything was working again as expected.

I asked her what had changed, and she told me that when I connect via SFTP (over SSH) I need to make sure that I explicitly close my connection. She said that this error can occur if a connection hangs and is not properly closed. I found that a bit strange, but my sites are indeed working again, so I will have to watch to see whether it happens again.
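
If you script your uploads, one way to follow the rep's advice is to wrap the session so that it is closed even when a transfer fails partway through. The following is only a sketch of that pattern: `FakeSftpSession` and `upload_all` are hypothetical names I made up as a stand-in for whatever SFTP client you actually use, and nothing here talks to a real server.

```python
from contextlib import closing


class FakeSftpSession:
    """Hypothetical stand-in for a real SFTP session object."""

    def __init__(self):
        self.closed = False
        self.uploaded = []

    def put(self, local_path, remote_path):
        # Simulate a transfer that can fail on bad input.
        if not local_path:
            raise ValueError("local_path must not be empty")
        self.uploaded.append((local_path, remote_path))

    def close(self):
        self.closed = True


def upload_all(session, files):
    """Upload (local, remote) pairs and always close the session afterwards."""
    # closing() calls session.close() on exit, even if put() raises.
    with closing(session):
        for local_path, remote_path in files:
            session.put(local_path, remote_path)
    return len(files)
```

The point of `closing()` is that the close happens on every exit path, so a failed upload does not leave a hung connection behind.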

Disaster Recovery / Business Continuity

A large part of our PCI and SAS 70 compliance is to maintain, and test, a comprehensive and viable Disaster Recovery / Business Continuity plan. As part of this we will be conducting our annual test of the Technology Availability Plan of our DR plan this coming Friday. A co-worker and I will be flying to Scottsdale, Arizona, where our contracted Disaster Recovery vendor has the data center designated for us.

For this test we will be testing VMware and our ability to recover our vSphere environment. We will have 3 servers in the test. The first will be a Windows machine that we will use to install our backup environment and restore data from tape. The other two machines will be ESX servers that we will set up and configure as our VM hosts. We will then restore vCenter Server from tape, as well as several other critical servers that we call “Tier 0”.

Tier 0 for our DR Plan consists of critical servers that are required to bring the rest of our environment back online in a disaster. These include Active Directory, Backup, and a few other infrastructure services that are needed before anything else can be restored.

We hope to have a successful test, and we also hope to uncover roadblocks before they become issues in a real-world scenario.

Registry Hive Recovered?!

We’re experiencing a weird pop-up message on our Backup Exec 2010 server. Every morning (after backups have run) we get a string of errors from Windows indicating that there was an error with the Registry Hive. The error reads like this:

“Registry hive (file:) C:\WINDOWS\vmware-SYSTEM\vixmntapiXX was corrupted and it has been recovered.  Some data might have been lost.”

This error message is there every morning, and there are anywhere from 10 to 20 of them that we have to click through. The only thing that changes is the XX, which is a number that increments. As near as I can tell there is nothing wrong with the system, and there are no symptoms of trouble other than the messages.

Going off the error message itself, and the fact that BE was running without this error until I turned on VMware backups, I suspect that this is an error with the Agent for VMware Virtual Infrastructure (AVVI), otherwise known as the VMware Agent. I’ve tried some Googling and haven’t come up with much relating to this error specifically. At this point it’s really just an annoyance, as we have not seen anything that would indicate an issue. I’m just crossing my fingers that restores of data from AVVI will actually work!

We’re doing our Disaster Recovery Test this Friday, so we’ll know pretty quickly whether these VMware backups will work or not!

NetApp LUN Expansion Limit

Yesterday we were working on expanding a drive for one of our SQL servers in VMware. Ordinarily we have all our guest drives as .vmdk files, but this SQL server is clustered, so the drive is a Raw Device Mapping (RDM) to a LUN on the NetApp. We have expanded LUNs like this in the past and not had any issues. This time we did have issues, and it took a bit of searching to figure it out.

This server is not yet in production, so some of the volumes were not sized the way they will be when it goes live. In this case we were growing a volume to the size it needed to be to go into prod. This meant going from 40 GB up to 500 GB to support all the data that would be imported. At this point we got an error on the NetApp: “New size exceeds this LUN’s initial Geometry.” After a bit of Googling we found a forum post saying that NetApp limits how far you can expand a LUN, restricting it to no more than 10x the original size. With a 40 GB original size, that put our ceiling at 400 GB, short of the 500 GB we needed. I imagine that for most folks this would not be a problem, but if you are like us, going from test/validation to prod on the same LUN and needing to expand, you could paint yourself into a corner as we did.
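
To make the arithmetic concrete, here is a small sketch of the check we could have run before creating the LUN. It assumes the 10x figure from the forum post is accurate for your system (I have not confirmed it against NetApp documentation), and the function names are my own invention, not anything from NetApp's tools.

```python
def max_expandable_gb(initial_gb, factor=10):
    """Largest size a LUN could grow to, assuming a 10x-of-original limit."""
    return initial_gb * factor


def can_expand(initial_gb, target_gb, factor=10):
    """True if target_gb is within the assumed expansion limit."""
    return target_gb <= max_expandable_gb(initial_gb, factor)


def min_initial_gb(target_gb, factor=10):
    """Smallest whole-GB initial size that still allows growth to target_gb."""
    # Ceiling division without importing math.
    return -(-target_gb // factor)


if __name__ == "__main__":
    print(max_expandable_gb(40))  # ceiling for a 40 GB LUN
    print(can_expand(40, 500))    # our failed expansion
    print(min_initial_gb(500))    # what we should have started with
```

Under that assumption, a 40 GB LUN tops out at 400 GB, so our 500 GB target was out of reach, and we would have needed to start at 50 GB or more.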

We had to blow away the LUN and recreate it, which ended up being a major pain due to the clustered servers. The lesson is to size your volumes appropriately from the start, or be prepared to do some file copying later on if you reach the limit.