Registry Hive Recovered?!

We’re seeing a weird pop-up message on our Backup Exec 2010 server.  Every morning (after backups have run) we get a string of errors from Windows indicating a problem with a registry hive.  The error reads like this:

“Registry hive (file:) C:\WINDOWS\vmware-SYSTEM\vixmntapiXX was corrupted and it has been recovered.  Some data might have been lost.”

This error message is there every morning, and there will be anywhere from 10 to 20 of them that we have to click through.  The only thing that changes is the XX, which is a number that increments.  As near as I can tell there is nothing wrong with the system, and there are no symptoms of trouble other than the messages.

Going off the error message itself, and the fact that BE was running without this error until I turned on VMware backups, I suspect that this is an error with the Agent for VMware Virtual Infrastructure (AVVI), otherwise known as the VMware Agent.  I’ve tried some Googling and haven’t come up with much relating to this error specifically.  At this point it’s really just an annoyance, as we have not seen anything that would indicate an actual issue.  I’m just crossing my fingers that restores of data from AVVI will actually work!

We’re doing our Disaster Recovery Test this Friday, so we’ll know pretty quickly whether these VMware backups will work or not!

NetApp LUN Expansion Limit

Yesterday we were working on expanding a drive for one of our SQL servers in VMware.  Ordinarily we have all our guest drives as .vmdk files, but this SQL server is clustered, so the drive is a Raw Device Mapping (RDM) to a LUN on the NetApp.  We have expanded LUNs like this in the past and not had any issues.  This time we did have issues, and it took a bit of searching to figure out why.

This server is not yet in production, so some of the volumes were not yet sized as they will be when it goes live.  In this case we were growing a volume to its production size, which meant going from 40 GB to 500 GB to support all the data that would be imported.  At that point we got an error on the NetApp: “New size exceeds this LUN’s initial Geometry.”  After a bit of Googling we found a forum post explaining that NetApp limits how far you can expand a LUN, to no more than 10x its original size.  I imagine that for most folks this would not be a problem, but if you are like us, going from test/validation to prod on the same LUN and needing to expand, you could paint yourself into a corner as we did.
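To make the arithmetic concrete, here is a rough sketch of the check we could have run before creating the LUN.  It assumes the 10x figure from the forum post holds as a simple multiplier; that is our reading of the post, not a verified NetApp specification, and the function names are made up for illustration.

```python
# Assumption: a LUN can grow to at most 10x its initial size, per the
# forum post we found. This is not a verified NetApp specification.
GROWTH_LIMIT = 10


def max_expandable_gb(initial_gb, limit=GROWTH_LIMIT):
    """Largest size (in GB) the LUN could reach under the assumed limit."""
    return initial_gb * limit


def can_expand(initial_gb, target_gb, limit=GROWTH_LIMIT):
    """True if target_gb is within the assumed growth ceiling."""
    return target_gb <= max_expandable_gb(initial_gb, limit)


def min_initial_gb(target_gb, limit=GROWTH_LIMIT):
    """Smallest initial size that would still allow growth to target_gb."""
    return target_gb / limit


print(max_expandable_gb(40))   # 400
print(can_expand(40, 500))     # False
print(min_initial_gb(500))     # 50.0
```

Under that assumption, our 40 GB LUN topped out at 400 GB, so the 500 GB target was out of reach; creating it at 50 GB or more would have left enough headroom.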

We had to blow away the LUN and recreate it, which ended up being a major pain because of the clustered servers.  The lesson: size your LUNs appropriately from the start, or be prepared to do some file copying later on if you hit the limit.