Tag Archives: ESX

DR Test Results

Recently we performed our annual Disaster Recovery test.  We have learned something very valuable every year and tried to adjust our recovery plans accordingly, with this year being no different.  Even with all the new technology, DR still seems to be a tricky undertaking.

The first year we tested our plan we found that we really, really, REALLY don’t want to restore Active Directory onto unlike hardware.  The following year we got a node onto our MPLS cloud, which allowed us to have a replicated AD server at the DR site.  This greatly reduced the problem of restoring AD.  The year after that we tested the phone system portion of our DR plan and discovered that working with the telco in a DR situation will be challenging at the very least.  In the two years since that second test there have been some major changes that removed our ability to have replicated AD, so we were back to square one on that front.

This year we thought we would do a restore of our two-year-old VMware environment.  We decided to keep the scope to restoring only “Tier 0” services.  This included VMware ESX, Symantec BackupExec, vSphere, and AD.  Time permitting, we planned to restore as many servers as possible beyond the bare-minimum Tier 0.

In the last year we had made the choice to purchase Symantec’s BackupExec Agent for VMware Virtual Infrastructure (AVVI).  This is a BackupExec agent that allows you to back up VMware guest OS files directly through the ESX server and/or SAN.  The idea was that we would have our virtualized servers backed up to tape at the VMware file level and that this would allow us to restore directly back to ESX.

Disaster Recovery / Business Continuity

A large part of our PCI and SAS 70 compliance is to maintain, and test, a comprehensive and viable Disaster Recovery / Business Continuity plan.  As part of this we will be conducting our annual test of the Technology Availability Plan portion of our DR plan this coming Friday.  A co-worker and I will be flying to Scottsdale, Arizona, where our contracted Disaster Recovery vendor has the data center that is designated for us.

For this test we will be testing VMware and our ability to recover our vSphere environment.  We will have 3 servers in the test.  The first will be a Windows machine that we will use to install our backup environment and restore data from tape.  The other two machines will be ESX servers that we will set up and configure as our VM hosts.  We will then restore vCenter Server from tape, as well as several other critical servers that we call “Tier 0”.

Tier 0 for our DR Plan consists of critical servers that are required to bring the rest of our environment back online in a disaster.  These include Active Directory, Backup, and a few other infrastructure services that are needed before anything else can be restored.

We hope to have a successful test, and also hope to uncover roadblocks before they become issues in a real world scenario.

Backup Exec 2010 AVVI – Part 2

So it has been several weeks since my Part 1 post on this topic. We are still struggling to get all of our servers backed up using AVVI.

I enlisted the help of a co-worker and he wrote an excellent VBScript that queries the domain for all the servers, and then restarts the VMTools service on every box.  We run this script from our backup server, and it works great.  This centralizes the management of that task, and keeps us from having to mess with batch files on every server and potentially forgetting to add the task on a new server. You can download the script if you like. Change the .txt extension to .vbs, and edit the service name at the top.  Edit the mail server settings at the bottom if you wish to get an emailed report of the results.
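For anyone who cannot use the VBScript, the same idea can be sketched in Python. This is not my co-worker's script, only a rough illustration of the approach: it assumes you already have a list of server names (the domain query is left out), that the service is named VMTools, and that the account running it can control services remotely with the standard Windows `sc` command. The function names and the `runner` parameter are made up for this example.

```python
import subprocess
from typing import Callable, Dict, Iterable, List, Sequence

SERVICE_NAME = "VMTools"  # assumed service name; edit to match your servers


def restart_commands(server: str, service: str = SERVICE_NAME) -> List[List[str]]:
    """Build the sc stop/start commands for one remote Windows server."""
    target = "\\\\" + server  # sc expects a \\SERVER prefix for remote machines
    return [
        ["sc", target, "stop", service],
        ["sc", target, "start", service],
    ]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code (0 means success)."""
    return subprocess.run(list(cmd), capture_output=True).returncode


def restart_everywhere(
    servers: Iterable[str],
    service: str = SERVICE_NAME,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> Dict[str, bool]:
    """Restart the service on every server; True means both commands returned 0.

    Simplification: a real script should wait for the service to finish
    stopping before starting it, and treat "already stopped" as acceptable.
    """
    report: Dict[str, bool] = {}
    for server in servers:
        codes = [runner(cmd) for cmd in restart_commands(server, service)]
        report[server] = all(code == 0 for code in codes)
    return report


if __name__ == "__main__":
    # Dry run: print the commands for two hypothetical servers instead of running them.
    for name in ["APP01", "FILE01"]:
        for cmd in restart_commands(name):
            print(" ".join(cmd))
```

The report dictionary is what you would turn into the emailed summary. Passing a different `runner` makes it possible to test the loop without touching any real servers.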

I believe that this new process has helped, but we are still having issues getting backups from all the machines.  We have found that occasionally the mgmt-vmware service needs to be restarted on the ESX hosts as vCenter has trouble getting the snapshot.  I have not yet taken the time to figure out how to automate this, so it is a manual process at the moment.
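If I do get around to automating it, it might look something like the sketch below. It is untested against our hosts and rests on assumptions: that SSH access to the ESX service console is enabled for the user in question, and that `service mgmt-vmware restart` is the right command for your ESX version. The host names and function names are placeholders for illustration.

```python
import subprocess
from typing import Callable, Iterable, List

# Service console command on classic ESX (assumed; check your version).
RESTART = "service mgmt-vmware restart"


def ssh_restart_command(host: str, user: str = "root") -> List[str]:
    """Build the ssh command that restarts the management agent on one host."""
    return ["ssh", f"{user}@{host}", RESTART]


def _run(cmd: List[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(cmd).returncode


def restart_mgmt_agents(
    hosts: Iterable[str],
    user: str = "root",
    runner: Callable[[List[str]], int] = _run,
) -> List[str]:
    """Restart mgmt-vmware on each host and return the hosts that failed."""
    failed: List[str] = []
    for host in hosts:
        if runner(ssh_restart_command(host, user)) != 0:
            failed.append(host)
    return failed


if __name__ == "__main__":
    # Dry run with a hypothetical host name: show the command, do not execute it.
    print(" ".join(ssh_restart_command("esx01.example.local")))
```

Restarting the agent briefly disconnects the host from vCenter, so this is probably something to run only against the host that is failing to snapshot rather than the whole cluster.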

vSphere Enterprise Plus

So today, after our little surprise with Virtual SMP or vSMP in vSphere Enterprise, we put in our PO to purchase vSphere Enterprise Plus edition.  Not exactly an inexpensive upgrade when you consider we have 14 processor licenses for ESX.

Then, to add a little cream on top, the Service and Support piece is a little odd.  VMware likes to prorate the support you already have.  It seems they use some mystery multiplier that is less than 1 to determine how far the support you have left goes.  We just renewed our support in February, and had to buy another two months’ worth so that we would not have to renew earlier next year.  I suppose this isn’t too surprising, as the new edition of the product has a higher price tag and therefore the S&S will be higher.

I am indeed looking forward to Host Profiles as well as vNetwork Distributed Switches.  These will help us simplify our configuration and make it easy to switch out ESX hardware when the time comes.  The vSMP will also be nice so that we can have up to the max of 8 processors on any one guest OS.

All in all a good upgrade; it was just a surprise that we ended up having to go to it.

ESX and Virtual SMP

In my day job I work with VMware ESX as our server virtualization platform.  We’ve been using this strategy for 18-20 months now.  We’ve been very happy with the scalability and flexibility that this has provided, and even more so with the redundancy and HA.  It truly has decreased our costs and, more importantly, allowed us to be more agile in our relatively small department of just two administrators.

In our environment we have HP server hardware with NetApp SAN storage running vSphere 4 Enterprise edition.  This has been a great architecture and has exceeded our expectations.  That was until this week.

We have an application in one of our business lines that has been running for about 4 years now and has been neglected since it was deployed.  Since it has been running without errors, it has sat at the bottom of our priority list, and we have not had the budget or resources to address the platform’s pain points.

The most obvious bottleneck on this system is processing power.  The platform is very CPU intensive, and disk I/O is the next constraint.  Together these make for long job completion times.  Beyond that, we are behind on the version of the database software the platform uses, which we know is hurting efficiency.

We decided to bring this whole server/application into VMware to provide the redundancy and HA that it offers, as well as to be able to use the newer HP multi-core hardware that ESX is running on to assign more logical CPUs to the application.  We did some benchmarking with two and four cores and found a drastic decrease in processing time.  We ultimately wanted to take it up to the maximum allowed by ESX, which is 8 cores.  After editing the settings we tried to start up the guest and got a pop-up window stating that the host the guest VM was running on did not support the number of cores we had assigned.

This was the frustrating part.  We looked all through the documentation and product specs for a reason why we couldn’t run the number of processors we had tried, and the only reference that seemed to make sense was Virtual SMP.  We didn’t exactly know what this was, but there was a clear reference to four-way and eight-way Virtual SMP in the description of the different levels of vSphere 4.  In the Enterprise version you get ‘Four-way Virtual SMP’ and in Enterprise Plus you get ‘Eight-way Virtual SMP.’  After doing some more poking around it became clear that this meant you can only assign more than 4 logical CPUs to a guest VM if you have the Enterprise Plus edition of vSphere 4.

What this means to us is that we cannot assign the 8 processors to the guest VM like we planned to without purchasing vSphere 4 Enterprise Plus licensing.
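Put as a quick check, the rule we ran into looks like the sketch below. It only encodes the two editions and limits mentioned in this post (four-way for Enterprise, eight-way for Enterprise Plus); other editions are deliberately left out, and the function names are just for illustration.

```python
from typing import List

# vSMP limits per vSphere 4 edition, as described in this post only.
VSMP_LIMITS = {
    "Enterprise": 4,
    "Enterprise Plus": 8,
}


def can_power_on(edition: str, vcpus: int) -> bool:
    """Return True if a guest with this many vCPUs fits the edition's vSMP limit."""
    return 1 <= vcpus <= VSMP_LIMITS[edition]


def editions_allowing(vcpus: int) -> List[str]:
    """List the editions (of those known here) that allow the requested vCPU count."""
    return [name for name, limit in VSMP_LIMITS.items() if 1 <= vcpus <= limit]


if __name__ == "__main__":
    for count in (2, 4, 8):
        print(count, "vCPUs:", editions_allowing(count))
```

Our two-core and four-core benchmarks fit under the Enterprise limit, which is why we only hit the wall when we asked for eight.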

You get only a few benefits by going to Enterprise Plus: the now understood eight-way ‘vSMP’, support for processors with up to 12 cores (there are no processors that I know of that come close to this), vNetwork Distributed Switches, and Host Profiles.

Our problem now is that the business is already expecting the 8 CPU system, and they have also come to expect the redundancy and HA.  So it seems we’ll be upgrading to Enterprise Plus in our cluster.