One thing I learned overnight is that you really should keep an eye on snapshots in ESX/vSphere. We run the AVVI backups from Backup Exec 2010, which uses the vSphere storage APIs to do its business. BE calls the API to have vSphere take a snapshot, then grabs that snapshot and sends it to whatever your backup medium is.
In the past I have seen snapshots get left behind instead of being deleted. Last night our monitoring system started paging me that one of our AD servers was offline. After jumping through some hoops to get in via VPN (the AD server was the one that authenticates VPN users and hands them DHCP addresses), I was able to get onto the ESX server. There I saw that snapshots had eaten all available disk space on the ESX box, and my AD server had stalled as a result. It turned out that the snapshots for my Exchange server had piled up, and I had to delete them. Once there was free space again, my AD server came back online and everything is OK again.
Now I need to figure out a way to monitor my ESX server for datastore space so that this does not happen again.
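As a starting point, here is a rough Python sketch of the check I have in mind: given each datastore's capacity and free space, flag any that fall below a free-space threshold. This is not a finished monitor. The `DatastoreUsage` class, the function names, the sample numbers, and the 20% threshold are all made up for illustration. In practice the capacity and free-space figures would have to come from vSphere itself (for example from the datastore summary exposed by the vSphere API), and the alert would go to whatever paging system you already use.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class DatastoreUsage:
    """Hypothetical container for one datastore's space figures."""
    name: str
    capacity_bytes: int
    free_bytes: int

    @property
    def free_percent(self) -> float:
        # Treat a datastore that reports no capacity as having no free space.
        if self.capacity_bytes <= 0:
            return 0.0
        return 100.0 * self.free_bytes / self.capacity_bytes


def low_space_datastores(datastores, min_free_percent=20.0):
    """Return the datastores whose free space is below the threshold."""
    return [ds for ds in datastores if ds.free_percent < min_free_percent]


def format_alert(ds):
    """Build a one-line warning suitable for an email or pager message."""
    free_gib = ds.free_bytes / GIB
    return (f"WARNING: datastore {ds.name} has "
            f"{ds.free_percent:.1f}% free ({free_gib:.1f} GiB)")


if __name__ == "__main__":
    # Sample figures only; real values would be polled from the ESX host.
    sample = [
        DatastoreUsage("datastore1", 500 * GIB, 25 * GIB),
        DatastoreUsage("datastore2", 1000 * GIB, 400 * GIB),
    ]
    for ds in low_space_datastores(sample):
        print(format_alert(ds))
```

Run from cron or a scheduled task every few minutes, something like this should give a warning well before leftover snapshots fill the datastore, assuming the threshold leaves enough headroom for a normal backup window.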