I’ve been a little out of touch with this blog in the last month or so. Ever since Thanksgiving things have been crazy, especially at work with the busy season.
Over the last year we have made some great efforts to dramatically increase our stability as well as availability by increasing redundancy to remove single points of failure. This was on many levels including the networking layer by introducing an HA firewall pair, and an HA load balancer pair. We also built out our server infrastructure by implementing 3 web servers for the load balancing, as well as clustering our database hardware and our application server hardware. All of this was intended to be able to easily handle the load of the retail busy season, between Thanksgiving and New Year’s weekend. To be able to really know how much we could handle we wanted to load test the infrastructure top to bottom.
[UPDATE 3-6-2015:] Check out my newly posted Cacti Virtual Appliance. It is much easier to use than MRTG, and has a pre-loaded host template for F5 BIG-IP Load Balancers!
After getting MRTG set up and running in my MRTG Virtual Appliance, as I call it, I started setting up all my networking devices for monitoring. One of the devices I really wanted to poll some more advanced data from is our Load Balancer. What I really wanted to see was the number of concurrent connections to the LB and, if possible, to each of the Virtual Servers. This proved to be much more complicated than I had anticipated.
My first problem was that my SNMP software in Ubuntu was not configured correctly. By default, SNMPd was looking in /usr/share/snmp/mib to load the MIB files. In the version of Ubuntu that I had, the path was /usr/share/snmp/mib2c-data, so I had to update the snmp.conf file. Once I did that, SNMP was able to load all the add-on MIBs and resolve the OID definitions correctly.
My second problem was that the MIB file I had gotten from the web was incorrect, or more to the point, outdated. The search that I did for F5 MIBs returned many hits, but the one I relied on for most of my starting information was a nice post at vegan.net. Unfortunately, I didn't realize that it was really out of date. As a result, the LOAD-BAL-SYSTEM-MIB.txt is invalid for the software version that my F5 is running, and the OID lookups fail.
My F5 is hosted, so I don't have direct access to the device. I was able to get the hosting company to grab the MIB files from the filesystem of the F5, and then I put them into my /usr/share/snmp/mib2c-data directory. After that, my MRTG graphs for Virtual Server connections started working. One mistake that I made in this process was putting only some of the MIB files on my machine. Do yourself a favor and just get ALL the MIBs and load them into your SNMP MIBs folder.
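For anyone following along, here is a minimal sketch of what that snmp.conf change could look like, written as a small Python helper that renders the lines. It assumes the Net-SNMP `mibdirs` and `mibs` directives; the function name `snmp_conf_fragment` is made up for illustration, and the directory is the one mentioned above, so adjust it to wherever your MIBs actually live.

```python
def snmp_conf_fragment(mib_dirs, load_all=True):
    """Render snmp.conf lines pointing Net-SNMP at extra MIB directories.

    Hypothetical helper for illustration; only the rendered directives
    (mibdirs, mibs) are Net-SNMP's own.
    """
    # A leading '+' adds the directories to the default search list
    # instead of replacing it. Directories are colon-separated.
    lines = ["mibdirs +" + ":".join(mib_dirs)]
    if load_all:
        # Ask the tools to load every MIB module found in the search path.
        lines.append("mibs ALL")
    return "\n".join(lines) + "\n"


fragment = snmp_conf_fragment(["/usr/share/snmp/mib2c-data"])
print(fragment)
```

The printed lines can be pasted into snmp.conf; running `snmptranslate` on a name from one of the add-on MIBs is a quick way to check that they are being picked up.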
I found a nifty script here called Buils_mrtg_cfg_for_virtual_servers.pl.txt that was able to do an snmpwalk and get all the information about my Virtual Servers that I needed to get current connections and bandwidth on a per VS basis. From there MRTG was up and running with some pretty good stats about concurrent connection rate and bandwidth utilization across all my domains.
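To give a rough idea of what a per-VS MRTG target looks like, here is a small Python sketch. It is not the script mentioned above, just an illustration under some assumptions: the column OID is what I believe to be `ltmVirtualServStatClientCurConns` from F5-BIGIP-LOCAL-MIB (verify it with `snmptranslate` against the MIBs from your own device), the table is indexed by the Virtual Server name encoded as a length-prefixed string, and the host, community, and VS name in the usage lines are placeholders.

```python
# Assumed OID for ltmVirtualServStatClientCurConns (F5-BIGIP-LOCAL-MIB).
# Confirm against your own MIB files before relying on it.
CUR_CONNS_OID = "1.3.6.1.4.1.3375.2.2.10.2.3.1.12"


def encode_string_index(name):
    """Encode a name as an SNMP string index: its length, then one number per byte."""
    data = name.encode("ascii")
    return ".".join(str(n) for n in [len(data), *data])


def mrtg_stanza(vs_name, host, community, max_conns=100000):
    """Build an MRTG config stanza graphing current connections for one VS.

    Hypothetical helper; max_conns is an arbitrary ceiling for MaxBytes.
    """
    oid = f"{CUR_CONNS_OID}.{encode_string_index(vs_name)}"
    # MRTG target labels must not contain spaces; keep them simple.
    label = vs_name.strip("/").replace("/", "_").lower()
    return "\n".join([
        # MRTG expects two values per target, so the same OID is used twice.
        f"Target[{label}]: {oid}&{oid}:{community}@{host}",
        f"MaxBytes[{label}]: {max_conns}",
        # gauge: the value is a current reading, not an ever-growing counter.
        f"Options[{label}]: gauge, nopercent, growright",
        f"Title[{label}]: Current connections for {vs_name}",
        f"YLegend[{label}]: Connections",
    ]) + "\n"


# Placeholder values for illustration only.
stanza = mrtg_stanza("vs1", "lb.example.com", "public")
print(stanza)
```

Looping this over the names returned by an snmpwalk of the Virtual Server table would give one graph per VS.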
In our production environment, which we host at Rackspace, we have an F5 BIG-IP load balancer. This is an excellent product that has way more features than we can ever hope to need.
One problem with this setup is that in our development and staging environments we do not have load balancing, and this has caused some issues when moving to production. Some of the issues we’ve had stem from session persistence and which server the sessions are landing on. These can be hard to troubleshoot, and if you aren’t seeing them in the Dev or QA processes, you end up debugging in production, which is not good.
It became pretty evident that we needed to make our staging environment as much like production as we could, so I started to poke around for a virtual load balancer. After a bit of searching I found several that seemed to fit the need, but many of them were not free. With a bit more digging I found that the Zeus Traffic Manager product has a developer license that is free to use for non-production environments. This suits our needs very well, as this is just for staging testing.
I downloaded the VMware template and had the box up and running in no time. The initial web config is quick and easy, and the developer license is good for 1 year, after which I assume/hope I can still get another free one.
The web-based configuration is clean and easy to understand. I have never administered a Load Balancer myself, and even so I was able to get all of our staging sites up and running with their own pools, health checks, session persistence settings and everything else we have in production. Our QA team tells me that the speed is noticeably better, and it has already helped us uncover some issues that we have been fighting with on our production servers.
All in all, this has been a great addition, at the best price you can hope for.
Take a look and let me know your experience.