I've completely rethought the configuration of my environment. It's far from best practice at this point - but it's getting the job done. Ultimately my issue came down to the fact that best practice was obliterating my power bill. Running 4+ ESXi hosts on server-grade hardware is just too expensive to keep cool and operate in Sacramento, California.
I've moved to leveraging a larger “Core” host inside the house where I can keep it cool, plus a distributed configuration remotely (in the garage) running on my former server hardware. It's configured so I can either run a single host for basic testing or bring up the full 4-node remote cluster.
The long-term goal is to leverage the “Core” environment as my 24/7 platform, and bring up the secondary environment when I need to run DR tests or want to replicate data to a “secondary site”.
The new Core system I've built is based on the very popular 2x E5-2670 build that's been going around. I picked up the S2600CP2J motherboard combo from natex.us as well as some eBay components to build out the system. With 128GB of RAM, two E5s, 32 logical cores (with Hyper-Threading), and 8x hot-swap bays, it gives me plenty of room to work with for my internal server.
I've got the majority of my systems loaded at this point - a glimpse of most of them is below.
With all my components loaded up, I've started making some major progress towards building out what my ideal environment looks like for Hybrid Cloud/Automation testing. I'm still in the process of migrating systems over to this environment, so there are a few things missing still. It's getting there though!
PowerNSX - Special Call-Out
Are you using it? If you're not, you're doing it wrong. PowerNSX is a set of PowerCLI commands for working with the NSX platform, specifically over its REST API. NSX has an insanely mature API built into it, but if you're not from an API background it's really easy to get lost in the weeds. PowerNSX drastically simplifies this by giving you control over the REST API in a way anyone who's worked with Microsoft PowerShell will understand: create objects for items and pipe them around. It's cut from the same cloth as PowerCLI in that way. I built most of the new environment as code leveraging PowerNSX. Sadly, along the way, I lost the script I used to deploy it before I was able to commit it to my Git repo. Major developer fail on my part.
PowerNSX is incredible. Check it out here - https://powernsx.github.io/. It's incredibly easy to set up, and easy to jump into using.
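To appreciate what PowerNSX abstracts away, it helps to look at the raw REST call underneath. Below is a minimal Python (stdlib-only) sketch of the authenticated request you'd otherwise hand-roll - the same API that PyNSXv targets from Python. The NSX Manager address and credentials are hypothetical placeholders; `/api/2.0/vdn/virtualwires` is the NSX-v endpoint for listing logical switches.

```python
# Sketch of the raw NSX-v REST call that a PowerNSX cmdlet wraps.
# Hostname and credentials below are made-up lab placeholders.
import base64
import urllib.request

NSX_MANAGER = "nsxmanager.lab.local"   # hypothetical NSX Manager FQDN
USERNAME = "admin"
PASSWORD = "VMware1!"                  # placeholder credential

def build_request(path):
    """Build an authenticated GET request against the NSX Manager API."""
    url = f"https://{NSX_MANAGER}{path}"
    token = base64.b64encode(f"{USERNAME}:{PASSWORD}".encode()).decode()
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Accept", "application/xml")   # the NSX-v API speaks XML
    return req

# /api/2.0/vdn/virtualwires lists logical switches in NSX-v.
req = build_request("/api/2.0/vdn/virtualwires")
print(req.full_url)
print(req.get_header("Authorization"))
```

Sending the request (via `urllib.request.urlopen` against a real manager) returns XML you'd then have to parse yourself - which is exactly the plumbing PowerNSX turns into objects you can pipe.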
The primary individuals I've interacted with who develop PowerNSX, Nick Bradford and Anthony Burke, are an awesome pair of guys from VMware's Networking and Security Business Unit (NSBU). Nick Bradford (@nbradford) is the lead developer of PowerNSX and created the majority of its functionality. His code and examples can be found in pretty much anything having to do with PowerNSX. Anthony Burke (@pandom_) runs his own blog with a ton of GREAT information at https://networkinferno.net/. They both have a wealth of knowledge and have been awesome about answering random questions on Twitter since their VMworld session “PowerNSX and PyNSXv: Using PowerShell and Python for Automation and Management of VMware NSX for vSphere” hit the streets (an absolute MUST SEE). They've definitely created a platform that adds incredible value in the NSX space. Enabling users to manage their NSX environments as code is a huge step forward. It's fundamentally changed how I work with NSX every day in my job.
I'll be doing a specific blog post very soon about how I used PowerNSX exclusively to build my whole lab configuration. That will really drive home the point I started with: if you're using NSX and not using PowerNSX, you're doing it wrong.
Enterprise'ish NSX Design
My original NSX design was pretty basic. It was just meant to get NSX up and working within my homelab. It was cool and it got the job done, but it was nowhere near an accurate simulation of my enterprise environment. The former environment's diagram is below.
A couple of big things to note that are pretty bad about the configuration above:
- Automation Subnet as an interface hanging off of my Edge Services Gateway (ESG)
- Single DLR for all tenant workloads. Keeps things pretty limited and “flat”. Not horrible, but not something I can really test multiple DLRs against
- Oversized transit network (/24) for communication between the ESG and DLR. It's just not necessary
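The transit-sizing point is easy to make concrete with Python's stdlib `ipaddress` module. The subnets below are illustrative examples, not my actual lab addressing - the idea is simply that a transit link only needs addresses for the ESG uplink, the DLR uplink, and a few spares for future ECMP edges:

```python
# Comparing transit-network sizes with Python's stdlib ipaddress module.
# These subnets are made-up examples, not the lab's real addressing.
import ipaddress

old_transit = ipaddress.ip_network("172.16.1.0/24")   # oversized
new_transit = ipaddress.ip_network("172.16.1.0/28")   # right-sized

# Usable host addresses = total addresses minus network and broadcast.
print(old_transit.num_addresses - 2)   # /24 -> 254 usable hosts
print(new_transit.num_addresses - 2)   # /28 -> 14 usable hosts
```

Fourteen usable addresses is still far more than a two-router transit link needs, which is why shrinking from a /24 costs nothing in practice.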
So I made it a goal to remedy most of this. The updated diagram is below.
A couple of big items to highlight:
- Transit Zone switches implemented to segment communication for separate DLR environments
- Transit Zone subnets shrunk down. Still large enough to add additional interfaces for ECMP down the road, but a bit more efficient.
- Transitioned to BGP at the “top” instead of OSPF.
- Separate DLRs for each major “segment” (Automation, Tenant, Tenant DR)
- Deployed almost exclusively leveraging PowerNSX - managing my NSX environment as code is awesome!
I'll save the nitty-gritty details of NSX for a future blog post. There's plenty to talk about here already!
Infoblox - IPAM and DNS
I'm a big fan of Infoblox. It's a really easy-to-use platform with a great REST API, and it now has awesome out-of-the-box integrations with vRealize Automation. I configured the virtual appliance for my environment and integrated it with Microsoft DNS and Active Directory. The downside of the Infoblox virtual appliance is that it's time-bombed with a 90-day eval edition, after which you have to reset it and redo all your configuration. I'm working on a REST API web app that will re-push all my hosts back into Infoblox after the reset. Unfortunately I'm barely learning Python, so this one is going to take me a bit…
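A sketch of what that re-push app could look like, assuming the Infoblox WAPI's `record:host` object (a POST to `/wapi/<version>/record:host` creates a host record). The grid-master FQDN, WAPI version, and host list below are placeholders, and real use would also need credentials on the request:

```python
# Sketch of re-pushing host records into Infoblox after the eval reset.
# Grid-master address, WAPI version, and hosts are placeholder values.
import json
import urllib.request

GRID_MASTER = "infoblox.lab.local"   # hypothetical appliance FQDN
WAPI_VERSION = "v2.5"                # adjust to the appliance's WAPI version

# Hosts to restore, e.g. exported before the 90-day reset.
HOSTS = {
    "vcenter.lab.local": "192.168.10.10",
    "nsxmanager.lab.local": "192.168.10.11",
}

def host_record_request(name, ipv4):
    """Build the WAPI POST that would re-create one host record."""
    url = f"https://{GRID_MASTER}/wapi/{WAPI_VERSION}/record:host"
    body = json.dumps({"name": name, "ipv4addrs": [{"ipv4addr": ipv4}]})
    req = urllib.request.Request(url, data=body.encode(), method="POST")
    req.add_header("Content-Type", "application/json")
    return req

requests_to_send = [host_record_request(n, ip) for n, ip in HOSTS.items()]
# urllib.request.urlopen(req) with basic-auth credentials would actually
# send these; here we just show what would be pushed.
for req in requests_to_send:
    print(req.get_method(), req.full_url, req.data.decode())
```

Driving this from an exported host list means one script run puts the environment's DNS/IPAM state back after every reset, instead of reconfiguring by hand.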
I work with vRealize Automation every day, building machine-deployment workflows, various IT service workflows, application deployments, and whatever else I can figure out to do. vRealize Automation 7.1 includes out-of-the-box integration with Infoblox. The Infoblox plugin is pretty great, but it used to be a lot of work to set up. Once it's set up it's pretty bulletproof, and the fact that it's now available out of the box makes it pretty awesome.
Other Notable Mentions
- I've migrated all my storage over onto this new server using Xpenology, the community-driven version of the Synology platform. Awesome package support, SSD cache, and all the “cool” features of a Synology storage device - for free - is a WIN.
- I've upgraded my internal Microsoft Certificate Authority to SHA-2. Yeah, I'm way late on this.
So What's Next?
Secondary Site
I need to reconfigure my secondary site to be the same type of configuration that I have within my core. This site is ultimately going to be my disaster recovery destination site. I'll need to retune the NSX configurations and storage at a minimum.
vSphere Data Protection
So I'm way late to the party on this, but holy crap - VDP is based on EMC's Avamar product?! Pretty slick. I'll be setting this up as a backup platform for my environment, and then…
EMC DataDomain VE
EMC's Data Domain Virtual Edition is free for homelab (non-production) use. I'd love to get this set up, and then set up DD Boost on my databases to back up the platforms.
RP4VM (RecoverPoint for VM)
Again, free for homelab (non-production) use. I'll be using this as an alternative to SRM. Being able to configure a number of application- and environment-based consistency groups and fail them over between my clusters would be pretty slick. Complete overkill for a small homelab, but homelabs are all about excess :)