April 08, 2015

Running vCenter 5.x with SQL 2012 AlwaysOn Availability Groups

After proudly starting the listener on our shiny new SQL 2012 AlwaysOn cluster, I was very eager to get vCenter moved off the brave little single-point-of-failure that is our current SQL server (a 2008 VM sitting in the virtual environment itself). I had done some research ahead of time and thought that AlwaysOn was at least sort-of supported by VMWare for protecting vCenter workloads. However, in my haste to play with a fancy new toy, I must have missed the plethora of blog posts indicating that either a) It's not actually supported at all, or b) Only Failover Clustering (shared storage) - not Availability Groups (non-shared storage) - are supported. And if you are about to do what many have done on the forums and suggest KB1037959 as evidence that they ARE supported, think again - that article is referencing support for running various clustered workloads on vSphere, not running your vCenter DB on clustered systems. Outside of a vague mention of AlwaysOn as a possible third party clustering solution to replace vCenter Heartbeat (e.g. "Best effort support"), I haven't been able to find anything official one way or the other.

But the AlwaysOn cluster was ready to go and if no one is going to tell me explicitly that I can't do it - well, that's basically an open invitation.

Here's the setup I had to work with:
  • ServerA - Win2012R2 - vCenter Server 5.1 U3 with SSO and Update Manager
  • ServerB - Win2008R2 - SQL 2008 Enterprise
  • DB1 - Win2012R2 - SQL 2012 (Primary)
  • DB2 - Win2012R2 - SQL 2012 (Secondary)
Some basic prerequisites to have in place:
The steps I followed are relatively straightforward if you are comfortable working in SQL Management Studio. Most of these are detailed in KB7960893, but I found a few extra steps were necessary. Obviously your mileage may vary - follow along at your environment's own risk:

Back Up Databases

  1. Stop all VMWare services on ServerA including SSO if used.
  2. Perform a backup of all VMWare databases on ServerB using SQL Management Studio. There are usually three - RSA (Single Sign On), upmgr (Update Manager), and VCDB (vCenter).

Restore Databases and Logins

The database restore is pretty straightforward, except that we will need to recreate the logins that will be used for vCenter ODBC connections and SSO and attach them to their orphaned users in the restored database.
  1. Restore the databases to DB1 - your primary SQL AG member.
  2. Referencing ServerB, recreate the SQL login for your vCenter user. Map the user to the default msdb database with db_owner permissions.
  3. On the vCenter database, associate the orphaned vCenter user with the login you just created using the command:  ALTER USER username WITH LOGIN = username 
    Replace 'username' with your vCenter DB username.
  4. Referencing ServerB, recereate the SQL login for the SSO users (rsa_user and rsa_admin or rsa_dba). Do not map the user to any database.
  5. On the SSO database (RSA), associate the orphaned users with the logins you just created using the command from step 5.
  6. Repeat the steps to create the users on the secondary SQL server, but don't bother trying to associate the orphaned accounts - secondary servers are read-only.

Add Availability Database

Next, add the three databases as Availability Databases. They will need to be in full recovery mode - if you have to switch them over, you'll also need to make a full backup before adding them to your Availability Group.

Restore the SQL Agent Rollup Jobs

I'll just reference KB1004382 here, which gives a good step-by-step walk through of how to get this done. There are some sql query files located on the vCenter server that create these jobs which perform rollups and clean the database. These queries are normally run automatically when vCenter is installed, but since we are just moving the DB, we will have to recreate several jobs manually. The important thing to note is the first step which explicitly states that you must be connected as the vCenter user in SQL Management Studio when running the queries. This way the jobs are created with the vCenter user as the owner. You'll also need to have the SQL Agent service running for these to do any good. Don't repeat this step on the secondary SQL server (see my final note below).

Update the ODBC Connections & vcdb.properties File

Back on ServerA, remove and recreate your ODBC connections using the new Availability Group listener information. Make it easy on yourself and use the same name for the ODBC connection. Be sure to create the new connections using the right version of SQL Native Client for SQL 2012 (11.0).

Next, modify the vcdb.properties file located at C:\ProgramData\VMware\VMware VirtualCenter for Server 2008/2012.

Update SSO Settings

Assuming the host name and/or port changed for SSO also (the RSA database), you'll need to follow KB2033516 to update the settings.

Completion

Start your services back up (starting with SSO) and you should be all set! 

One final note: The orphaned accounts on the secondary server and lack of any rollup jobs there mean that achieving an unattended failover may be unlikely. I'm not familiar enough with AGs yet to be able to say whether the jobs can be safely created on the secondary and allowed to sit and accomplish nothing while it is inactive, or if the orphaned accounts will actually cause any issues, but I'm erring on the side of caution and assuming that a little manual intervention may be necessary.

Post in the comments if you have a suggestion/correction!

UPDATE: Our Veeam backups started to fail after this change. Turns out Veeam can't backup the vCenter database and will cause VSS errors (VSSControl: Failed to freeze guest, wait timeout). Typically Veeam detects the DB automatically and doesn't include it in the backup, but this wasn't working for us. You'll need to create a manual DB exclusion to get the backups working again. Instructions can be found here: http://www.veeam.com/kb1051 

No comments:

Post a Comment