Clustering is just plain fun. Being able to walk around with a clustered system
on my laptop (and trusty, portable, external USB HD sidekick) is a total hoot.
Getting an entire cluster working on VMWare 5.0 can be a bit of a bear though,
because there's a real paucity of documentation out there that explains how to
do it. The biggest challenge, of course, is how to simulate shared drive
resources usable by the cluster (i.e. how do you simulate something like fiber
attached to HBAs, or a SAN?).
Happily, VMWare does support simulating shared IO/drive resources, though
you'll need access to some powerful voodoo.
Build instructions, including the voodoo, are listed below:
1) Build a Base Machine. Build a base machine which you can
use as an image for your domain controller and two (or more) Clustered SQL
Servers. (Yeah, I went with the option of making the DC a non-clustered
resource. If you want to, you could just dcpromo both of your nodes and try it
that way....) To do this:
- Just create a brand new VMWare virtual machine - specifying Windows Server
2003 Enterprise Edition (or better) as the guest OS.
- Give it about 4 GB of non-fixed disk space.
- Some RAM,
- SCSI drives
- and a bunch of clicks on the Next button.
- Then ensure that the Windows 2003 Enterprise Server Installation Media is
either captured as an .ISO for your VM, or slap the disk into your CD/DVD-ROM
drive.
- Start the VM, and install Windows 2003 Enterprise.
- During installation, give the box a bland name, something like WIN2K3BASE,
or whatever. (This box won't really be a part of your cluster - you're just
paving it to use as the 'base' for your other boxes.)
- Once the install is complete, patch the box.
- Then drop the new SysPrep tool on it from Microsoft: http://support.microsoft.com/?kbid=838080
- Extract the files, and then just run sysprep.exe. (If you'd rather do this
from the command line, there's a sketch at the end of this step.)
- It will warn you that it can mess up boxes - click yes, you're TRYING to
mess up your box ;)
- Select Reseal, and click OK. In case you
don't know what you're doing, and somehow just managed to put SysPrep on your
workstation because you thought the NAME was cool, Windows warns you that you
are about to.... PREP your machine. Click Yes (you're mature
enough).
- The VM powers down.
- Once it powers down, from the VMWare menu, select the VM's
Settings. Click into the Options Tab,
and select Advanced. Then put a check mark in the
Enable Template Mode. OK your way out of
there.
- From the VM menu, take a new snapshot. Give it a happy, cuddly name, like
"momma," cuz this is what you'll run home to if everything ends up barfing on
you later on.
- Now, from the VM menu, select Clone. Make sure that you
create a full clone, and not just a linked or forked version of the server
(you want a FULL copy/clone).
- Repeat the cloning process until you have 3 'boxes': a Domain Controller,
SQLServerNode1, and SQLServerNode2. (I prefer Simpsons characters for server
names, and Marge's sisters work out perfectly; Patty and Selma are names
that are uniquely suited to clustering <g>).
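For the command-line inclined, here's a minimal sketch of the extract-and-reseal
step. The paths and the deploy.cab file name are assumptions - adjust to match
whatever the KB838080 download actually gave you on the base VM:
REM Extract the updated deployment tools to C:\Sysprep (the cab name is an
REM assumption -- use whatever file the KB838080 download contains).
md C:\Sysprep
expand deploy.cab -F:* C:\Sysprep
cd /d C:\Sysprep
REM -reseal does from the command line what the Reseal button does in the GUI.
sysprep.exe -reseal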
2) Build a Domain Controller.
- Fire up the box that will be your DC, run through the happy, abbreviated
setup wizard and give your box a network name you can live with -- Quimby is
pretty authoritative for a DC.
- Once it reboots, give it a static IP address. I prefer to just bind my
boxes to the VMnet2 network, and then do something imaginative like setting the
DC to 192.168.235.1, with 255.255.255.0 as the subnet mask. Add a gateway if
you've got one... etc. (Though, if you're like me, you don't care much about
your clustered SQL Server 'domain' being able to chat with external boxes.)
There's a netsh sketch of this at the end of this step if you'd rather skip the
GUI.
- Then, either run dcpromo on it, or use the Add Server Role doohickey from
Manage Your Server.
- Just 'Next' your way through the wizard, stopping only to
tell it to take care of the DNS (unless you're a glutton for
punishment).
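If clicking through TCP/IP properties dialogs isn't your thing, here's a minimal
netsh sketch of the DC's IP setup. The connection name ("Local Area Connection")
is an assumption - swap in yours:
REM Give the DC a static IP (no gateway, since this network doesn't route out).
netsh interface ip set address "Local Area Connection" static 192.168.235.1 255.255.255.0
REM Point DNS at itself -- the DC will host DNS once dcpromo is done.
netsh interface ip set dns "Local Area Connection" static 192.168.235.1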
3) Cluster and SQL Server Service Accounts. Create a user
account to use for the cluster and for SQL Server
- While you are still on the DC, go to Start | Administrative Tools
| Active Directory Users and Computers and create a new user in your
domain for the cluster account.
- You can also create another user for the SQL Server account (which is a
good idea in the real world).
- You'll want to make sure that both of these users can't change their own
passwords, that the passwords never expire, and OF COURSE, uncheck that
dumb User must change password at next logon.
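If you'd rather script the accounts, here's a minimal dsadd sketch. The domain
(simpsons.local) and the account names are assumptions - use your own:
REM Create the cluster service account (password is prompted for with *).
dsadd user "CN=ClusterSvc,CN=Users,DC=simpsons,DC=local" -pwd * -canchpwd no -pwdneverexpires yes -mustchpwd no
REM And a separate account for the SQL Server services.
dsadd user "CN=SQLSvc,CN=Users,DC=simpsons,DC=local" -pwd * -canchpwd no -pwdneverexpires yes -mustchpwd no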
4) Build Out your SQL Server Nodes
- For each of your SQL Server Nodes, you'll want to add a new NIC from the
VM Hardware manager (just select Settings on the VM, and then
use the Add button to launch the Add Hardware wizard).
- Bind the newly added NICs to a custom VMWare Network (I keep all of my
VNICs bound to the VMnet2 network).
- Boot server1. Once it's booted, navigate to the network connections
section of the Control Panel. Right-click and rename your Local
Area Connections to External and Heartbeat.
- Change the IP addy for External to something fixed on your network (like
192.168.235.11, or .12 for server2) with the appropriate subnet mask
(255.255.255.0). Then specify the fixed IP of your
Domain Controller as the Preferred DNS server
(i.e. 192.168.235.1). (There's a netsh sketch of both NICs at the end of this
step.)
- Now change the IP address for your HeartBeat NIC.
Set it to something like 10.1.1.1 (or 10.1.1.2 for
Server2). Leave the Gateway empty. Leave DNS blank. Click the
Advanced button, and from the WINS tab,
select Disable NetBIOS over TCP/IP.
- Do the same for Server2 (using different IP addresses, of course).
- Add both servers to your domain (right-click My Computer |
Properties, and then from the
Computer Name tab click
Change. When prompted for credentials to add the box to the
domain, supply the ones you created when you built your Domain Controller in
the previous steps.)
- Joining the domain switched your External NICs to DHCP. Jump back in
there, and slap them back to the way you had them before. If you get grumped
at by Windows (telling you that there may be a collision), just ignore it (I
think you have to click No to make the nag screen go away).
- Check out your IP addresses by running ipconfig /all from the command line
-- just to make sure. Also make sure that you can see that the heartbeat NIC
has NetBIOS turned off.
- Power off both boxes once they've had their NICs configured and have been
added to the domain.
- Now would be a really good time to snapshot both boxes in case you do
something dumb down the road.
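As promised, here's a minimal netsh sketch of the NIC setup for node 1 (flip the
addresses for node 2). The connection names assume you renamed them as above,
and the rename syntax and the 255.0.0.0 heartbeat mask are assumptions worth
double-checking on your build:
REM Optionally rename the connections from the command line instead of the GUI.
netsh interface set interface name="Local Area Connection" newname="External"
netsh interface set interface name="Local Area Connection 2" newname="Heartbeat"
REM External: fixed IP, DNS pointed at the DC.
netsh interface ip set address "External" static 192.168.235.11 255.255.255.0
netsh interface ip set dns "External" static 192.168.235.1
REM Heartbeat: private address, no gateway, no DNS. (NetBIOS still has to be
REM disabled from the WINS tab -- netsh doesn't reach that setting here.)
netsh interface ip set address "Heartbeat" static 10.1.1.1 255.0.0.0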
5) Configure Shared Disk Resources. Now comes the juicy
part: spoofing your virtual machines into believing that they have
access to shared disk resources. In my quest for clustering I found two
resources on this; one was hideously outdated but contained enough theory to
help me out, and the other
was for version 4.5x and was missing one critical piece of data that is
apparently now needed in VMWare Workstation 5.0. The key concepts here are that
you just need to create SCSI controllers on both machines, and then provide
directives that tell VMWare NOT to lock the disks when they are connected. This
lets machines share disks, as long as all of the SCSI connection info is
configured correctly. From there you just need to, obviously, make sure that the
disks aren't busy trying to dynamically allocate size; i.e. the disks have to be
fixed-size, or each VM will see a different size, state, etc.
The first thing you need to do is create some drives that you'll hook up to
your machines. The best way to do this is to just create them with the wizard by
'adding' them to one of your machines, and then immediately removing them. Think
of it as a virtual-hard-drive-egg-laying-chicken (or just think of it as a way
to make virtual hard drives, if that's easier). To proceed:
- Open up one of your SQL Servers and Select VM | Settings
from the menu. To add the drives just click Add and use the
Wizard.
- The first drive will be your Quorum drive, and just needs
to be a few hundred MB (200 MB - or 0.2 GB as the wizard wants it - will work fine).
- The wizard steps are as follows: Create a new virtual disk. Next.
SCSI. Next. Disk Size = 0.2 GB (or whatever size you're after). Then ensure that
Allocate all disk space now is checked. Click
Next. Browse out and drop the disks in a directory called
SharedDisks (or something). And click the Advanced button.
Make sure that Independent is checked. Then click
Finish.
- Make sure you create two disks (your Quorum drive, and then a Resource
drive (or more)).
- Then select each drive, and Remove it in the Hardware
management thingy. We just needed to MAKE hard drives; we don't want to attach
them just now. (You'll add them by hand to the machines in a second.)
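Incidentally, if the egg-laying-chicken routine feels silly, Workstation also
ships with a command-line tool, vmware-vdiskmanager.exe, that can lay the eggs
directly. A minimal sketch - the -t 2 (preallocated) disk type, the sizes, and
the path are assumptions worth verifying against your version's help output:
REM Create preallocated (-t 2), lsilogic-adapter disks right in the shared directory.
cd /d "M:\VirtualMachines\SharedDisks"
vmware-vdiskmanager.exe -c -s 200MB -a lsilogic -t 2 Quorum.vmdk
vmware-vdiskmanager.exe -c -s 2GB -a lsilogic -t 2 Resource.vmdk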
6) Attaching Shared Disks. Add your virtual shared drives to
the boxes by hand. Now that the drives are sized and created, it's time to head
to the virtual server rack and hook up some virtual SCSI controllers.
- Navigate to the directories where your Virtual Machines are kept, and for
Server 1 open the .vmx file in Notepad.
- First add some instructions for disk control, and to make sure that the VM
won't attempt to lock the drives it connects to:
# Shared Disk Config Info:
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
disk.locking = "FALSE"
- Then add a new SCSI controller:
scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "virtual"
- Once that's done, add your Quorum drive, making sure to
specify that the drive itself uses the lsilogic bus (this was the big missing
component between Karl's
post for 4.5x and 5.0; I found it out by trial and error):
scsi1:1.present = "TRUE"
scsi1:1.fileName = "\Quorum.vmdk"
scsi1:1.redo = ""
scsi1:1.mode = "independent-persistent"
scsi1:1.deviceType = "disk"
scsi1:1.virtualDev = "lsilogic"
- Once the first controller and drive are added, just add the second disk on
the same controller (making sure to change your paths, etc.):
scsi1:2.present = "TRUE"
scsi1:2.fileName = "\Resource.vmdk"
scsi1:2.virtualDev = "lsilogic"
scsi1:2.redo = ""
scsi1:2.mode = "independent-persistent"
scsi1:2.deviceType = "disk"
- The entire 'snippet' to copy/paste is here:
# Shared Disk Config Info:
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
disk.locking = "FALSE"
scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "virtual"
scsi1:1.present = "TRUE"
scsi1:1.fileName = "\Quorum.vmdk"
scsi1:1.redo = ""
scsi1:1.mode = "independent-persistent"
scsi1:1.deviceType = "disk"
scsi1:1.virtualDev = "lsilogic"
scsi1:2.present = "TRUE"
scsi1:2.fileName = "\Resource.vmdk"
scsi1:2.virtualDev = "lsilogic"
scsi1:2.redo = ""
scsi1:2.mode = "independent-persistent"
scsi1:2.deviceType = "disk"
- Make sure, of course, that you specify the full path to your shared drives
directory in each fileName entry (i.e. M:\VirtualMachines\SharedDisks\Quorum.vmdk).
And now you've got shared drive resources! If you're a geek like me, you're
pumped.
7) Ensure Drive connectivity from both Servers.
- Power down BOTH of your server nodes.
- Boot up Server 1.
- Right click My Computer | Manage. Go to the Disk
Management node.
- You should see your two new drives available. (And the Initialize and
Convert Disk wizard will probably start up.)
- Make sure both disks are Basic disks (Dynamic disks ==
bad). Quick Format them and bind them to drive letters (I like
Q and R (quorum and resource) myself).
- Test that you can access the drives from within Windows. (Try creating a
.txt file and filling it with gibberish - then move it to the
other drive, etc. There's a quick command-line version of this test after this
list.)
- Once everything is peachy, shut the box down.
- Then boot Server 2. From the Disk Management node in the
Management Console, you should see both disks. Just 'change'
their drive letters (you'll have to add them AFTER clicking Change...).
Give them the same drive letters that you gave them for Server 1.
- Test read/write functionality.
- Shut down the Server.
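Here's the promised command-line version of the read/write test - nothing fancy,
just proof that both drives take writes and moves (the file name is obviously
arbitrary):
REM Scribble on the Quorum drive, then shuffle the file over to the Resource drive.
echo gibberish > Q:\test.txt
move Q:\test.txt R:\test.txt
type R:\test.txt
del R:\test.txt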
8) Time to start Clustering. It's also probably a good time
to take a snapshot about now (with both boxes shut down).
- Fire up Server 1. Once it finishes booting, go to Start |
Administrative Tools | Cluster Administrator.
- Select create a new cluster from the dropdown.
- Walk through the wizard which scans your configuration, makes sure that
shareable resources exist, etc.
- Specify a name for the virtual/clustered server, as well as an IP.
192.168.235.3 works well, and Server3 works well as a name if
you are going the bland route. (Note that the cluster itself ends up being
represented as a virtual machine on the network, with an IP address, a name,
and 'resources' at its disposal. If you put SQL Server (or Exchange, for that
matter) on the cluster, then that 'server' will have its own name and IP
address (and resources) in addition to the name and IP of the cluster. In this
manner, a clustered SQL Server with 2 nodes consumes 6 IP addresses and 4
network names: 2 HeartBeat addresses (on a private network), 2 IP addresses for
the servers upon which the cluster is built (along with their 2 network names),
plus 1 name and IP for the cluster, and 1 name and IP for the clustered SQL
Server (or virtual server, as it is called).)
- When it comes time to specify an account for the cluster service to run
under, use the account you created up in step 3.
- At the final, summary, page of the wizard, there's a
Quorum button. It should be its own screen, but it's just a
button squirreled away on the very last page. Pay attention for
it. Click it, and make sure that the wizard is pointing at your
Quorum drive for the quorum resource (it's
NOT smart enough to figure that out on its own).
- Once the wizard is complete, the cluster service installs.
- Once that's complete, bring Server 2 online. You can then either open the
Cluster Administrator on Server 2 and add it to the existing
cluster, or return to Server 1 (now also answering as Server3 - the active node
in the cluster) and add a new node (Server 2) to the cluster. The wizard is
pretty similar, only there aren't as many steps.
Once the second wizard is done, you've successfully created a cluster. Woot!
If you're a true geek, you'll fail it over a few times just to see it in action
(go to the Groups node in Cluster Admin, right-click the Cluster Group, then
select Move Group (i.e. move the resource group to the other node)). Note too,
that as you do this, the Group 0 resource (your remaining shared HD) stays with
the original node (though you can move it too).
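You can also do the failover dance from the command line with cluster.exe, which
ships with the OS. A minimal sketch - the group names here are the wizard's
defaults, so double-check yours first:
REM List the groups and which node currently owns them.
cluster group
REM Bounce the Cluster Group to the other node (and the disk group too, if you like).
cluster group "Cluster Group" /move
cluster group "Group 0" /move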
9) SQL Pre-Configuration
- You'll want to install MSDTC on both servers in the
cluster. Move the Cluster Resources to Server1. (Fail the
resources back to Server1 if you need to.)
- Start | Control Panel | Add/Remove Programs | Add/Remove Windows
Components.
- Select Application Server, and then click
Details. Then select Enable network DTC
access and click OK a few times (boy, is this tons easier than
configuring it on Windows 2000) to load in
MSDTC.
- Once Server1 is configured, fail over to Server 2, and do the same.
10) Install SQL Server
- You'll want - no, you'll NEED - to copy the contents of
your SQL Server 2000 Enterprise installation CD onto the local hard drive for
Server 1. (Otherwise you'll most likely encounter a read error during the
middle of installation. Recovery at this point is just UGLY.)
- Make sure cluster resources are on Server1. (Otherwise your
installation won't get far...)
- Change the name of the Group 0 resource to SQL
Server Resource.
- Run the autorun.exe from your VM's hard drive. (i.e.
start the 'CD' up from your Hard Drive).
- From the splash screen select SQL Server 2000 Components,
and then pick Install Database Server.
- Walk your way through the wizard. You'll be creating a Virtual SQL
Server, and you'll need to provide a network name for it. SQLServer
is a good name if you're going the bland route; otherwise it has to be either
Marge or Barney <g>. You'll also need to specify an IP address.
When you do so, make sure you are binding the IP address to the
External NIC, not to the heartbeat NIC.
- Finish up the installation as needed (putting your database files on the R
Drive).
When installation completes, you'll have a watered-down, not-terribly-secured,
BUT clustered SQL Server instance in a completely virtual environment. (You may
then want to set dependencies and other things on cluster resources in the SQL
Server Resource group... check out the Patterns
and Practices Library for details - though remember that info is old and
written for Windows 2000, even if it's not that different.)
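To sanity-check the install from the command line, osql (which the SQL Server
install drops on the box) can confirm that the instance knows it's clustered.
The virtual server name SQLServer below is whatever you picked above:
REM Connect to the VIRTUAL server name (not a node name) with Windows auth.
osql -E -S SQLServer -Q "SELECT SERVERPROPERTY('IsClustered'), SERVERPROPERTY('ServerName')"
REM IsClustered should come back 1. Fail the group over and run it again --
REM the connection should still succeed against the same virtual name.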