SRM Tutorial Part 9: Advanced configuration & troubleshooting

This is the last part of my SRM tutorial. In this part I will show you some advanced configuration options that go beyond the basic setup & configuration of the previous parts. I will also go through some common errors in the troubleshooting section at the end of this post.


SRM Tutorial Part 1: Lab setup
SRM Tutorial Part 2: Components & design
SRM Tutorial Part 3: SRM installation
SRM Tutorial Part 4: NetApp Ontap Simulator – Setup & configuration
SRM Tutorial Part 5: Configure NetApp SnapMirror
SRM Tutorial Part 6: SRA installation
SRM Tutorial Part 7: Configuration #1
SRM Tutorial Part 8: Configuration #2
SRM Tutorial Part 9: Advanced configuration & troubleshooting

The most common modification of a recovery plan is to assign priorities to individual VMs. By default all VMs are set to priority level 3, which means they are powered on at the same time. Usually you have some VMs you want to power on before all the others, like AD, DNS, firewall etc. You just put these VMs into one of the higher priority groups. SRM powers on these VMs first and moves on to the next priority group once all of them have come up (VMware Tools are used as the indicator) or a timeout value is reached (more about this later).
Another use case for priorities is to avoid boot storms, which you should definitely consider in large environments. The total time your recovery plan needs to finish can be shorter if you split your VMs into multiple priority groups instead of powering on hundreds of VMs at the same time.

01_srm_vm_priorities
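
To make the ordering a bit more tangible, here is a minimal sketch in Python. It is not an SRM API and SRM does not run anything like it; it only models the idea that VMs are grouped by priority level and the groups are processed one after another, highest priority first. The function name and the VM names are made up for this example.

```python
from collections import defaultdict


def power_on_order(vm_priorities):
    """Return one batch of VM names per priority group, highest priority first.

    vm_priorities maps a VM name to its priority level (1 = highest).
    """
    groups = defaultdict(list)
    for vm, priority in vm_priorities.items():
        groups[priority].append(vm)
    # A lower number means a higher priority, so walk the levels in ascending order.
    return [sorted(groups[level]) for level in sorted(groups)]


# Hypothetical inventory: infrastructure VMs first, the rest stays on the default level 3.
vms = {"dc01": 1, "dns01": 1, "fw01": 2, "app01": 3, "app02": 3}
for batch in power_on_order(vms):
    print(batch)
```

Each printed list stands for one priority group; all VMs in a list would be started together before the next list is touched.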

Maybe you also have some VMs which you don't want SRM to power on automatically. Set the Startup Action to “Do not power on” and power on the VM yourself when you think the time is right.

02_srm_vm_startup

There are a lot more options available for each individual VM. Right-click the desired VM and select Configure.

03_srm_individual_vm_config

A great feature is that you can assign a new network configuration to your VMs. Hopefully you have a stretched VLAN between your sites and do not need these options, because reassigning IPs also leads to other (not SRM-related) challenges like updating DNS etc. If you need to assign new addresses to all your VMs in a huge environment, you should definitely take a look at this VMware blog post and use a CSV file that you can import via CLI.

05_srm_ip_vm_config
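
If you have to prepare such a list for many VMs, a small script can save a lot of typing. The following Python sketch assumes that every VM keeps its host part and only the subnet changes between the sites. Please note that the column names are purely hypothetical and are NOT the layout the SRM import tool expects; check the VMware documentation for the real format. The subnets and VM names are made up as well.

```python
import csv
import io
import ipaddress


def remap_ip(address, source_net, target_net):
    """Move an address into target_net while keeping its offset inside the subnet."""
    src = ipaddress.ip_network(source_net)
    dst = ipaddress.ip_network(target_net)
    ip = ipaddress.ip_address(address)
    if ip not in src:
        raise ValueError(f"{address} is not in {source_net}")
    if src.prefixlen != dst.prefixlen:
        raise ValueError("source and target networks must have the same size")
    offset = int(ip) - int(src.network_address)
    return str(dst.network_address + offset)


def build_csv(vms, source_net, target_net):
    """Return CSV text with one row per VM (hypothetical column layout)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["vm_name", "protected_ip", "recovery_ip"])  # hypothetical columns
    for name, ip in vms:
        writer.writerow([name, ip, remap_ip(ip, source_net, target_net)])
    return out.getvalue()


# Made-up example: protected site uses 192.168.10.0/24, recovery site uses 10.20.30.0/24.
print(build_csv([("app01", "192.168.10.25")], "192.168.10.0/24", "10.20.30.0/24"))
```

The check for equal prefix lengths is there on purpose: if the recovery subnet is smaller than the protected one, keeping the offset could produce addresses outside of the target network.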

Above I showed you how to put your VMs into priority groups to get a structured boot order. Additionally you can configure dependencies between VMs that are in the same priority group. This is really useful for large environments where the five priority groups are not sufficient.

06_srm_vm_dependencies
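
Dependencies inside one priority group boil down to a topological ordering: a VM is only started after everything it depends on is up. The Python sketch below uses graphlib from the standard library (Python 3.9 or newer) to show the principle. It is only an illustration of the concept, not how SRM implements it, and the VM names are hypothetical.

```python
from graphlib import TopologicalSorter


def order_within_group(dependencies):
    """Return a start order that respects the dependencies.

    dependencies maps a VM name to the set of VMs that must be up before it.
    A circular dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical three-tier application in the same priority group:
# the web server needs the app server, the app server needs the database.
deps = {"db01": set(), "app01": {"db01"}, "web01": {"app01"}}
print(order_within_group(deps))
```

For the chain in this example there is only one valid order, so the database comes first and the web server last.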

By default SRM waits for the VMware Tools heartbeat of each VM as the sign of a successful power on. Do you have VMs in your environment without VMware Tools running? If that's the case, you should uncheck the Wait for VMware Tools box. Optionally you can set a delay value; otherwise the power on is considered successful immediately.
You should also check the power off options if you don't have VMware Tools running on a VM. By default SRM tries to initiate a clean guest OS shutdown. Without VMware Tools you need to set the shutdown action to power off.

07_srm_vm_vmwaretools

Another nice option is to suspend VMs that are running at the secondary site in order to free up resources. This lets you use the resources at your DR site, e.g. for a test environment, without them being blocked in case of a failover with SRM. On most steps in the recovery plan you will find the option “Add Non-Critical VM” to do this.

08_srm_vm_suspend

SRM comes with a lot of alarms that you can configure individually. I strongly recommend taking a look at them.

09_srm_alarms

Troubleshooting:

Below I show you some errors which I have seen in SRM, along with some troubleshooting tips.

Communication error between SRA & storage array:
A common error is that the SRA is unable to communicate with the storage array. You will get errors like Unable to add array manager or SRA command ‘discoverArrays’ failed.

error1

Even if it sounds like advice from Microsoft 🙂 : Reboot your SRM server. Typically you haven't rebooted between the SRA installation and the SRM configuration. Especially with the EMC VNX adapter I have seen this error almost every time, and a reboot was always the solution.
I have also seen this error with the NetApp Simulator if the “options httpd.admin” settings haven't been set. Take a look at Part 4: NetApp Ontap Simulator – Setup & configuration for these settings.
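
Before you dig deeper, it can help to rule out a plain network problem between the SRM server and the array. The following Python sketch simply tries to open a TCP connection. It does not talk to the SRA or to SRM in any way, and the host name and port in the usage comment are assumptions you have to replace with the management address and port of your own array.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Usage, run on the SRM server (host name and port are hypothetical):
# print(can_connect("netapp01.lab.local", 443))
```

If this already returns False, the problem is most likely the network, a firewall or the management interface of the array, and not the SRA itself.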

Enabling the array pair is not possible:

I have seen the error SRA command ‘discoverDevices’ failed when the SRM server didn't have the correct permissions for the replicated NFS datastores.
error2






