Category Archive: clustering

May 17

Why is my pathing policy limited to “fixed” or “MRU” with things like MSCS cluster?

Yesterday I received an email from someone who wanted to know why he was limited to either the “fixed” or “MRU” pathing policy for the LUNs attached to his MSCS cluster. His environment used round-robin for everything, and not being able to configure all LUNs with the same policy went against their internal policy. The thing is that if round-robin were used and the path switched (by default every 1000 I/Os), the SCSI-2 reservation would need to be re-acquired on this LUN, as MSCS uses SCSI-2 reservations for its cluster devices. As you can imagine, that could cause a lot of stress on your array and could lead to all sorts of problems. So please do not ignore this recommendation! Some extra details can be found in the related KB articles.
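To get a feel for the numbers, here is a back-of-the-envelope sketch in Python. It assumes, as described above, that every round-robin path switch forces a reservation re-acquisition. The function name and the 20,000 IOPS workload are made up for illustration and are not measured values.

```python
def reservation_reacquisitions_per_second(iops: int, ios_per_path_switch: int = 1000) -> float:
    """Toy model: one SCSI-2 reservation re-acquisition per round-robin path switch."""
    if ios_per_path_switch <= 0:
        raise ValueError("ios_per_path_switch must be positive")
    return iops / ios_per_path_switch


# Hypothetical workload: a LUN doing 20,000 I/Os per second.
print(reservation_reacquisitions_per_second(20_000))  # 20.0
```

With the default of 1000 I/Os per path switch, even this modest hypothetical load would mean the reservation being re-acquired many times per second, which is the kind of stress on the array the recommendation is meant to avoid.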

“Why is my pathing policy limited to “fixed” or “MRU” with things like MSCS cluster?” originally appeared on Yellow-Bricks.com. Follow us on Twitter and Facebook.
Available now: vSphere 5 Clustering Deepdive. (paper | e-book)

Permanent link to this article: http://www.startswithv.com/2012/05/17/why-is-my-pathing-policy-limited-to-fixed-or-mru-with-things-like-mscs-cluster/

Apr 25

What is das.maskCleanShutdownEnabled about?

I had a question today about what the vSphere HA advanced setting das.maskCleanShutdownEnabled is for. I have described before why it was introduced for stretched clusters, but will give a short summary here:

Two advanced settings were introduced in vSphere 5.0 Update 1 to enable HA to fail over virtual machines located on datastores that are in a Permanent Device Loss (PDL) state. This is very specific to stretched cluster environments. The first setting is configured at the host level and is “disk.terminateVMOnPDLDefault”. It can be configured in /etc/vmware/settings and should be set to “True”. This setting ensures that a virtual machine is killed when the datastore it resides on is in a PDL state.
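As a minimal sketch of what that host-level change amounts to, the Python helper below adds or updates a setting in the text of a settings file. I am assuming the file holds one key=value entry per line; the helper name and the “some.other.option” entry are hypothetical, and the example works on an in-memory string rather than touching /etc/vmware/settings on a real host.

```python
def upsert_setting(settings_text: str, key: str, value: str) -> str:
    """Return settings_text with `key=value` set, replacing any existing entry for key."""
    new_line = f"{key}={value}"
    lines = settings_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Compare only the part before "=", ignoring surrounding whitespace.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical existing content; the real file on your host will differ.
before = "some.other.option=1\n"
print(upsert_setting(before, "disk.terminateVMOnPDLDefault", "True"))
```

The helper is idempotent, so running it against a file that already contains the setting leaves a single entry behind instead of a duplicate.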

The second setting is a vSphere HA advanced setting called “das.maskCleanShutdownEnabled”. This setting is also not enabled by default and needs to be set to “True”. It allows HA to trigger a restart response for a virtual machine that has been killed automatically due to a PDL condition, because it lets HA differentiate between a virtual machine that was killed due to the PDL state and one that was powered off by an administrator.

But why is “das.maskCleanShutdownEnabled” needed for HA? From a vSphere HA perspective there are two different types of “operations”. The first is a user-initiated power-off (clean) and the other is a kill. When a virtual machine is powered off by a user, part of the process is setting the property “runtime.cleanPowerOff” to true.

Remember that when “disk.terminateVMOnPDLDefault” is configured, your VMs will be killed when they issue I/O. This is where the problem arises: in a PDL scenario it is impossible to set “runtime.cleanPowerOff”, as the datastore, and as such the vmx, is unreachable. As the property defaults to “true”, vSphere HA will assume the VMs were cleanly powered off. This would result in vSphere HA not taking any action in a PDL scenario. By setting “das.maskCleanShutdownEnabled” to true, a scenario where all VMs are killed but never restarted can be avoided.
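The reasoning above can be sketched as a small decision function in Python. This is a simplified model of the behaviour described in this post, not how vSphere HA is actually implemented. The function name is hypothetical, and using None for “the property could not be written” is my own assumption for the sake of the example.

```python
from typing import Optional


def ha_restarts_vm(clean_power_off: Optional[bool], mask_clean_shutdown_enabled: bool) -> bool:
    """Toy model of the restart decision for a VM that is no longer running.

    clean_power_off: True if a user-initiated power-off was recorded, None if the
    property could not be written (e.g. the datastore was in a PDL state).
    """
    if clean_power_off is None:
        # Without the mask the property falls back to "true" (assumed clean);
        # with the mask, this model treats the unknown state as "not clean".
        clean_power_off = not mask_clean_shutdown_enabled
    return not clean_power_off


print(ha_restarts_vm(None, mask_clean_shutdown_enabled=False))  # False: killed by PDL, never restarted
print(ha_restarts_vm(None, mask_clean_shutdown_enabled=True))   # True: killed by PDL, restarted
print(ha_restarts_vm(True, mask_clean_shutdown_enabled=True))   # False: admin power-off is respected
```

The first call is the problem case: without the mask, a VM killed by a PDL condition looks exactly like a clean power-off and is left alone.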

If you have a stretched cluster environment, make sure to configure these settings accordingly!

“What is das.maskCleanShutdownEnabled about?” originally appeared on Yellow-Bricks.com.

Permanent link to this article: http://www.startswithv.com/2012/04/25/what-is-das-maskcleanshutdownenabled-about/