Joe4Tech
2009-12-14 17:35:01 UTC
We have identified the trigger for an issue that others may want to know
about. With its default settings, the Windows Server 2008 Operating System
(Monitoring) management pack, version 6.0.667.0 (Performance, Windows Server
2008 Logical Disk), checks the fragmentation level of every logical disk on a
schedule (every Saturday at 3 a.m. by default). On our Windows Server 2008 R2
Datacenter Hyper-V hosts with an iSCSI Cluster Shared Volume (CSV), this
check has caused a BSOD; the minidump shows BUGCHECK_STR: 0x9E.

The sequence appears to be as follows. The management pack's monitoring
triggers Event ID 7036 (the Disk Defragmenter service entered the running
state) on both Hyper-V hosts that share the CSV, at nearly the same time on
Saturday morning. One of the hosts apparently controls the CSV, and the other
node reports Event ID 5120 (see below). Some time later the node that
controls the CSV reports Event ID 1230 and, after about 20-30 minutes,
crashes and reboots.
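
For anyone who wants to check their own hosts for the same pattern, here is a
rough Python sketch of the sequence described above. It assumes you have
already exported the relevant System log entries into (time, host, event ID)
tuples and filtered the 7036 entries down to the Disk Defragmenter service
(7036 is logged for any service state change). The function name, the
one-hour window, and the sample timestamps are all made up for illustration;
only the event IDs, host names, and ordering come from this post.

```python
from datetime import datetime, timedelta


def matches_reported_sequence(events, window=timedelta(hours=1)):
    """Return True if the events show the pattern described in the post.

    Pattern: 7036 (defragmenter started) on at least two hosts, then 5120 on
    one host, then 1230 on a different host, all within `window` of the time
    the second host started the defragmenter.
    """
    first_start = {}
    for when, host, event_id in sorted(events):
        if event_id == 7036:
            first_start.setdefault(host, when)
    if len(first_start) < 2:
        return False

    # Time at which the defragmenter was running on two hosts.
    both_running = sorted(first_start.values())[1]
    deadline = both_running + window

    for t_lost, lost_host, event_id in events:
        if event_id != 5120 or not (both_running <= t_lost <= deadline):
            continue
        for t_rhs, rhs_host, other_id in events:
            if (other_id == 1230 and rhs_host != lost_host
                    and t_lost <= t_rhs <= deadline):
                return True
    return False


# Hypothetical sample data: the timestamps are invented, not taken from our logs.
sample_events = [
    (datetime(2009, 12, 12, 3, 0, 5), "Hyper-VHost01", 7036),
    (datetime(2009, 12, 12, 3, 0, 9), "Hyper-VHost02", 7036),
    (datetime(2009, 12, 12, 3, 2, 0), "Hyper-VHost01", 5120),
    (datetime(2009, 12, 12, 3, 10, 0), "Hyper-VHost02", 1230),
]

print(matches_reported_sequence(sample_events))
```

If this returns True for a Saturday-morning window on your cluster, it may be
worth checking whether the fragmentation monitor is enabled for those hosts.
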

Log Name: System
Source: Microsoft-Windows-FailoverClustering
Event ID: 5120
Task Category: Cluster Shared Volume
Level: Error
Keywords:
User: SYSTEM
Computer: Hyper-VHost01
Description:
Cluster Shared Volume 'Volume1' ('Cluster Disk 1') is no longer available on
this node because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will
temporarily be queued until a path to the volume is reestablished.

Log Name: System
Source: Microsoft-Windows-FailoverClustering
Event ID: 1230
Task Category: Resource Control Manager
Level: Error
Keywords:
User: SYSTEM
Computer: Hyper-VHost02
Description:
Cluster resource 'Cluster Disk 1' (resource type '', DLL 'clusres.dll')
either crashed or deadlocked. The Resource Hosting Subsystem (RHS) process
will now attempt to terminate, and the resource will be marked to run in a
separate monitor.
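
If you copy event text like the two records above out of Event Viewer, a
small parser can turn it into records that are easier to compare across
nodes. This is only a sketch: it assumes each record starts with a
"Log Name:" line and uses the field labels shown above, and the names
FIELDS and parse_event_records are my own, not part of any Microsoft tool.

```python
# Field labels as they appear in the event text above.
FIELDS = {"Log Name", "Source", "Event ID", "Task Category", "Level",
          "Keywords", "User", "Computer", "Description"}


def parse_event_records(text):
    """Split copied event text into one dict per record, keyed by field label."""
    records = []
    current = None
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Log Name:"):
            # A "Log Name:" line is assumed to open a new record.
            current = {}
            records.append(current)
        if current is None:
            continue
        key, sep, value = line.partition(":")
        if sep and key in FIELDS:
            current[key] = value.strip()
            last_key = key
        elif last_key == "Description":
            # Wrapped description lines are joined back into one string.
            current["Description"] = (current["Description"] + " " + line).strip()
    return records


SAMPLE = """\
Log Name: System
Source: Microsoft-Windows-FailoverClustering
Event ID: 5120
Task Category: Cluster Shared Volume
Level: Error
Keywords:
User: SYSTEM
Computer: Hyper-VHost01
Description:
Cluster Shared Volume 'Volume1' ('Cluster Disk 1') is no longer available on
this node because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will
temporarily be queued until a path to the volume is reestablished.
Log Name: System
Source: Microsoft-Windows-FailoverClustering
Event ID: 1230
Task Category: Resource Control Manager
Level: Error
Keywords:
User: SYSTEM
Computer: Hyper-VHost02
Description:
Cluster resource 'Cluster Disk 1' (resource type '', DLL 'clusres.dll')
either crashed or deadlocked. The Resource Hosting Subsystem (RHS) process
will now attempt to terminate, and the resource will be marked to run in a
separate monitor.
"""

records = parse_event_records(SAMPLE)
for record in records:
    print(record["Computer"], record["Event ID"])
```

The parser keeps every value as a string, so compare "Event ID" against
"5120" and "1230" rather than integers.
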
As a workaround, we have created a group containing the Hyper-V hosts and an
override that disables the Logical Disk Fragmentation Level monitor for that
group.
Reference: Why is my 2008 Failover Clustering node blue screening with a
Stop 0x0000009E
http://blogs.technet.com/askcore/archive/2009/06/12/why-is-my-2008-failover-clustering-node-blue-screening-with-a-stop-0x0000009e.aspx
----------------
This post is a suggestion for Microsoft, and Microsoft responds to the
suggestions with the most votes. To vote for this suggestion, click the "I
Agree" button in the message pane. If you do not see the button, follow this
link to open the suggestion in the Microsoft Web-based Newsreader and then
click "I Agree" in the message pane.
http://www.microsoft.com/communities/newsgroups/list/en-us/default.aspx?mid=51d2688f-e7d9-4c40-99a5-b25e5e6704e3&dg=microsoft.public.opsmgr.managementpacks