VMpros.nl

VMware: SCSI reservation conflicts

Last week I had some trouble at a customer site: 4 of the 8 datastores were no longer visible or accessible on the 6 ESX 3.5 U2 hosts connected to an HP MSA1500 (FC). Some datastores had become unavailable while others were not affected. Numerous VMs were down, some of them showing warnings such as Orphaned and Inaccessible.

OK, let's troubleshoot:

 

Checking for active paths: esxcfg-mpath -l | grep -i active

FC 6:2.1 210100e08bb27a58<->500508b30091aac9 vmhba1:0:2 On active preferred
FC 6:2.1 210100e08bb27a58<->500508b30091aac9 vmhba1:0:6 On active preferred
Local 70:0.0 vmhba2:0:0 On active preferred
FC 6:2.1 210100e08bb27a58<->500508b30091aac9 vmhba1:0:1 On active preferred
FC 6:2.1 210100e08bb27a58<->500508b30091aac9 vmhba1:0:5 On active preferred
FC 6:2.1 210100e08bb27a58<->500508b30091aac9 vmhba1:0:3 On active preferred
FC 6:2.1 210100e08bb27a58<->500508b30091aac9 vmhba1:0:7 On active preferred
FC 6:2.1 210100e08bb27a58<->500508b30091aac9 vmhba1:0:0 On active preferred
FC 6:2.1 210100e08bb27a58<->500508b30091aac9 vmhba1:0:4 On active preferred

All online!

Checking for dead paths: esxcfg-mpath -l | grep -i dead

[root@esxmeri01 /]#

No dead paths!
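
The two grep checks above can be folded into one pass. This is only a minimal sketch: summarize_paths is a helper name I made up, and it assumes every path line in the esxcfg-mpath -l output contains the word "active" or "dead" in some capitalization.

```shell
# Summarize path states from `esxcfg-mpath -l` output supplied on stdin.
# Assumption: each path line contains the word "active" or "dead"
# (any capitalization); all other lines are ignored.
summarize_paths() {
    awk '
        BEGIN { active = 0; dead = 0 }
        { line = tolower($0) }
        line ~ /dead/   { dead++; next }
        line ~ /active/ { active++ }
        END { printf "active=%d dead=%d\n", active, dead }
    '
}

# On a host: esxcfg-mpath -l | summarize_paths
```

For the nine path lines listed earlier this would report active=9 dead=0.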

 

Checking the HBAs connected to the SAN: cd /vmfs/devices/disks and list the devices with ls vmh*

vmhba1:0:1:0  vmhba1:0:2:1  vmhba1:0:4:0  vmhba1:0:5:1  vmhba1:0:7:0  vmhba2:0:0:1   vmhba2:0:0:3  vmhba2:0:0:6  vmhba2:0:0:9
vmhba1:0:1:1  vmhba1:0:3:0  vmhba1:0:4:1  vmhba1:0:6:0  vmhba1:0:7:1  vmhba2:0:0:10  vmhba2:0:0:4  vmhba2:0:0:7
vmhba1:0:2:0  vmhba1:0:3:1  vmhba1:0:5:0  vmhba1:0:6:1  vmhba2:0:0:0  vmhba2:0:0:2   vmhba2:0:0:5  vmhba2:0:0:8

All online!

 

Re-present the LUNs from HP ACU to the ESX hosts:

I unpresented and re-presented the LUNs to the hosts in the HP ACU and did a rescan, but still no success.

Reservation conflicts??

After some more troubleshooting while trying to get my datastores online, I found some information that pointed me in the right direction:

cd /var/log
# cat dmesg


resize_dma_pool: unknown device type 12
scsi2 (0,0,1) : RESERVATION CONFLICT
scsi2 (0,0,1) : RESERVATION CONFLICT
scsi2 (0,0,1) : RESERVATION CONFLICT
scsi2 (0,0,1) : RESERVATION CONFLICT
scsi2 (0,0,1) : RESERVATION CONFLICT
scsi2 (0,0,1) : RESERVATION CONFLICT
VMWARE: Device that would have been attached as scsi disk sda at scsi2, channel 0, id 0, lun 1
Has not been attached because this path could not complete a READ command eventhough a TUR worked.
result = 0x18 key = 0x0, asc = 0x0, ascq = 0x0
VMWARE: Device that would have been attached as scsi disk sda at scsi2, channel 0, id 0, lun 1
Has not been attached because it is a duplicate path or on a passive path
resize_dma_pool: unknown device type 12
VMWARE SCSI Id: Supported VPD pages for sda : 0x0 0x80 0x83 0xc0 0xb0 0xc1
VMWARE SCSI Id: Device id info for sda: 0x1 0x3 0x0 0x10 0x60 0x5 0x8 0xb3 0x0 0x93 0x89 0xc0 0x92 0x5 0xd2 0xd2 0x6d 0x13 0x0 0x15 0x2 0x3 0x0 0x20 0x36 0x30 0x30 0x35 0x30 0x38 0x42 0x33 0x30 0x30 0x39 0x33 0x38 0x39 0x43 0x30 0x39 0x32 0x30 0x35 0x44 0x32 0x44 0x32 0x36 0x44 0x31 0x33 0x30 0x30 0x31 0x35 0x1 0x6 0x0 0x4 0x0 0x0 0x0 0x2 0x1 0x14 0x0 0x4 0x0 0x0 0x0 0x1 0x1 0x15 0x0 0x4 0x0 0x0 0x0 0x1
VMWARE SCSI Id: Id for sda 0x60 0x05 0x08 0xb3 0x00 0x93 0x89 0xc0 0x92 0x05 0xd2 0xd2 0x6d 0x13 0x00 0x15 0x4d 0x53 0x41 0x20 0x56 0x4f
VMWARE: Unique Device attached as scsi disk sda at scsi2, channel 0, id 0, lun 2
Attached scsi disk sda at scsi2, channel 0, id 0, lun 2
resize_dma_pool: unknown device type 12
scsi2 (0,0,3) : RESERVATION CONFLICT
scsi2 (0,0,3) : RESERVATION CONFLICT
scsi2 (0,0,3) : RESERVATION CONFLICT
scsi2 (0,0,3) : RESERVATION CONFLICT
scsi2 (0,0,3) : RESERVATION CONFLICT
scsi2 (0,0,3) : RESERVATION CONFLICT
VMWARE: Device that would have been attached as scsi disk sdb at scsi2, channel 0, id 0, lun 3
Has not been attached because this path could not complete a READ command eventhough a TUR worked.
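
With a long dmesg it helps to see at a glance which SCSI addresses are affected. A rough sketch, assuming the conflict lines always look like the ones in the excerpt above; count_conflicts is just an illustrative name.

```shell
# Count RESERVATION CONFLICT messages per SCSI address in dmesg-style text
# read from stdin. The line format is assumed to match the excerpt above:
#   scsi2 (0,0,1) : RESERVATION CONFLICT
count_conflicts() {
    awk -F' : ' '
        /RESERVATION CONFLICT/ { n[$1]++ }
        END { for (addr in n) print n[addr], addr }
    ' | sort -k2
}

# On a host: count_conflicts < /var/log/dmesg
```

Run against the excerpt above, this would list six conflicts each for scsi2 (0,0,1) and scsi2 (0,0,3).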

Checking the VMkernel log file: tail -f /var/log/vmkernel

Dec 21 10:44:25 esxmeri01 vmkernel: 0:19:06:51.306 cpu3:1037)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts
Dec 21 10:44:25 esxmeri01 vmkernel: 0:19:06:51.306 cpu3:1037)WARNING: SCSI: 255: status SCSI reservation conflict for vml.0200030000600508b3009389c079c769f1112400164d534120564f. residual R 919, CR 0, ER 3
Dec 21 10:44:26 esxmeri01 vmkernel: 0:19:06:52.243 cpu3:1037)SCSI: vm 1037: 109: Sync CR at 64
Dec 21 10:44:27 esxmeri01 vmkernel: 0:19:06:53.207 cpu3:1037)SCSI: vm 1037: 109: Sync CR at 48
Dec 21 10:44:28 esxmeri01 vmkernel: 0:19:06:54.124 cpu3:1037)SCSI: vm 1037: 109: Sync CR at 32
Dec 21 10:44:29 esxmeri01 vmkernel: 0:19:06:55.118 cpu3:1037)SCSI: vm 1037: 109: Sync CR at 16
Dec 21 10:44:30 esxmeri01 vmkernel: 0:19:06:56.098 cpu3:1037)SCSI: vm 1037: 109: Sync CR at 0
Dec 21 10:44:30 esxmeri01 vmkernel: 0:19:06:56.098 cpu3:1037)WARNING: SCSI: 119: Failing I/O due to too many reservation conflicts

In this case, LUN 12 is inaccessible to the ESX host cluster. Since it is listed in the output of step one, it was accessible at some point: a host reserved the LUN and never released it, possibly because of a SAN switch reboot in the middle of the reservation operation.
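
To pull just the affected device IDs out of the VMkernel log, something like the following should do. It is a sketch under the assumption that the warning always reads "reservation conflict for vml.<hex id>" as in the log lines above; conflict_devices is a name of my own choosing.

```shell
# List the distinct vml device IDs named in "reservation conflict" warnings
# in VMkernel log text read from stdin.
# Assumption: the warning has the form "... reservation conflict for vml.<hex>."
conflict_devices() {
    sed -n 's/.*reservation conflict for \(vml\.[0-9a-f]*\).*/\1/p' | sort -u
}

# On a host: conflict_devices < /var/log/vmkernel
```

Lines that only say "too many reservation conflicts" carry no device ID and are skipped.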

 

Solution:

Check pending reservations: esxcfg-info | egrep -B5 "s Reserved|Pending"

                           |—-Console Device…………………./dev/sda
                           |—-Devfs Path……………………../vmfs/devices/disks/vml.0200020000600508b3009389c09205d2d26d1300154d534120564f
                           |—-SCSI Level……………………..5
                           |—-Queue Depth…………………….32
                           |—-Is Pseudo………………………false
                           |—-Is Reserved…………………….false
                           |—-Pending Reservations…………….0

                           |—-Console Device…………………./dev/sdc
                           |—-Devfs Path……………………../vmfs/devices/disks/vml.0200060000600508b3009389c0df3c8fe88b3e001c4d534120564f
                           |—-SCSI Level……………………..5
                           |—-Queue Depth…………………….32
                           |—-Is Pseudo………………………false
                           |—-Is Reserved…………………….false
                           |—-Pending Reservations…………….1

Okay, as you can see, disk "vml.0200060000600508b3009389c0df3c8fe88b3e001c4d534120564f" has Pending Reservations set to 1. You can solve this by doing a LUN (connection) reset. You can do this on the fly, without dropping connections or losing data.
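
On a host with many LUNs, reading through the esxcfg-info output by eye gets tedious. Here is a small sketch that prints only the Devfs paths whose pending reservation count is above zero. It assumes the "Devfs Path" line comes before the "Pending Reservations" line of the same device, as in the output above; pending_reservations is a made-up helper name.

```shell
# Print the Devfs path of every device whose "Pending Reservations" value
# is greater than zero, reading esxcfg-info output from stdin.
# Assumption: each device lists "Devfs Path" before "Pending Reservations".
pending_reservations() {
    awk '
        /Devfs Path/ && match($0, /\/vmfs\/[^ \t]*/) {
            path = substr($0, RSTART, RLENGTH)      # remember the current device
        }
        /Pending Reservations/ && match($0, /[0-9]+[ \t]*$/) {
            if (substr($0, RSTART, RLENGTH) + 0 > 0) print path
        }
    '
}

# On a host: esxcfg-info | pending_reservations
```

The filler characters between label and value are ignored, so the exact padding of the output does not matter.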

 

Now it's time to do the LUN reset:

vmkfstools --lock lunreset /vmfs/devices/disks/vml.0200060000600508b3009389c0df3c8fe88b3e001c4d534120564f
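
If more than one device needs a reset, it is safer to generate the commands first and review them than to loop blindly. A dry-run sketch: print_lunreset_cmds is a hypothetical helper that only prints the commands and never runs them, and it assumes the long option is written with two plain hyphens (--lock).

```shell
# Print (do not run) one lunreset command per device path read from stdin.
# Dry run on purpose: review the list before executing anything by hand.
print_lunreset_cmds() {
    while IFS= read -r path; do
        [ -n "$path" ] || continue                  # skip empty lines
        printf 'vmkfstools --lock lunreset %s\n' "$path"
    done
}

# Example: echo /vmfs/devices/disks/vml.<id> | print_lunreset_cmds
```

Keeping it a dry run means nothing touches the SAN until each printed command has been checked against the device list.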

 

I rescanned the HBAs on each server and voilà, the datastores were back online again.

 

More information: VMware KB

  1. January 11th, 2011 at 00:10 | #1

    I just added your blog site to my blogroll, I pray you would give some thought to doing the same.

  2. January 19th, 2011 at 05:21 | #2

    Hi,

    I have a small question.
    1.) What is the difference between # esxcfg-info -s | egrep -B16 "s Reserved|Pending" and esxcfg-info -s | grep -i -B 30 Pending(insert a backslash and then a space here)Reservations………………………….1

    2.) What does '-B 30' or '-B16' mean?

    Note: we have VMware ESX 4.0.
    Thanks in advance.

    Valerian Crasto.

  3. Peter Andre
    January 19th, 2011 at 05:30 | #3

    Jiiiihaaa this solved my problem, 100x many thanks
