Resolve iSCSI alarms due to missing target

Problem description

A compute node may keep trying to access an iSCSI target that has disappeared for some reason. The cinder driver is still working, so the problem is at a lower level, i.e. it doesn't impact OpenStack operations. In this case the iSCSI device logs many alarms, flooding the admins with mail messages containing something like:

http://192.168.40.100/

-----------------------------------------
ERROR event from storage array iSCSIUnipdA
subsystem: MgmtExec
    event: 7.4.3
     time: Mon Nov 28 17:56:19 2016

iSCSI login to target '192.168.40.100:3260, iqn.2001-05.com.equallogic:0-fe83b6-aad7200c3-c4d00647e67568e2-volume-5bfac1ad-f884-4f18-ac40-f0423a5af4c1' from initiator '192.168.40.164:52285, iqn.1994-05.com.redhat:881385dab36e' failed for the following reason:
	Requested target not found.
-----------------------------------------
ERROR event from storage array iSCSIUnipdA
subsystem: MgmtExec
    event: 7.4.3
     time: Mon Nov 28 17:56:19 2016

iSCSI login to target '192.168.40.100:3260, iqn.2001-05.com.equallogic:0-fe83b6-b097200c3-25200647e73568e2-volume-191a6a71-ea05-44af-86ce-51e895612d72' from initiator '192.168.40.164:52286, iqn.1994-05.com.redhat:881385dab36e' failed for the following reason:
	Requested target not found.
...
...
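
The initiator address in each alarm identifies the node to log in to, and the target name identifies the missing volume. A minimal sketch for extracting both from saved alarm text, assuming every alarm line has the same layout as the examples above (the function name failed_logins and the file name alarms.txt are hypothetical):

```shell
#!/bin/sh
# failed_logins (hypothetical helper): reads alarm mail text on stdin and
# prints one "initiator-IP target-IQN" pair per line, without duplicates.
# Assumes the line layout shown in the sample alarms:
#   iSCSI login to target '<portal>, <iqn>' from initiator '<ip>:<port>, <iqn>' failed ...
failed_logins() {
    # \1 = target IQN, \2 = initiator IP; the initiator source port is dropped
    # because it changes on every login attempt.
    sed -n "s/.*login to target '[^,]*, \([^']*\)' from initiator '\([0-9.]*\):[0-9]*,.*/\2 \1/p" |
        sort -u
}

# Example use (alarms.txt is a hypothetical file holding the mail bodies):
#   failed_logins < alarms.txt
```

Lines that do not match the pattern (separators, timestamps, the "Requested target not found." reason) are ignored, so whole mail bodies can be fed in unchanged.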

Recovery procedure

  • log in to the affected node; in the above example it is 192.168.40.164 → cld-blu-15
  • list the iSCSI volumes actually in use by the node with
  # iscsiadm -m session | cut -d' ' -f4 | sort

  iqn.2001-05.com.equallogic:0-fe83b6-0a2c4c0c5-fa7bc3e56da5a983-volume-43b1aba0-1020-4dc6-a205-ab4d33fed99f
  iqn.2001-05.com.equallogic:0-fe83b6-a857200c3-8a900647e61568e2-volume-e737884c-b6f7-486d-9c2e-ece3056d271f
  iqn.2001-05.com.equallogic:0-fe83b6-b2dc4c0c5-918bc32f6d158b6c-volume-3e82eca8-8535-436b-be57-7f37d68d142d
  iqn.2001-05.com.equallogic:0-fe83b6-ea5c4c0c5-d64bc3353dc58ca5-volume-f0f32dcd-acf7-4932-be0d-d1f06cd487c2
  • compare this list with the one generated by
  # iscsiadm -m discovery -t st -p 192.168.40.100 | cut -d' ' -f2 | sort
  
  iqn.2001-05.com.equallogic:0-fe83b6-0a2c4c0c5-fa7bc3e56da5a983-volume-43b1aba0-1020-4dc6-a205-ab4d33fed99f
  iqn.2001-05.com.equallogic:0-fe83b6-b2dc4c0c5-918bc32f6d158b6c-volume-3e82eca8-8535-436b-be57-7f37d68d142d
  iqn.2001-05.com.equallogic:0-fe83b6-ea5c4c0c5-d64bc3353dc58ca5-volume-f0f32dcd-acf7-4932-be0d-d1f06cd487c2
  • if there are entries on the first list that do not show up on the second (like …486d-9c2e-ece3056d271f in this case), the node thinks it's connected to a nonexistent target. In this case log out from the target with
    # iscsiadm -m node --target iqn.2001-05.com.equallogic:0-THE-WRONG-TARGET -p 192.168.40.100 --logout
  • if you see no differences between the two lists, this means that iscsiadm got confused. Connect to the iSCSI device and list the volumes from there:
# ssh grpadmin@192.168.40.100

Last login: Wed Dec  7 10:34:48 2016 from 192.168.40.121 on tty??
 

           Welcome to Group Manager

        Copyright 2001-2016 Dell Inc.



CloudUnipdVeneto> show volume
Name            Size       Snapshots Status  Permission Connections T
--------------- ---------- --------- ------- ---------- ----------- -
volume-1fb65656 500GB      2         online  read-write 1           Y 
  -f358-4ab8-8b                                                       
  aa-fb0c48a0e1                                                       
  8d                                                                  
volume-86a39713 16GB       2         online  read-write 1           Y 
  -50ba-4885-8f                                                       
  dc-9d7cdf3d16                                                       
  a8                                                                  
volume-c930e59f 30GB       2         online  read-write 1           Y 
  -35b4-4625-a1                                                       
  4a-af2ef5792f                                                       
  5a                                                                  
volume-c85996dd 30GB       2         online  read-write 2           Y 
  -b3b7-43d8-ae                                                       
  35-f15185734b                                                       
  6e                                                                  
volume-e737884c 8GB        2         online  read-write 1           Y 
  -b6f7-486d-9c                                                       
  2e-ece3056d27                                                       
  1f                                                                  
volume-f276ae66 4GB        2         online  read-write 0           Y 
  -404f-4b0a-99                                                       
  83-75eb87e7e2                                                       
  dd                                                                  
volume-dab04b6a 5GB        2         online  read-write 1           Y 
  -f7fe-485a-b7                                                       
  8e-f0927477b1                                                       
  5f                                                                  
volume-3b8cbb62 20GB       2         online  read-write 1           Y 
  -a6d9-4a6c-a1                                                       
  db-5613438455                                                       
  59 
...
...
  • after all the missing volumes are sorted out, restart (just in case…) the openstack-cinder-volume services on cld-blu-01 and cld-blu-02
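
The comparison steps above can be sketched as two small shell helpers. This is only a sketch under assumptions: the helper names stale_targets and group_volumes are hypothetical, the iscsiadm output is assumed to have the target in the same fields used by the cut commands above, and the Group Manager listing is assumed to wrap long names onto indented continuation lines exactly as in the sample.

```shell
#!/bin/sh
# stale_targets SESSIONS_FILE DISCOVERY_FILE   (hypothetical helper)
# Prints the targets found in saved `iscsiadm -m session` output that are
# missing from saved `iscsiadm -m discovery -t st -p <portal>` output.
stale_targets() {
    sessions=$(mktemp) && discovered=$(mktemp) || return 1
    cut -d' ' -f4 "$1" | sort -u > "$sessions"     # same field as the session step
    cut -d' ' -f2 "$2" | sort -u > "$discovered"   # same field as the discovery step
    comm -23 "$sessions" "$discovered"             # lines only in the first list
    rm -f "$sessions" "$discovered"
}

# group_volumes   (hypothetical helper)
# Reads `show volume` output on stdin and prints each volume name on one
# line, joining the indented fragments that follow a "volume-..." row.
group_volumes() {
    awk '
        /^volume-/ { if (name != "") print name; name = $1; next }
        /^[ \t]+[^ \t]/ && name != "" { name = name $1; next }
        { if (name != "") print name; name = "" }   # any other line ends a name
        END { if (name != "") print name }
    '
}

# Example use on the compute node (file names are placeholders):
#   iscsiadm -m session > sessions.txt
#   iscsiadm -m discovery -t st -p 192.168.40.100 > discovery.txt
#   stale_targets sessions.txt discovery.txt
# Example use on a saved Group Manager listing:
#   group_volumes < show_volume.txt
```

Each line printed by stale_targets is a candidate for the --logout command shown earlier. The names printed by group_volumes can be matched against the volume-… suffix of the target IQNs.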
progetti/cloud-areapd/ced-c/operations/resolve_iscsi_alarms_due_to_missing_target.txt · Last modified: 2018/06/05 08:04 by mazzon@infn.it
