Fixing Errors When Deleting a Volume (Part 2)

Deleting a Volume ran into a new error. After switching the log level to debug, I saw the following:

 Clear capabilities
 volume volume-4e1817be-9b8c-4834-ad90-baf24ef61775: removing export delete_volume /usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/manager.py:192
 Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --show execute /usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/utils.py:167
 Removing volume: 4e1817be-9b8c-4834-ad90-baf24ef61775
 Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --force --delete iqn.2010-10.org.openstack:volume-4e1817be-9b8c-4834-ad90-baf24ef61775 execute /usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/utils.py:167
 Result was 22 execute /usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/utils.py:184
 [-] Exception during message handling
 Traceback (most recent call last):
   File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/openstack/common/rpc/amqp.py", line 276, in _process_data
     rval = self.proxy.dispatch(ctxt, version, method, **args)
   File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/openstack/common/rpc/dispatcher.py", line 145, in dispatch
     return getattr(proxyobj, method)(ctxt, **kwargs)
   File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/manager.py", line 206, in delete_volume
     {'status': 'error_deleting'})
   File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
     self.gen.next()
   File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/manager.py", line 193, in delete_volume
     self.driver.remove_export(context, volume_ref)
   File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/driver.py", line 474, in remove_export
     self.tgtadm.remove_iscsi_target(iscsi_target, 0, volume['id'])
   File "/usr/lib/python2.6/site-packages/cinder-2012.2.5-py2.6.egg/cinder/volume/iscsi.py", line 180, in remove_iscsi_target
     "id:%(volume_id)s.") % locals())
 KeyError: u'volume_id' 
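
The traceback ends in a KeyError raised while a message is being formatted with `% locals()`, right after tgt-admin returned 22. Below is a minimal Python sketch of how that kind of secondary error can hide the original failure. The function and parameter names (`remove_target_sketch`, `remove_target_fixed`, `vol_id`) are hypothetical stand-ins, not the actual Cinder code. The sketch assumes the placeholder name simply did not match any local variable.

```python
def remove_target_sketch(tid, lun, vol_id):
    """Hypothetical stand-in for a target-removal helper (not Cinder's code)."""
    try:
        # Simulate the external command failing, as "Result was 22" suggests.
        raise RuntimeError("tgt-admin exited with status 22")
    except RuntimeError:
        # The placeholder is volume_id, but the only locals are tid, lun and
        # vol_id, so the formatting raises KeyError and masks the RuntimeError.
        return "Failed to remove target for volume id:%(volume_id)s." % locals()


def remove_target_fixed(tid, lun, vol_id):
    """Same flow, with the placeholder matching an existing local name."""
    try:
        raise RuntimeError("tgt-admin exited with status 22")
    except RuntimeError as exc:
        return "Failed to remove target for volume id:%(vol_id)s: %(exc)s" % locals()
```

Calling `remove_target_sketch(28, 0, "4e1817be")` raises `KeyError: 'volume_id'`, while `remove_target_fixed` returns a message that still names the underlying command failure.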

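Although the last error reported is a KeyError, the real failure is still the call to tgt-admin --force --delete <value>. Running tgt-admin -s on the storage node shows that the targets that cannot be deleted have stale connections, similar to the output below. The clients, however, have no such connections, so they cannot be released from the client side as described earlier.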

Target 28: iqn.2010-10.org.openstack:volume-4b7ee394-0028-4d87-baeb-c0ef4ec134e5
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 121
            Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
            Connection: 0
                IP Address: 10.61.2.9
        I_T nexus: 138
            Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
            Connection: 0
                IP Address: 10.61.2.9
        I_T nexus: 140
            Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
            Connection: 0
                IP Address: 10.61.2.9
        I_T nexus: 143
            Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
            Connection: 0
                IP Address: 10.61.2.9
        I_T nexus: 147
            Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
            Connection: 0
                IP Address: 10.61.2.9
        I_T nexus: 150
            Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
            Connection: 0
                IP Address: 10.61.2.9
    LUN information:
    ......
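
When many targets are involved, reading this listing by eye gets tedious. The following is a rough Python sketch that groups the `I_T nexus` IDs under each target. It assumes the text layout shown above. The function name `nexus_by_target` and the shortened `SAMPLE_SHOW` string are illustrative only.

```python
import re

# Shortened copy of the listing above, used only as sample input.
SAMPLE_SHOW = """\
Target 28: iqn.2010-10.org.openstack:volume-4b7ee394-0028-4d87-baeb-c0ef4ec134e5
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 121
            Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
            Connection: 0
                IP Address: 10.61.2.9
        I_T nexus: 138
            Initiator: iqn.1994-05.com.redhat:a3ead9b89ee0
            Connection: 0
                IP Address: 10.61.2.9
    LUN information:
"""

TARGET_RE = re.compile(r"^Target\s+(\d+):\s+(\S+)")
NEXUS_RE = re.compile(r"^\s*I_T nexus:\s+(\d+)")


def nexus_by_target(show_output):
    """Map each target IQN to the list of I_T nexus IDs listed under it."""
    result = {}
    current = None
    for line in show_output.splitlines():
        match = TARGET_RE.match(line)
        if match:
            current = match.group(2)
            result[current] = []
            continue
        match = NEXUS_RE.match(line)
        if match and current is not None:
            result[current].append(int(match.group(1)))
    return result


if __name__ == "__main__":
    for iqn, nexus_ids in nexus_by_target(SAMPLE_SHOW).items():
        print(iqn, nexus_ids)
```

A target with a non-empty list still has initiator sessions registered in tgtd, which is the condition that blocked the delete here.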

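My eventual fix was to restart the tgtd service. A normal service tgtd restart does not work because the connections are never released, so the only option is to find the tgtd PIDs, kill the processes, delete the lock file, and then start tgtd again.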

[root@store2 ~]# ps aux | grep tgtd
root      2652  0.1  0.2 888048 40820 ?        Ssl  Apr27 104:54 tgtd
root      2653  0.0  0.0  12760   484 ?        S    Apr27   0:40 tgtd
root      9643  0.0  0.0 103244   872 pts/1    S+   15:41   0:00 grep tgtd
[root@store2 ~]# kill -9 2652
[root@store2 ~]# kill -9 2653
-bash: kill: (2653) - No such process
[root@store2 ~]# service tgtd status
tgtd dead but subsys locked
[root@store2 ~]# rm -f /var/lock/subsys/tgtd
[root@store2 ~]# service tgtd start
Starting SCSI target daemon:                               [  OK  ]
[root@store2 ~]# service tgtd status
tgtd (pid 9675 9674) is running...
[root@store2 ~]# service cinder-volume restart
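
The PID lookup at the start of this transcript could be scripted. Here is a small Python sketch that picks the tgtd PIDs out of `ps aux` output and skips the `grep tgtd` line. It assumes the standard 11-column `ps aux` layout. The function name `tgtd_pids` and the `SAMPLE_PS` string are illustrative only.

```python
import os

# The ps aux lines from the transcript above, used only as sample input.
SAMPLE_PS = """\
root      2652  0.1  0.2 888048 40820 ?        Ssl  Apr27 104:54 tgtd
root      2653  0.0  0.0  12760   484 ?        S    Apr27   0:40 tgtd
root      9643  0.0  0.0 103244   872 pts/1    S+   15:41   0:00 grep tgtd
"""


def tgtd_pids(ps_output, name="tgtd"):
    """Return PIDs whose command (first word of COMMAND) is `name`."""
    pids = []
    for line in ps_output.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skips the header line and malformed lines
        argv0 = fields[10].split()[0]
        if os.path.basename(argv0) == name:
            pids.append(int(fields[1]))
    return pids


if __name__ == "__main__":
    print(tgtd_pids(SAMPLE_PS))
```

In the transcript, killing the first PID also took down the second, so a script built on this should tolerate a "No such process" error for the later PIDs.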

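After the restart, the connection state of every target was back to normal. Once the database status was reset as described earlier, the volume could be deleted normally.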

North China University of Technology | Cloud Computing Research Center | Jiang Yong
