slow requests debugging

ceph health detail

reports the problematic OSDs. To debug a problematic OSD, run the following on the host where that OSD runs:

<code bash> ceph daemon osd.<id> ops </code>

This shows the OSD's queue of operations.


ceph daemon osd.3 dump_blocked_ops

shows the blocked operations, e.g.:

[root@ceph-osd-01 ~]# ceph daemon osd.3 dump_blocked_ops
{
    "ops": [],
    "complaint_time": 30.000000,
    "num_blocked_ops": 0
}
[root@ceph-osd-01 ~]#
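
Checking several OSDs by hand gets tedious, so the check above could be scripted. The following is only a minimal sketch: the helper names blocked_count and report_blocked are hypothetical, and the parsing assumes the JSON contains a "num_blocked_ops" field as in the output shown above.

```shell
#!/bin/sh
# Read the JSON printed by `ceph daemon osd.<id> dump_blocked_ops` on stdin
# and print the value of its "num_blocked_ops" field.
blocked_count() {
    sed -n 's/.*"num_blocked_ops":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: for each OSD id given as an argument, print a line
# if that OSD currently has blocked operations.
# It must run on the host where those OSDs live, because `ceph daemon`
# talks to the local admin socket.
report_blocked() {
    for id in "$@"; do
        n=$(ceph daemon "osd.$id" dump_blocked_ops | blocked_count)
        if [ "${n:-0}" -gt 0 ]; then
            echo "osd.$id: $n blocked ops"
        fi
    done
    return 0
}
```

For example, `report_blocked 3 7 11` would print nothing when none of osd.3, osd.7 and osd.11 has blocked operations.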

ceph daemon osd.<id> dump_historic_ops

reports the slowest operations seen recently (the last 10 minutes with the default osd_op_history_duration of 600 seconds). It also shows the client, the duration of the operation, and the object involved, e.g.:

<code>
"description": "osd_op(client.173092581.0:166224247 8.c0 8:034d4294:::rbd_data.35f31a7ea1cd63.00000000000368c4:head [sparse-read 3305472~516096] snapc 0=[] ondisk+read+known_if_redirected e1198650)",
"initiated_at": "2019-02-05 14:24:55.541494",
"age": 13.102105,
"duration": 0.609200,
"type_data": {
    "flag_point": "started",
    "client_info": {
        "client": "client.173092581",
        "client_addr": "192.168.61.120:0/1047816369",
        "tid": 166224247
</code>

In this case the client is 192.168.61.120 (cld-np-10) and the image is the one whose block name prefix is rbd_data.35f31a7ea1cd63. To find out which image that is:

[root@ceph-mon-01 ~]#  for rbd in $(rbd ls -p volumes-prod); do rbd info volumes-prod/$rbd; done | grep -B3 -A2 rbd_data.35f31a7ea1cd63
rbd image 'volume-24664fe3-a84c-41cd-b403-94adedc4adf2':
	size 1000 GB in 256000 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.35f31a7ea1cd63
	format: 2
	features: layering

So it is the Cinder volume 24664fe3-a84c-41cd-b403-94adedc4adf2, which is indeed attached to a VM hosted on cld-np-10:

[root@cld-ctrl-01 ~]# cinder list --all | grep 24664fe3-a84c-41cd-b403-94adedc4adf2 
| 24664fe3-a84c-41cd-b403-94adedc4adf2 | 1c587619a84f417eabc011321fd559ec | in-use         | data004                   | 1000  | ceph             | false    | dfa2175b-8cdb-464d-a216-068b0cd8fc26 |
[root@cld-ctrl-01 ~]# nova show dfa2175b-8cdb-464d-a216-068b0cd8fc26 | grep cld
| OS-EXT-SRV-ATTR:host                 | cld-np-10.cloud.pd.infn.it                                                                                                                                       |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | cld-np-10.cloud.pd.infn.it                                                                                                                                       |
| flavor                               | cldareapd.32cores64GB25GB (f30e59ba-b207-4d58-bc35-709208f7c6b9)                                                                                                 |
[root@cld-ctrl-01 ~]# 
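
The lookup above relies on three small text extractions: the rbd_data prefix from the op description, the IP address from client_addr, and the Cinder volume ID from the RBD image name. A minimal sketch of these steps with plain shell tools follows. The function names are hypothetical, and it assumes IPv4 client addresses and images named volume-<uuid>, as in the example above.

```shell
#!/bin/sh
# Print the RBD block name prefix (rbd_data.<id>) contained in an op description.
rbd_prefix_of() {
    printf '%s\n' "$1" | grep -o 'rbd_data\.[0-9a-f]*' | head -n 1
}

# Print the IP part of a client_addr value such as 192.168.61.120:0/1047816369.
# Assumes an IPv4 address: everything from the first ':' onward is dropped.
client_ip_of() {
    printf '%s\n' "${1%%:*}"
}

# Print the Cinder volume ID for an RBD image named volume-<uuid>.
volume_id_of() {
    printf '%s\n' "${1#volume-}"
}
```

The results can then be passed to the commands already used above, for example `cinder list --all | grep "$(volume_id_of volume-24664fe3-a84c-41cd-b403-94adedc4adf2)"`.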

If you run ceph daemon osd.<id> dump_historic_ops or ceph daemon osd.<id> dump_ops_in_flight, you will see a set of operations and a list of events each operation went through. These are briefly described below.

progetti/cloud-areapd/ceph/slow_requests_debugging.txt · Last modified: 2019/02/05 13:34 by sgaravat@infn.it