
Slow requests debugging

ceph health detail

reports the problematic OSDs.

To debug the problematic OSDs:

ceph daemon osd.<id> ops

shows the queue of operations currently in flight on that OSD.


ceph daemon osd.3 dump_blocked_ops

shows the blocked operations, for example:

[root@ceph-osd-01 ~]# ceph daemon osd.3 dump_blocked_ops
{
    "ops": [],
    "complaint_time": 30.000000,
    "num_blocked_ops": 0
}
[root@ceph-osd-01 ~]#
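
When several OSDs have to be checked, the same JSON can be inspected from a script. The following is a minimal Python sketch, assuming the output has exactly the shape shown above (ops, complaint_time, num_blocked_ops). The helper name summarize_blocked_ops is hypothetical, and the JSON is pasted in rather than read from a live OSD:

```python
import json

# Output of `ceph daemon osd.3 dump_blocked_ops`, copied from the example above.
SAMPLE = """
{
    "ops": [],
    "complaint_time": 30.000000,
    "num_blocked_ops": 0
}
"""

def summarize_blocked_ops(json_text):
    """Return (num_blocked_ops, complaint_time, descriptions) from the JSON text."""
    data = json.loads(json_text)
    # Assumption: each blocked op carries a "description" field, as the
    # dump_historic_ops entries do; a missing field is reported as "?".
    descriptions = [op.get("description", "?") for op in data.get("ops", [])]
    count = data.get("num_blocked_ops", len(descriptions))
    return count, data.get("complaint_time"), descriptions

count, complaint_time, descriptions = summarize_blocked_ops(SAMPLE)
print(f"{count} blocked ops (complaint time {complaint_time}s)")
for description in descriptions:
    print("  ", description)
```

On a real node the text would come from running the command on the host that owns the OSD, since ceph daemon talks to the local admin socket.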

ceph daemon osd.<id> dump_historic_ops

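reports the slowest operations seen recently. The time window is set by osd_op_history_duration (600 seconds by default, so roughly the last 10 minutes) and the number of entries kept by osd_op_history_size (20 by default).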

It also shows the client, the duration of the operation, and the object involved, for example:

            "description": "osd_op(client.173092581.0:166224247 8.c0 8:034d4294:::rbd_data.35f31a7ea1cd63.00000000000368c4:head [sparse-read 3305472~516096] snapc 0=[] ondisk+read+known_if_redirected e1198650)",
            "initiated_at": "2019-02-05 14:24:55.541494",
            "age": 13.102105,
            "duration": 0.609200,
            "type_data": {
                "flag_point": "started",
                "client_info": {
                    "client": "client.173092581",
                    "client_addr": "192.168.61.120:0/1047816369",
                    "tid": 166224247

In this case the client is 192.168.61.120 (cld-np-10) and the object belongs to the RBD image whose block name prefix is rbd_data.35f31a7ea1cd63
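
The client address and the RBD prefix can also be pulled out of an entry mechanically. This is a minimal Python sketch, assuming each entry has the fields shown in the excerpt above and that client_addr is an IPv4 address followed by ":port/nonce". The function name describe_op and the regular expression are illustrative, not part of Ceph:

```python
import re

# One dump_historic_ops entry, reduced to the fields visible in the excerpt above.
OP = {
    "description": "osd_op(client.173092581.0:166224247 8.c0 "
                   "8:034d4294:::rbd_data.35f31a7ea1cd63.00000000000368c4:head "
                   "[sparse-read 3305472~516096] snapc 0=[] "
                   "ondisk+read+known_if_redirected e1198650)",
    "duration": 0.609200,
    "type_data": {
        "client_info": {
            "client": "client.173092581",
            "client_addr": "192.168.61.120:0/1047816369",
        },
    },
}

# RBD data objects are named <block_name_prefix>.<object number in hex>.
RBD_OBJECT = re.compile(r"(rbd_data\.[0-9a-f]+)\.[0-9a-f]+:")

def describe_op(op):
    """Extract client, client IP, RBD prefix and duration from one op entry."""
    client_info = op["type_data"]["client_info"]
    match = RBD_OBJECT.search(op["description"])
    return {
        "client": client_info["client"],
        # Assumes an IPv4 address: everything before the first ":" is the IP.
        "client_ip": client_info["client_addr"].split(":")[0],
        "rbd_prefix": match.group(1) if match else None,
        "duration": op["duration"],
    }

info = describe_op(OP)
print(info["client_ip"], info["rbd_prefix"], info["duration"])
```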

To find out which image it is:

[root@ceph-mon-01 ~]#  for rbd in $(rbd ls -p volumes-prod); do rbd info volumes-prod/$rbd; done | grep -B3 -A2 rbd_data.35f31a7ea1cd63
rbd image 'volume-24664fe3-a84c-41cd-b403-94adedc4adf2':
	size 1000 GB in 256000 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.35f31a7ea1cd63
	format: 2
	features: layering
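
The grep above relies on the image name sitting three lines before block_name_prefix. A slightly more robust variant is to parse the concatenated rbd info output and build a prefix-to-image map. This is a minimal Python sketch, assuming the text format shown above. The function name images_by_prefix is hypothetical, and the "volume-" prefix is taken from the image name in this example:

```python
import re

# Output of `rbd info` for one image, copied from the example above.
RBD_INFO = """\
rbd image 'volume-24664fe3-a84c-41cd-b403-94adedc4adf2':
\tsize 1000 GB in 256000 objects
\torder 22 (4096 kB objects)
\tblock_name_prefix: rbd_data.35f31a7ea1cd63
\tformat: 2
\tfeatures: layering
"""

def images_by_prefix(rbd_info_text):
    """Map each block_name_prefix to the image name that precedes it."""
    mapping = {}
    image = None
    for line in rbd_info_text.splitlines():
        header = re.match(r"rbd image '([^']+)':", line)
        if header:
            image = header.group(1)
            continue
        prefix = re.match(r"\s*block_name_prefix:\s*(\S+)", line)
        if prefix and image is not None:
            mapping[prefix.group(1)] = image
    return mapping

image = images_by_prefix(RBD_INFO)["rbd_data.35f31a7ea1cd63"]
# In this example the image name is "volume-" followed by the Cinder volume ID.
volume_id = image[len("volume-"):] if image.startswith("volume-") else None
print(image, volume_id)
```

The same function works on the output of the whole for loop, because every image's block starts with its own "rbd image" header line.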

So it is the Cinder volume 24664fe3-a84c-41cd-b403-94adedc4adf2, which is indeed attached to a VM hosted on cld-np-10:

[root@cld-ctrl-01 ~]# cinder list --all | grep 24664fe3-a84c-41cd-b403-94adedc4adf2 
| 24664fe3-a84c-41cd-b403-94adedc4adf2 | 1c587619a84f417eabc011321fd559ec | in-use         | data004                   | 1000  | ceph             | false    | dfa2175b-8cdb-464d-a216-068b0cd8fc26 |
[root@cld-ctrl-01 ~]# nova show dfa2175b-8cdb-464d-a216-068b0cd8fc26 | grep cld
| OS-EXT-SRV-ATTR:host                 | cld-np-10.cloud.pd.infn.it                                                                                                                                       |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | cld-np-10.cloud.pd.infn.it                                                                                                                                       |
| flavor                               | cldareapd.32cores64GB25GB (f30e59ba-b207-4d58-bc35-709208f7c6b9)                                                                                                 |
[root@cld-ctrl-01 ~]# 
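
The final check, that the hypervisor reported by nova matches the client host seen in the OSD dump, can be scripted as well. This is a minimal Python sketch, assuming the two-column table layout shown above. The function name parse_property_table is hypothetical, and the rows are copied from the example with the padding trimmed:

```python
# Rows of the `nova show` output above, with the column padding trimmed.
NOVA_SHOW = """\
| OS-EXT-SRV-ATTR:host                 | cld-np-10.cloud.pd.infn.it |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | cld-np-10.cloud.pd.infn.it |
"""

def parse_property_table(text):
    """Turn '| property | value |' rows into a dict, skipping border lines."""
    fields = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0]:
            fields[cells[0]] = cells[1]
    return fields

props = parse_property_table(NOVA_SHOW)
hypervisor = props["OS-EXT-SRV-ATTR:host"]
short_name = hypervisor.split(".")[0]
print(short_name == "cld-np-10")
```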

Running ceph daemon osd.<id> dump_historic_ops or ceph daemon osd.<id> dump_ops_in_flight shows a set of operations and, for each one, the list of events it went through. The events are described in the Ceph documentation on troubleshooting OSDs, in the section on debugging slow requests.

progetti/cloud-areapd/ceph/slow_requests_debugging.txt · Last modified: 2019/02/05 13:34 by sgaravat@infn.it
