| + | ====== Replace a OSD ====== | ||
| + | ==== Replace a FileStore OSD ==== | ||
| + | |||
| + | Let's suppose that osd.27 (device /dev/sdj) is broken. | ||
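
If you first need to identify which OSD is down and on which host it lives, the standard status commands are enough; a minimal sketch (osd.27 is just the example used in this page):

<code bash>
# Show only the OSDs that are currently down, with their position in the CRUSH tree
ceph osd tree down

# More detail about the failure
ceph health detail | grep osd.27
</code>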
| + | |||
| + | Let's verify that this is a filestore OSD: | ||
| + | <code bash> | ||
| + | # ceph osd metadata 27 | grep osd_ob | ||
| + | " | ||
| + | </ | ||
| + | |||
| + | |||
| + | |||
| + | Let's suppose that this OSD is using /dev/sdb3 for the journal: | ||
| + | |||
| + | <code bash> | ||
| + | [root@ceph-osd-03 ~]# ceph-disk list | grep osd.27 | ||
| + | / | ||
| + | </ | ||
| + | |||
| + | |||
| + | The following operations should be done to remove it from ceph: | ||
| + | |||
| + | <code bash> | ||
| + | ceph osd crush reweight osd.27 0 | ||
| + | </ | ||
| + | |||
| + | Wait that the status is HEALTH-OK. | ||
| + | Then: | ||
| + | |||
| + | < | ||
| + | ceph osd out osd.27 | ||
| + | ceph osd crush remove osd.27 | ||
| + | systemctl stop ceph-osd@27.service | ||
| + | ceph auth del osd.27 | ||
| + | ceph osd rm osd.27 | ||
| + | umount / | ||
| + | </ | ||
| + | |||
| + | Replace the disk. | ||
| + | |||
| + | Run the prepare command (make sure to specify the right partition of the SSD disk): | ||
| + | |||
| + | <code bash> | ||
| + | ceph osd set noout | ||
| + | |||
| + | ceph-disk prepare --zap --cluster ceph --cluster-uuid 8162f291-00b6-4b40-a8b4-1981a8c09b64 --filestore --fs-type xfs /dev/sdj /dev/sdb3 | ||
| + | </ | ||
| + | |||
| + | Then enable the start of OSD at boot time it: | ||
| + | |||
| + | <code bash> | ||
| + | ceph-disk activate /dev/sdj1 | ||
| + | </ | ||
| + | |||
| + | |||
| + | If everything is ok, re-enable the data balance: | ||
| + | |||
| + | <code bash> | ||
| + | ceph osd unset noout | ||
| + | </ | ||
| + | |||
| + | |||
| + | ==== Replace a BlueStore OSD ==== | ||
| + | |||
| + | Let's suppose that osd.14 (/dev/vdd) is broken. | ||
| + | |||
| + | Let's verify that this is a Bluestore OSD: | ||
| + | |||
| + | <code bash> | ||
| + | # ceph osd metadata 14 | grep osd_ob | ||
| + | " | ||
| + | </ | ||
| + | |||
| + | |||
| + | Let's find the relevant devices: | ||
| + | |||
| + | <code bash> | ||
| + | |||
| + | [root@c-osd-5 /]# ceph-bluestore-tool show-label --path / | ||
| + | infering bluefs devices from bluestore path | ||
| + | { | ||
| + | "/ | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | }, | ||
| + | "/ | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | } | ||
| + | } | ||
| + | </ | ||
| + | |||
| + | Let's find the volume groups used for the block and block.db: | ||
| + | < | ||
| + | |||
| + | [root@c-osd-5 /]# ls -l / | ||
| + | lrwxrwxrwx 1 ceph ceph 27 May 13 15:34 / | ||
| + | [root@c-osd-5 /]# ls -l / | ||
| + | lrwxrwxrwx 1 ceph ceph 24 May 13 15:34 / | ||
| + | [root@c-osd-5 /]# | ||
| + | </ | ||
| + | |||
| + | |||
| + | Let's verify that vdd is indeed the physical volume used for this OSD: | ||
| + | |||
| + | <code bash> | ||
| + | [root@c-osd-5 /]# vgdisplay -v ceph-block-14 | ||
| + | --- Volume group --- | ||
| + | VG Name | ||
| + | System ID | ||
| + | Format | ||
| + | Metadata Areas 1 | ||
| + | Metadata Sequence No 15 | ||
| + | VG Access | ||
| + | VG Status | ||
| + | MAX LV 0 | ||
| + | Cur LV 1 | ||
| + | Open LV 1 | ||
| + | Max PV 0 | ||
| + | Cur PV 1 | ||
| + | Act PV 1 | ||
| + | VG Size < | ||
| + | PE Size 4.00 MiB | ||
| + | Total PE 25599 | ||
| + | Alloc PE / Size 25599 / <100.00 GiB | ||
| + | Free PE / Size 0 / 0 | ||
| + | VG UUID | ||
| + | |||
| + | --- Logical volume --- | ||
| + | LV Path / | ||
| + | LV Name block-14 | ||
| + | VG Name ceph-block-14 | ||
| + | LV UUID hu4Xop-481K-BJyP-b473-PjEW-OQFT-oziYnc | ||
| + | LV Write Access | ||
| + | LV Creation host, time c-osd-5.novalocal, | ||
| + | LV Status | ||
| + | # open 4 | ||
| + | LV Size <100.00 GiB | ||
| + | Current LE 25599 | ||
| + | Segments | ||
| + | Allocation | ||
| + | Read ahead sectors | ||
| + | - currently set to 8192 | ||
| + | Block device | ||
| + | |||
| + | --- Physical volumes --- | ||
| + | PV Name / | ||
| + | PV UUID | ||
| + | PV Status | ||
| + | Total PE / Free PE 25599 / 0 | ||
| + | |||
| + | [root@c-osd-5 /]# | ||
| + | </ | ||
| + | |||
| + | |||
| + | The following operations should be done to remove it from ceph: | ||
| + | |||
| + | <code bash> | ||
| + | ceph osd crush reweight osd.14 0 | ||
| + | </ | ||
| + | |||
| + | This will trigger a data movement from that OSD (ceph status will report many objects misplaced) | ||
| + | |||
| + | |||
| + | Wait until there are no more objects misplaced | ||
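
Instead of polling by hand, the wait can be scripted; a minimal sketch:

<code bash>
# Loop until "ceph -s" no longer reports misplaced objects
while ceph -s | grep -q misplaced; do
    sleep 60
done
</code>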
| + | |||
| + | Then: | ||
| + | |||
| + | <code bash> | ||
| + | ceph osd out osd.14 | ||
| + | </ | ||
| + | |||
| + | Verifichiamo che si possa " | ||
| + | |||
| + | <code bash> | ||
| + | [root@ceph-mon-01 ~]# ceph osd safe-to-destroy 14 | ||
| + | OSD(s) 14 are safe to destroy without reducing data durability. | ||
| + | </ | ||
| + | |||
| + | |||
| + | |||
| + | <code bash> | ||
| + | [root@ceph-osd-02 ~]# systemctl kill ceph-osd@14 | ||
| + | [root@ceph-osd-02 ~]# ceph osd destroy 14 --yes-i-really-mean-it | ||
| + | [root@ceph-osd-02 ~]# umount / | ||
| + | </ | ||
| + | |||
| + | |||
| + | |||
| + | Cancelliamo il volume group: | ||
| + | |||
| + | <code bash> | ||
| + | [root@c-osd-5 /]# vgremove ceph-block-14 | ||
| + | Do you really want to remove volume group " | ||
| + | Do you really want to remove active logical volume ceph-block-14/ | ||
| + | Logical volume " | ||
| + | Volume group " | ||
| + | [root@c-osd-5 /]# | ||
| + | </ | ||
| + | |||
| + | Sostituiamo il disco. Supponiamo che quello nuovo si chiami sempre vdd. | ||
| + | |||
| + | Ricreo volume group e logical volume: | ||
| + | |||
| + | <code bash> | ||
| + | [root@c-osd-5 /]# vgcreate ceph-block-14 /dev/vdd | ||
| + | Physical volume "/ | ||
| + | Volume group " | ||
| + | [root@c-osd-5 /]# lvcreate -l 100%FREE -n block-14 ceph-block-14 | ||
| + | Logical volume " | ||
| + | [root@c-osd-5 /]# | ||
| + | </ | ||
| + | |||
| + | Alla fine ricreiamo l'OSD: | ||
| + | |||
| + | <code bash> | ||
| + | ceph osd set norebalance | ||
| + | ceph osd set nobackfill | ||
| + | |||
| + | |||
| + | [root@c-osd-5 /]# ceph-volume lvm create --bluestore --data ceph-block-14/ | ||
| + | </ | ||
| + | |||
| + | Dopo un po`, quando non ci sono piu` pg in peering: | ||
| + | |||
| + | <code bash> | ||
| + | ceph osd crush reweight osd.14 5.45609 | ||
| + | |||
| + | ceph osd unset nobackfill | ||
| + | ceph osd unset norebalance | ||
| + | </ | ||
