Add, remove, or reconfigure Ceph OSDs
Pelagia Lifecycle Management (LCM) Controller simplifies Ceph cluster management by automating LCM operations. This section describes how to add, remove, or reconfigure Ceph OSDs.
Add a Ceph OSD
-
Manually prepare the required devices on the existing node.
-
Optional. If you want to add a Ceph OSD on top of a raw device that already exists on a node or is hot-plugged, add the required device using the following guidelines:
- You can add a raw device to a node during node deployment.
- If a node supports adding devices without a node reboot, you can hot plug a raw device to a node.
- If a node does not support adding devices without a node reboot, you can add a raw device while the node is shut down.
-
Open the CephDeployment custom resource (CR) for editing:

    kubectl -n pelagia edit cephdpl

-
In one of the following sections, specify the parameters for the new Ceph OSD:
- nodes.<nodeName>.devices
- nodes.<nodeName>.deviceFilter
- nodes.<nodeName>.devicePathFilter
For a description of the parameters, see CephDeployment: Nodes parameters.
An example configuration of the nodes section with a new device:

    nodes:
    - name: storage-worker-52
      roles:
      - mon
      - mgr
      devices:
      - config: # existing item
          deviceClass: hdd
        fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
      - config: # new item
          deviceClass: hdd
        fullPath: /dev/disk/by-id/scsi-0ATA_HGST_HUS724040AL_PN1334PEHN1VBC

Warning
We highly recommend using the non-wwn by-id symlinks to specify storage devices in the devices list. For details, see Architecture: Addressing Ceph devices.
-
Verify that the Ceph OSD on the specified node is successfully deployed. The status.healthReport.cephDaemons.cephDaemons section of the CephDeploymentHealth CR should not contain any issues:

    kubectl -n pelagia get cephdeploymenthealth -o yaml

For example:

    status:
      healthReport:
        cephDaemons:
          cephDaemons:
            osd:
              info:
              - 3 osds, 3 up, 3 in
              status: ok

-
Verify that the desired Ceph OSD pod is Running:

    kubectl -n rook-ceph get pod -l app=rook-ceph-osd -o wide | grep <nodeName>
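The two verification steps above can also be scripted. The following is a minimal sketch, not part of Pelagia: the function names are hypothetical, and it assumes that the field path shown in the example output is stable and that the pod phase (status.phase) is an acceptable stand-in for the STATUS column printed by kubectl.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Pelagia.

# Print the Ceph OSD status reported by the CephDeploymentHealth CR,
# for example "ok".
osd_health_status() {
  kubectl -n pelagia get cephdeploymenthealth \
    -o jsonpath='{.items[*].status.healthReport.cephDaemons.cephDaemons.osd.status}'
}

# Succeed only if at least one Ceph OSD pod is scheduled on the node
# and every such pod is in the Running phase.
# usage: osd_pods_running_on_node <nodeName>
osd_pods_running_on_node() {
  kubectl -n rook-ceph get pod -l app=rook-ceph-osd \
    --field-selector "spec.nodeName=$1" \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase --no-headers |
    awk '
      { seen = 1; if ($2 != "Running") bad = 1 }
      END { if (seen && !bad) exit 0; exit 1 }'
}
```

For example, `osd_pods_running_on_node storage-worker-52 && echo "OSD pods are running"` prints the message only when every Ceph OSD pod on that node is Running.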
Remove a Ceph OSD
Note
Ceph OSD removal requires a CephOsdRemoveTask CR. For a workflow overview, see
High-level workflow of Ceph OSD or node removal.
Warning
Ceph OSD removal cannot be performed for Ceph pools that use the
non-recommended replicated.size of less than 3. The minimal replica
size equals half of the specified replicated.size, rounded up.
For example, if replicated.size is 2, the minimal replica size is
1, and if replicated.size is 3, the minimal replica size is 2.
A minimal replica size of 1 allows Ceph to have PGs with only one
Ceph OSD in the acting state, which may cause a PG_TOO_DEGRADED
health warning that blocks Ceph OSD removal. We recommend setting
replicated.size to 3 for each Ceph pool.
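The rounding rule from the warning above can be checked with integer arithmetic. This is only an illustrative sketch; min_replica_size is a hypothetical helper name, not a Pelagia or Ceph command.

```shell
#!/bin/sh
# Minimal replica size = replicated.size / 2, rounded up.
# Integer division of (size + 1) by 2 gives the rounded-up half.
min_replica_size() {
  echo $(( ($1 + 1) / 2 ))
}

min_replica_size 2   # prints 1
min_replica_size 3   # prints 2
```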
-
Open the CephDeployment CR for editing:

    kubectl -n pelagia edit cephdpl

-
Remove the required Ceph OSD specification from the spec.nodes.<nodeName>.devices list.

An example configuration of the nodes section with the device to remove:

    nodes:
    - name: storage-worker-52
      roles:
      - mon
      - mgr
      devices:
      - config:
          deviceClass: hdd
        fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
      - config: # remove the entire item entry from devices list
          deviceClass: hdd
        fullPath: /dev/disk/by-id/scsi-0ATA_HGST_HUS724040AL_PN1334PEHN1VBC

-
Create a YAML template for the CephOsdRemoveTask CR. Select from the following options:

- Remove Ceph OSD by device name, by-path symlink, or by-id symlink:

      apiVersion: lcm.mirantis.com/v1alpha1
      kind: CephOsdRemoveTask
      metadata:
        name: remove-osd-<nodeName>-task
        namespace: pelagia
      spec:
        nodes:
          <nodeName>:
            cleanupByDevice:
            - device: sdb

  Warning
  We do not recommend setting device name or device by-path symlink in the cleanupByDevice field as these identifiers are not persistent and can change at node boot. Remove Ceph OSDs with by-id symlinks or use cleanupByOsdId instead. For details, see Architecture: Addressing Ceph devices.

  Note
  If a device was physically removed from a node, cleanupByDevice is not supported. Therefore, use cleanupByOsdId instead.

- Remove Ceph OSD by OSD ID:

      apiVersion: lcm.mirantis.com/v1alpha1
      kind: CephOsdRemoveTask
      metadata:
        name: remove-osd-<nodeName>-task
        namespace: pelagia
      spec:
        nodes:
          <nodeName>:
            cleanupByOsdId:
            - id: 2

-
Apply the template:

    kubectl apply -f remove-osd-<nodeName>-task.yaml

-
Verify that the corresponding request has been created:

    kubectl -n pelagia get cephosdremovetask remove-osd-<nodeName>-task

-
Verify that the removeInfo section appeared in the CephOsdRemoveTask CR status:

    kubectl -n pelagia get cephosdremovetask remove-osd-<nodeName>-task -o yaml

Example of system response:

    status:
      removeInfo:
        cleanupMap:
          storage-worker-52:
            osdMapping:
              "2":
                deviceMapping:
                  sdb:
                    path: "/dev/disk/by-path/pci-0000:00:1t.9"
                    partition: "/dev/ceph-b-vg_sdb/osd-block-b-lv_sdb"
                    type: "block"
                    class: "hdd"
                    zapDisk: true

-
Verify that the cleanupMap section matches the required removal and wait for the ApproveWaiting phase to appear in status:

    kubectl -n pelagia get cephosdremovetask remove-osd-<nodeName>-task -o yaml

Example of system response:

    status:
      phase: ApproveWaiting

-
Edit the CephOsdRemoveTask CR and set the approve flag to true:

    kubectl -n pelagia edit cephosdremovetask remove-osd-<nodeName>-task

For example:

    spec:
      approve: true

-
Review the following status fields of the Ceph LCM CR processing:
- status.phase - current state of request processing
- status.messages - description of the current phase
- status.conditions - full history of request processing before the current phase
- status.removeInfo.issues and status.removeInfo.warnings - error and warning messages that occurred during request processing, if any
-
Verify that the CephOsdRemoveTask has been completed. For example:

    status:
      phase: Completed # or CompletedWithWarnings if there are non-critical issues

-
Remove the device cleanup jobs:

    kubectl delete jobs -n pelagia -l app=pelagia-lcm-cleanup-disks
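The removal steps above can be sketched as a set of shell helpers. This is a hedged illustration under several assumptions: the function names, the NS and POLL_INTERVAL variables, and the retry limit are hypothetical, and approving through kubectl patch with a merge patch is assumed to be equivalent to setting spec.approve in an editor. Review the cleanupMap section manually before approving.

```shell
#!/bin/sh
# Hypothetical helpers for driving a CephOsdRemoveTask; not part of Pelagia.
NS=pelagia

# Print a CephOsdRemoveTask manifest that removes one Ceph OSD by ID.
# usage: render_remove_task <nodeName> <osdId>
render_remove_task() {
  cat <<EOF
apiVersion: lcm.mirantis.com/v1alpha1
kind: CephOsdRemoveTask
metadata:
  name: remove-osd-$1-task
  namespace: $NS
spec:
  nodes:
    $1:
      cleanupByOsdId:
      - id: $2
EOF
}

# Print the current status.phase of a task.
# usage: task_phase <taskName>
task_phase() {
  kubectl -n "$NS" get cephosdremovetask "$1" -o jsonpath='{.status.phase}'
}

# Poll until the task reaches any of the listed phases; give up after
# 120 attempts (an arbitrary limit chosen for this sketch).
# usage: wait_for_phase <taskName> <phase> [<phase>...]
wait_for_phase() {
  task=$1
  shift
  tries=0
  while [ "$tries" -lt 120 ]; do
    phase=$(task_phase "$task")
    for want in "$@"; do
      if [ "$phase" = "$want" ]; then
        return 0
      fi
    done
    tries=$((tries + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "timed out waiting for $task (last phase: $phase)" >&2
  return 1
}

# Set spec.approve to true (assumed equivalent to editing the CR by hand).
# usage: approve_task <taskName>
approve_task() {
  kubectl -n "$NS" patch cephosdremovetask "$1" --type merge \
    -p '{"spec":{"approve":true}}'
}

# Example flow (commented out so that sourcing this file changes nothing):
#   render_remove_task storage-worker-52 2 > remove-osd-storage-worker-52-task.yaml
#   kubectl apply -f remove-osd-storage-worker-52-task.yaml
#   wait_for_phase remove-osd-storage-worker-52-task ApproveWaiting
#   # review status.removeInfo.cleanupMap here, then:
#   approve_task remove-osd-storage-worker-52-task
#   wait_for_phase remove-osd-storage-worker-52-task Completed CompletedWithWarnings
```

Polling with a timeout is used instead of a blocking watch so that the sketch depends only on kubectl get and kubectl patch.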
Reconfigure a Ceph OSD
There is no hot reconfiguration procedure for existing Ceph OSDs. To reconfigure an existing Ceph OSD, follow the steps below:
- Remove a Ceph OSD from the Ceph cluster as described in Remove a Ceph OSD.
- Add the same Ceph OSD but with a modified configuration as described in Add a Ceph OSD.