Creating a Ceph OSD remove task
The workflow of creating a Ceph OSD removal task includes the following steps:
-
Removing obsolete nodes or disks from the spec.nodes section of the CephDeployment custom resource (CR) as described in Nodes parameters.
Note
Record the names of the removed nodes and devices, or the device paths, exactly as they were specified in CephDeployment. You will need them in the following steps.
-
Creating a YAML template for the CephOsdRemoveTask CR. For details, see CephOsdRemoveTask custom resource.
  - If CephOsdRemoveTask lists the Ceph OSDs to remove in the proper format, that information is validated to catch human error and prevent removal of the wrong Ceph OSD.
  - If the nodes section of CephOsdRemoveTask is empty, the Pelagia LCM Controller automatically detects Ceph OSDs for removal, if any. Auto-detection relies on both the Rook CephCluster CR and the state of the Ceph cluster itself.
Once the validation or auto-detection completes, the full information about the Ceph OSDs to remove appears in the CephOsdRemoveTask object: the hosts they belong to, OSD IDs, disks, partitions, and so on. The task then moves to the ApproveWaiting phase and stays there until the cloud operator manually sets the approve flag in the spec.
Example of the CephOsdRemoveTask custom resource
    apiVersion: lcm.mirantis.com/v1alpha1
    kind: CephOsdRemoveTask
    metadata:
      name: remove-osd-3-4-task
      namespace: pelagia
    spec:
      nodes:
        worker-3:
          cleanupByDevice:
          - device: sdb
          - device: /dev/disk/by-path/pci-0000:00:1t.9
Example of the CephOsdRemoveTask custom resource that finds all Ceph OSDs ready for removal
    apiVersion: lcm.mirantis.com/v1alpha1
    kind: CephOsdRemoveTask
    metadata:
      generateName: remove-osds
      namespace: pelagia
    spec:
      nodes: {}
-
Manually adding an affirmative approve flag to the CephOsdRemoveTask spec. Once the flag is set, reconciliation by the Pelagia Controllers and Rook Ceph Operator pauses until the task is handled, and Pelagia does the following:
  - Stops regular Rook Ceph Operator orchestration. The Pelagia Deployment Controller also pauses its reconciliation.
  - Removes Ceph OSDs.
  - Runs batch jobs to clean up the device, if possible.
  - Removes host information from the Ceph cluster if the entire Ceph node is removed.
  - Marks the task with the appropriate result and a description of any issues that occurred.
Note
If the task completes successfully, Rook Ceph Operator and Pelagia Deployment Controller reconciliation resumes. Otherwise, it remains paused until the issue is resolved.
-
Reviewing the Ceph OSD removal status. For details, see Status fields.
-
Manually removing device cleanup jobs.
Note
Device cleanup jobs are not removed automatically. They are kept in the Pelagia namespace along with the pods that contain information about the executed actions. The jobs have the following labels:
    labels:
      app: pelagia-lcm-cleanup-disks
      host: <HOST-NAME>
      osd: <OSD-ID>
      rook-cluster: <ROOK-CLUSTER-NAME>
Additionally, jobs are labeled with the names of the disks to clean up, such as sdb=true. You can remove a single job or a group of jobs using any of the labels described above, such as the host or disk label.
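The approval, status review, and cleanup steps above can be driven from the command line. The following shell sketch shows one possible way to do it. It rests on several assumptions that this page does not confirm: that kubectl can address the CR as cephosdremovetask, that the approval flag is the boolean field spec.approve, and that both the tasks and the cleanup jobs live in the pelagia namespace used in the examples. The function names are hypothetical helpers, not part of Pelagia.

```shell
#!/bin/sh
# Sketch only. Assumptions (not confirmed by this page):
#   - kubectl can address the CR as "cephosdremovetask"
#   - the namespace is "pelagia", as in the example manifests
#   - the approval flag is the boolean field spec.approve
# Set KUBECTL=echo to print the commands instead of running them.
NAMESPACE="${NAMESPACE:-pelagia}"

# Runs kubectl, or whatever command KUBECTL points to.
run_kubectl() {
  "${KUBECTL:-kubectl}" "$@"
}

# Approve a task so that the removal starts. $1 is the task name.
approve_task() {
  run_kubectl patch cephosdremovetask "$1" -n "$NAMESPACE" \
    --type merge -p '{"spec":{"approve":true}}'
}

# Print the whole task object, including its status. $1 is the task name.
show_task() {
  run_kubectl get cephosdremovetask "$1" -n "$NAMESPACE" -o yaml
}

# Build a label selector for cleanup jobs.
# $1 is a label key (host, osd, rook-cluster, or a disk name), $2 is its value.
cleanup_selector() {
  printf 'app=pelagia-lcm-cleanup-disks,%s=%s' "$1" "$2"
}

# Delete the cleanup jobs that match the app label plus one extra label.
delete_cleanup_jobs() {
  run_kubectl delete job -n "$NAMESPACE" -l "$(cleanup_selector "$1" "$2")"
}
```

After sourcing the file, approve_task remove-osd-3-4-task approves the first example task, show_task remove-osd-3-4-task prints it together with its status, and delete_cleanup_jobs host worker-3 removes the cleanup jobs for that host. Setting KUBECTL=echo prints each command without contacting the cluster, which is a cheap way to check the selector before deleting anything.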