CephDeployment Custom Resource#
This section describes how to configure a Ceph cluster using the
CephDeployment (cephdeployments.lcm.mirantis.com) custom resource (CR).
The CephDeployment CR spec defines the nodes to deploy as Ceph components.
Based on the role definitions in the CephDeployment CR, Pelagia Deployment
Controller automatically labels nodes for Ceph Monitors and Managers. Ceph
OSDs are deployed based on the devices parameter defined for each Ceph node.
For the default CephDeployment CR, see the following example:
Example configuration of Ceph specification
apiVersion: lcm.mirantis.com/v1alpha1
kind: CephDeployment
metadata:
name: pelagia-ceph
namespace: pelagia
spec:
nodes:
- name: cluster-storage-controlplane-0
roles:
- mgr
- mon
- mds
- name: cluster-storage-controlplane-1
roles:
- mgr
- mon
- mds
- name: cluster-storage-controlplane-2
roles:
- mgr
- mon
- mds
- name: cluster-storage-worker-0
roles: []
devices:
- config:
deviceClass: ssd
fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231434939
- name: cluster-storage-worker-1
roles: []
devices:
- config:
deviceClass: ssd
fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231440912
- name: cluster-storage-worker-2
roles: []
devices:
- config:
deviceClass: ssd
fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231443409
pools:
- default: true
deviceClass: ssd
name: kubernetes
replicated:
size: 3
objectStorage:
rgw:
name: rgw-store
dataPool:
deviceClass: ssd
replicated:
size: 3
metadataPool:
deviceClass: ssd
replicated:
size: 3
gateway:
allNodes: false
instances: 3
port: 8081
securePort: 8443
preservePoolsOnDelete: false
sharedFilesystem:
cephFS:
- name: cephfs-store
dataPools:
- name: cephfs-pool-1
deviceClass: ssd
replicated:
size: 3
metadataPool:
deviceClass: ssd
replicated:
size: 3
metadataServer:
activeCount: 1
activeStandby: false
Configure a Ceph cluster with CephDeployment#
- Select from the following options:
  - If you do not have a Ceph cluster yet, create cephdeployment.yaml for editing.
  - If the Ceph cluster is already deployed, open the CephDeployment CR for editing:
kubectl -n pelagia edit cephdpl
- Using the tables below, configure the Ceph cluster as required.
- Select from the following options:
  - If you are creating a Ceph cluster, save the updated CephDeployment template to the corresponding file and apply the file to the cluster:
kubectl apply -f cephdeployment.yaml
  - If you are editing CephDeployment, save the changes and exit the text editor to apply them.
- Verify the CephDeployment reconcile status using the Status fields, for example, as shown below.
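To check the reconcile result from the command line, you can read the status fields directly. A minimal sketch, assuming the CephDeployment is named pelagia-ceph as in the example above:
kubectl -n pelagia get cephdpl pelagia-ceph -o jsonpath='{.status.phase}{"\n"}{.status.message}{"\n"}'
The phase and message fields are described in Status fields at the end of this section.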
CephDeployment configuration options#
The following subsections describe the CephDeployment parameters for advanced configuration.
General parameters #
Parameter | Description
---|---
network | Specifies access and public networks for the Ceph cluster. For details, see Network parameters.
nodes | Specifies the list of Ceph nodes. For details, see Nodes parameters. The nodes parameter is a list of Ceph node specifications. A list item can define the specification for a single node or for a group of nodes listed explicitly or selected by label. Both approaches can be combined.
pools | Specifies the list of Ceph pools. For details, see Pools parameters.
clients | List of Ceph clients. For details, see Clients parameters.
objectStorage | Specifies the parameters for Object Storage, such as RADOS Gateway, the Ceph Object Storage. Also specifies the RADOS Gateway Multisite configuration. For details, see RADOS Gateway parameters and RADOS Gateway Multisite parameters.
ingressConfig | Enables a custom ingress rule for public access to Ceph services, for example, Ceph RADOS Gateway. For details, see Configure Ceph Object Gateway TLS.
sharedFilesystem | Enables Ceph Filesystem. For details, see CephFS parameters.
rookConfig | String key-value parameter that allows overriding Ceph configuration options. For details, see RookConfig parameters.
healthCheck | Configures health checks and liveness probe settings for Ceph daemons. For details, see HealthCheck parameters.
extraOpts | Enables specification of extra options for a setup, including the deviceLabels parameter. For details, see ExtraOpts parameters.
mgr | Specifies a list of Ceph Manager modules to be enabled or disabled. For details, see Manager modules parameters. The balancer and pg_autoscaler modules are enabled by default.
dashboard | Enables the Ceph dashboard. Currently, Pelagia does not support the Ceph Dashboard. Defaults to false.
rbdMirror | Specifies the parameters for RBD Mirroring. For details, see RBD Mirroring parameters.
external | Enables the external Ceph cluster mode. If enabled, Pelagia reads a special Secret with the external Ceph cluster credentials and connects to that cluster.
Network parameters #
- clusterNet - specifies a Classless Inter-Domain Routing (CIDR) for the Ceph OSD replication network.
Warning
To avoid ambiguous behavior of Ceph daemons, do not specify 0.0.0.0/0 in clusterNet. Otherwise, Ceph daemons can select an incorrect public interface that can cause the Ceph cluster to become unavailable.
- publicNet - specifies a CIDR for the Ceph public network used for communication between Ceph daemons and their clients.
Warning
To avoid ambiguous behavior of Ceph daemons, do not specify 0.0.0.0/0 in publicNet. Otherwise, Ceph daemons can select an incorrect public interface that can cause the Ceph cluster to become unavailable.
Note
The clusterNet and publicNet parameters support multiple IP networks. For details, see Ops Guide: Enable Multinetworking.
Example configuration:
spec:
network:
clusterNet: 10.10.0.0/24
publicNet: 192.100.0.0/24
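The exact syntax for multiple IP networks is described in Ops Guide: Enable Multinetworking. The sketch below assumes the common comma-separated form and uses illustrative CIDRs only:
spec:
  network:
    clusterNet: 10.10.0.0/24,10.12.0.0/24
    publicNet: 192.100.0.0/24,192.168.100.0/24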
Nodes parameters #
- name - Mandatory. Specifies the following:
  - If the node spec applies to a single node, name is the name of the node on which the Ceph node should be deployed, for example, cluster-storage-worker-0.
  - If the node spec applies to a group of nodes, name is the group name, for example, group-rack-1. In that case, the Ceph node specification must contain either the nodeGroup or nodesByLabel field.
- nodeGroup - Optional. Specifies the list of nodes to which the Ceph node specification applies. For example:
spec:
  nodes:
  - name: group-1
    nodeGroup: [node-X, node-Y]
- nodesByLabel - Optional. Specifies a label expression that selects the group of nodes to which the Ceph node specification applies. For example:
spec:
  nodes:
  - name: group-1
    nodesByLabel: "ceph-storage-node=true,!ceph-control-node"
- roles - Optional. Specifies the mon, mgr, rgw, or mds daemons to be installed on a Ceph node. You can place the daemons on any nodes upon your decision. Consider the following recommendations:
  - The recommended number of Ceph Monitors in a Ceph cluster is 3. Therefore, at least three Ceph nodes must contain the mon item in the roles parameter.
  - The number of Ceph Monitors must be odd.
  - Do not add more than three Ceph Monitors at a time and wait until the Ceph cluster is Ready before adding more daemons.
  - For better HA and fault tolerance, the number of mgr roles must equal the number of mon roles. Therefore, we recommend labeling at least three Ceph nodes with the mgr role.
  - If rgw roles are not specified, all rgw daemons will spawn on the same nodes as mon daemons.
If a Ceph node contains a mon role, the Ceph Monitor Pod deploys on this node.
If a Ceph node contains a mgr role, it informs the Ceph Controller that a Ceph Manager can be deployed on the node. Rook Operator selects the first available node to deploy the Ceph Manager on. Pelagia supports deploying two Ceph Managers in total: one active and one stand-by.
If you assign the mgr role to the three recommended Ceph nodes, one backup Ceph node is available to redeploy a failed Ceph Manager in case of a node outage.
- monitorIP - Optional. If defined, specifies a custom IP address for the Ceph Monitor placed on the node. If not set, the Ceph Monitor on the node uses the default hostNetwork address of the node. The general recommendation is to use an IP address from the Ceph public network address range. For a usage sketch, see the example after this list.
Note
To update monitorIP, the corresponding Ceph Monitor daemon must be re-created.
config
- Mandatory. Specifies a map of device configurations that must contain a mandatorydeviceClass
parameter set tohdd
,ssd
, ornvme
. Applied for all OSDs on the current node. Can be overridden by theconfig
parameter of each device defined in thedevices
parameter.For details, see Rook documentation: OSD config settings.
- devices - Optional. Specifies the list of devices to use for Ceph OSD deployment. Includes the following parameters:
Note
We recommend using the fullPath field with by-id symlinks as persistent device identifiers. For details, see Addressing Ceph devices.
  - fullPath - a storage device symlink. Accepts the following values:
    - The device by-id symlink that contains the serial number of the physical device and does not contain wwn. For example, /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543.
    - The device by-path symlink, for example, /dev/disk/by-path/pci-0000:00:11.4-ata-3. We do not recommend specifying storage devices with by-path symlinks because such identifiers are not persistent and can change at node boot.
    This parameter is mutually exclusive with name.
  - name - a storage device name. Accepts the following values:
    - The device name, for example, sdc. We do not recommend specifying storage devices with device names because such identifiers are not persistent and can change at node boot.
    - The device by-id symlink that contains the serial number of the physical device and does not contain wwn. For example, /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543.
    - The device label from the extraOpts.deviceLabels section, which is generally used for templating the Ceph node specification for node groups. For details, see ExtraOpts parameters.
    This parameter is mutually exclusive with fullPath.
  - config - a map of device configurations that must contain a mandatory deviceClass parameter set to hdd, ssd, or nvme. The device class must be defined in a pool and can optionally contain a metadata device, for example:
spec:
  nodes:
  - name: <node-a>
    devices:
    - fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
      config:
        deviceClass: hdd
        metadataDevice: /dev/meta-1/nvme-dev-1
        osdsPerDevice: "2"
The underlying storage format to use for Ceph OSDs is BlueStore.
The metadataDevice parameter accepts a device name or a logical volume path for the BlueStore device. We recommend using logical volume paths created on nvme devices.
The osdsPerDevice parameter accepts string-type natural numbers and allows splitting one device into several Ceph OSD daemons. We recommend using this parameter only for ssd or nvme disks.
- deviceFilter - Optional. Specifies a regular expression over device names to use for Ceph OSD deployment. Mutually exclusive with devices and devicePathFilter. Requires the config parameter with deviceClass specified. For example:
spec:
  nodes:
  - name: <node-a>
    deviceFilter: "^sd[def]$"
    config:
      deviceClass: hdd
For more details, see Rook documentation: Storage selection settings.
- devicePathFilter - Optional. Specifies a regular expression over device paths to use for Ceph OSD deployment. Mutually exclusive with devices and deviceFilter. Requires the config parameter with deviceClass specified. For example:
spec:
  nodes:
  - name: <node-a>
    devicePathFilter: "^/dev/disk/by-id/scsi-SATA.+$"
    config:
      deviceClass: hdd
For more details, see Rook documentation: Storage selection settings.
- crush - Optional. Specifies the explicit key-value CRUSH topology for a node. For details, see Ceph documentation: CRUSH maps. Includes the following parameters:
  - datacenter - a physical data center that consists of rooms and handles data.
  - room - a room that accommodates one or more racks with hosts.
  - pdu - a power distribution unit (PDU) device that has multiple outputs and distributes electric power to racks located within a data center.
  - row - a row of computing racks inside room.
  - rack - a computing rack that accommodates one or more hosts.
  - chassis - a bare metal structure that houses or physically assembles hosts.
  - region - the geographic location of one or more Ceph Object instances within one or more zones.
  - zone - a logical group that consists of one or more Ceph Object instances.
Example configuration:
spec:
  nodes:
  - name: <node-a>
    crush:
      datacenter: dc1
      room: room1
      pdu: pdu1
      row: row1
      rack: rack1
      chassis: ch1
      region: region1
      zone: zone1
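The node parameters above can be combined in a single node item. The following sketch is illustrative only: the node name, Ceph Monitor IP address, and device symlink are placeholders rather than values from a real cluster:
spec:
  nodes:
  - name: cluster-storage-controlplane-0    # placeholder node name
    roles:
    - mon
    - mgr
    monitorIP: 192.100.0.11                 # assumed address from the Ceph public network range
    devices:
    - fullPath: /dev/disk/by-id/<unique_ID>
      config:
        deviceClass: ssd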
Pools parameters #
- name - Mandatory. Specifies the pool name as a prefix for each Ceph block pool. The resulting Ceph block pool name is <name>-<deviceClass>.
- useAsFullName - Optional. Enables the Ceph block pool to use only the name value as its name. The resulting Ceph block pool name is <name> without the deviceClass suffix.
- role - Optional. Specifies the pool role for Rockoon integration.
- preserveOnDelete - Optional. Enables skipping the Ceph pool deletion when the corresponding pools section item is removed. If a pools section item is removed with this flag enabled, the related CephBlockPool object is kept untouched and requires manual deletion on demand. Defaults to false.
- storageClassOpts - Optional. Allows configuring parameters of the storage class created for the RBD pool. Includes the following parameters:
  - default - Optional. Defines whether the pool and the dependent StorageClass must be set as default. Must be enabled for one pool only. Defaults to false.
  - mapOptions - Optional. Not updatable as it applies only once. Specifies custom rbd device map options to use with the StorageClass of the corresponding pool. Allows customizing the Kubernetes CSI driver interaction with Ceph RBD for the defined StorageClass. For available options, see Ceph documentation: Kernel RBD (KRBD) options.
  - unmapOptions - Optional. Not updatable as it applies only once. Specifies custom rbd device unmap options to use with the StorageClass of the corresponding pool. Allows customizing the Kubernetes CSI driver interaction with Ceph RBD for the defined StorageClass. For available options, see Ceph documentation: Kernel RBD (KRBD) options.
  - imageFeatures - Optional. Not updatable as it applies only once. Specifies a comma-separated list of RBD image features. For details, see Ceph documentation: Manage RADOS block device (RBD) images.
  - reclaimPolicy - Optional. Specifies the reclaim policy for the underlying StorageClass of the pool. Accepts Retain and Delete values. Defaults to Delete if not set.
  - allowVolumeExpansion - Optional. Not updatable as it applies only once. Enables expansion of persistent volumes based on the StorageClass of the corresponding pool. For details, see Kubernetes documentation: Resizing persistent volumes using Kubernetes.
Note
A Kubernetes cluster only supports increase of storage size.
- deviceClass - Mandatory. Specifies the device class for the defined pool. Common possible values are hdd, ssd, and nvme. Custom device classes are also allowed; refer to ExtraOpts parameters.
- replicated - Mutually exclusive with erasureCoded. Includes the following parameters:
  - size - the number of pool replicas.
  - targetSizeRatio - a float percentage from 0.0 to 1.0 that specifies the expected consumption of the total Ceph cluster capacity. The default values are as follows:
    - The default ratio of the Ceph Object Storage dataPool is 10.0%.
    - Target ratios for the pools required for Rockoon are described in Ops Guide: Integrate Pelagia with Rockoon.
Note
Mirantis recommends defining the target ratio with the target_size_ratio string field of the parameters section instead.
- erasureCoded - Enables the erasure-coded pool. For details, see Rook documentation: Erasure Coded RBD Pool and Ceph documentation: Erasure coded pool. The erasureCoded parameter is mutually exclusive with replicated.
- failureDomain - Optional. The failure domain across which the replicas or chunks of data will be spread. Set to host by default. The list of possible recommended values includes host, rack, room, and datacenter.
Caution
We do not recommend using the following intermediate topology keys: pdu, row, chassis. Consider the rack topology instead. The osd failure domain is prohibited.
- mirroring - Optional. Enables the mirroring feature for the defined pool. Includes the mode parameter that can be set to pool or image. For details, see Ops Guide: Enable RBD Mirroring.
- parameters - Optional. Specifies the key-value map for the parameters of the Ceph pool. For details, see Ceph documentation: Set Pool values.
- enableCrushUpdates - Optional. Enables automatic updates of the CRUSH map when the pool is created or updated. Defaults to false.
Example configuration of Pools specification
spec:
pools:
- name: kubernetes
deviceClass: hdd
replicated:
size: 3
parameters:
target_size_ratio: "10.0"
storageClassOpts:
default: true
preserveOnDelete: true
- name: kubernetes
deviceClass: nvme
erasureCoded:
codingChunks: 1
dataChunks: 2
failureDomain: host
- name: archive
useAsFullName: true
deviceClass: hdd
failureDomain: rack
replicated:
size: 3
As a result, the following Ceph pools will be created: kubernetes-hdd, kubernetes-nvme, and archive. A verification sketch is provided at the end of this subsection.
To configure additional required pools for Rockoon, see Ops Guide: Integrate Pelagia with Rockoon.
Caution
Since Ceph Pacific, the Ceph CSI driver does not propagate the 777 permission on the mount point of persistent volumes based on any StorageClass of a Ceph pool.
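To verify the resulting pools after the reconcile completes, you can run the Ceph CLI from the Pelagia toolbox. The sketch below assumes the toolbox runs as the pelagia-lcm-toolbox Deployment in the pelagia namespace; adjust the names to your installation:
kubectl -n pelagia exec deploy/pelagia-lcm-toolbox -- ceph osd pool ls
kubectl -n pelagia exec deploy/pelagia-lcm-toolbox -- ceph df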
Clients parameters #
- name - Mandatory. Ceph client name.
- caps - Mandatory. Key-value parameter with Ceph client capabilities. For details about caps, refer to Ceph documentation: Authorization (capabilities).
Example configuration of Clients specification
spec:
clients:
- name: test-client
caps:
mon: allow r, allow command "osd blacklist"
osd: profile rbd pool=kubernetes-nvme
RADOS Gateway parameters #
- name - Required. Ceph Object Storage instance name.
- dataPool - Required if zone.name is not specified. Mutually exclusive with zone. Must be used together with metadataPool.
The object storage data pool spec must only contain replicated or erasureCoded, deviceClass, and failureDomain parameters. The failureDomain parameter may be set to host, rack, room, or datacenter, defining the failure domain across which the data will be spread. The deviceClass must be explicitly defined. For dataPool, we recommend using an erasureCoded pool. For example:
spec:
  objectStorage:
    rgw:
      dataPool:
        deviceClass: hdd
        failureDomain: host
        erasureCoded:
          codingChunks: 1
          dataChunks: 2
- metadataPool - Required if zone.name is not specified. Mutually exclusive with zone. Must be used together with dataPool.
The object storage metadata pool spec must only contain replicated, deviceClass, and failureDomain parameters. The failureDomain parameter may be set to host, rack, room, or datacenter, defining the failure domain across which the data will be spread. The deviceClass must be explicitly defined. Can use only replicated settings. For example:
spec:
  objectStorage:
    rgw:
      metadataPool:
        deviceClass: hdd
        failureDomain: host
        replicated:
          size: 3
where replicated.size is the number of full copies of data on multiple nodes.
Warning
When using the non-recommended Ceph pools replicated.size of less than 3, Ceph OSD removal cannot be performed. The minimal replica size equals a rounded up half of the specified replicated.size.
For example, if replicated.size is 2, the minimal replica size is 1, and if replicated.size is 3, then the minimal replica size is 2. The replica size of 1 allows Ceph having PGs with only one Ceph OSD in the acting state, which may cause a PG_TOO_DEGRADED health warning that blocks Ceph OSD removal. We recommend setting replicated.size to 3 for each Ceph pool.
- gateway - Required. The gateway settings corresponding to the rgw daemon settings. Includes the following parameters:
  - port - the port on which the Ceph RGW service listens for HTTP.
  - securePort - the port on which the Ceph RGW service listens for HTTPS.
  - instances - the number of pods in the Ceph RGW ReplicaSet. If allNodes is set to true, a DaemonSet is created instead.
Note
We recommend using 3 instances for Ceph Object Storage.
  - allNodes - defines whether to start the Ceph RGW pods as a DaemonSet on all nodes. The instances parameter is ignored if allNodes is set to true.
  - splitDaemonForMultisiteTrafficSync - Optional. For a multisite setup, defines whether to split the RGW daemon into a daemon responsible for synchronization between zones and a daemon serving client requests.
  - rgwSyncPort - Optional. The port on which the RGW multisite traffic service listens for HTTP. Takes effect only for a multisite configuration.
  - resources - Optional. Represents Kubernetes resource requirements for Ceph RGW pods. For details, see Kubernetes docs: Resource Management for Pods and Containers.
  - externalRgwEndpoint - Required for the external Ceph cluster setup. Represents the external RGW endpoint to use, only when an external Ceph cluster is used. Contains the following parameters:
    - ip - the IP address of the RGW endpoint.
    - hostname - the DNS-addressable hostname of the RGW endpoint. This field is preferred over ip if both are specified.
For example:
spec:
  objectStorage:
    rgw:
      gateway:
        allNodes: false
        instances: 3
        port: 80
        securePort: 8443
- preservePoolsOnDelete - Optional. Defines whether to keep the data and metadata pools of the rgw section when the Object Storage is deleted. Set this parameter to true if you need to keep the data even if the object storage is deleted. However, we recommend setting this parameter to false.
- objectUsers and buckets - Optional. To create new Ceph RGW resources, such as buckets or users, specify the following keys. Ceph Controller will automatically create the specified object storage users and buckets in the Ceph cluster.
  - objectUsers - a list of user specifications to create for object storage. Contains the following fields:
    - name - a user name to create.
    - displayName - the Ceph user name to display.
    - capabilities - user capabilities:
      - user - admin capabilities to read/write Ceph Object Store users.
      - bucket - admin capabilities to read/write Ceph Object Store buckets.
      - metadata - admin capabilities to read/write Ceph Object Store metadata.
      - usage - admin capabilities to read/write Ceph Object Store usage.
      - zone - admin capabilities to read/write Ceph Object Store zones.
The available options are "*", "read", "write", and "read, write". For details, see Ceph documentation: Add/remove admin capabilities.
    - quotas - user quotas:
      - maxBuckets - the maximum bucket limit for the Ceph user. Integer, for example, 10.
      - maxSize - the maximum size limit of all objects across all the buckets of a user. String size, for example, 10G.
      - maxObjects - the maximum number of objects across all buckets of a user. Integer, for example, 10.
For example:
spec:
  objectStorage:
    rgw:
      objectUsers:
      - name: test-user
        displayName: test-user
        capabilities:
          bucket: '*'
          metadata: read
          user: read
        quotas:
          maxBuckets: 10
          maxSize: 10G
  - buckets - a list of strings that contain bucket names to create for object storage.
- zone - Required if dataPool and metadataPool are not specified. Mutually exclusive with these parameters.
Defines the Ceph Multisite zone where the object storage must be placed. Includes the name parameter that must be set to one of the zones items. For details, see Ops Guide: Enable Multisite for Ceph Object Storage. For example:
spec:
  objectStorage:
    multisite:
      zones:
      - name: master-zone
        ...
    rgw:
      zone:
        name: master-zone
- SSLCert - Optional. Custom TLS certificate parameters used to access the Ceph RGW endpoint. If not specified, a self-signed certificate is generated. For example:
spec:
  objectStorage:
    rgw:
      SSLCert:
        cacert: |
          -----BEGIN CERTIFICATE-----
          ca-certificate here
          -----END CERTIFICATE-----
        tlsCert: |
          -----BEGIN CERTIFICATE-----
          private TLS certificate here
          -----END CERTIFICATE-----
        tlsKey: |
          -----BEGIN RSA PRIVATE KEY-----
          private TLS key here
          -----END RSA PRIVATE KEY-----
- SSLCertInRef - Optional. Flag indicating that a TLS certificate for accessing the Ceph RGW endpoint is used but is not exposed in spec. For example:
spec:
  objectStorage:
    rgw:
      SSLCertInRef: true
The operator must manually provide the TLS configuration using the rgw-ssl-certificate secret in the rook-ceph namespace of the managed cluster. The secret object must have the following structure:
data:
  cacert: <base64encodedCaCertificate>
  cert: <base64encodedCertificate>
When removing an already existing SSLCert block, no additional actions are required, because this block uses the same rgw-ssl-certificate secret in the rook-ceph namespace.
When adding a new secret directly without exposing it in spec, the following rules apply:
  - cert - base64 representation of a file with the server TLS key, server TLS cert, and CA certificate.
  - cacert - base64 representation of a CA certificate only.
A sketch of creating this secret is shown right after this list.
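The following is a minimal sketch of creating the rgw-ssl-certificate secret described above. The certificate file names are placeholders; the secret name, namespace, and data keys are taken from the description above:
# bundle.pem is assumed to contain the server TLS key, server TLS certificate, and CA certificate
kubectl -n rook-ceph create secret generic rgw-ssl-certificate \
  --from-file=cert=bundle.pem \
  --from-file=cacert=ca.pem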
Example configuration of RADOS gateway specification
spec:
objectStorage:
rgw:
name: rgw-store
dataPool:
deviceClass: hdd
erasureCoded:
codingChunks: 1
dataChunks: 2
failureDomain: host
metadataPool:
deviceClass: hdd
failureDomain: host
replicated:
size: 3
gateway:
allNodes: false
instances: 3
port: 80
securePort: 8443
preservePoolsOnDelete: false
RADOS Gateway Multisite parameters #
Technical Preview
- realms - Required. The list of realms to use; represents the realm namespaces. Includes the following parameters:
  - name - required, the realm name.
  - pullEndpoint - optional, required only when the master zone is in a different storage cluster. The endpoint, access key, and system key of the system user from the realm to pull from. Includes the following parameters:
    - endpoint - the endpoint of the master zone in the master zone group.
    - accessKey - the access key of the system user from the realm to pull from.
    - secretKey - the system key of the system user from the realm to pull from.
- zoneGroups - Required. The list of zone groups for realms. Includes the following parameters:
  - name - required, the zone group name.
  - realmName - required, the name of the realm namespace to which the zone group belongs.
- zones - Required. The list of zones used within one zone group. Includes the following parameters:
  - name - required, the zone name.
  - metadataPool - required, the settings used to create the Object Storage metadata pools. Must use replication. For details, see Pools parameters.
  - dataPool - required, the settings used to create the Object Storage data pool. Can use replication or erasure coding. For details, see Pools parameters.
  - zoneGroupName - required, the zone group name.
  - endpointsForZone - optional. The list of all endpoints in the zone group. If you use an ingress proxy for RGW, the list of endpoints must contain the FQDN/IP address used to access RGW. By default, if no ingress proxy is used, the list of endpoints is set to the IP address of the RGW external service. Endpoints must follow the HTTP URL format.
For a complete configuration example, see Ops Guide: Enable Multisite for Ceph Object Storage. A structural sketch is also shown below.
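As a structural illustration only, the following sketch combines the fields described above into one spec. The realm, zone group, and zone names are placeholders and the pool settings are abbreviated; refer to the Ops Guide for a complete, validated example:
spec:
  objectStorage:
    multisite:
      realms:
      - name: realm-main                # placeholder realm name
      zoneGroups:
      - name: zonegroup-main            # placeholder zone group name
        realmName: realm-main
      zones:
      - name: master-zone
        zoneGroupName: zonegroup-main
        metadataPool:
          deviceClass: ssd
          replicated:
            size: 3
        dataPool:
          deviceClass: ssd
          erasureCoded:
            codingChunks: 1
            dataChunks: 2
    rgw:
      name: rgw-store
      zone:
        name: master-zone
      gateway:
        instances: 3
        port: 80
        securePort: 8443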
CephFS parameters #
The sharedFilesystem section contains a list of Ceph Filesystems in the cephFS parameter. Each cephFS item contains the following parameters:
- name - Mandatory. CephFS instance name.
- dataPools - A list of CephFS data pool specifications. Each spec contains the name, replicated or erasureCoded, deviceClass, and failureDomain parameters. The first pool in the list is treated as the default data pool for CephFS and must always be replicated. The number of data pools is unlimited, but the default pool must always be present. For example:
spec:
  sharedFilesystem:
    cephFS:
    - name: cephfs-store
      dataPools:
      - name: default-pool
        deviceClass: ssd
        replicated:
          size: 3
        failureDomain: host
      - name: second-pool
        deviceClass: hdd
        failureDomain: rack
        erasureCoded:
          dataChunks: 2
          codingChunks: 1
where replicated.size is the number of full copies of data on multiple nodes.
Warning
When using the non-recommended Ceph pools replicated.size of less than 3, Ceph OSD removal cannot be performed. The minimal replica size equals a rounded up half of the specified replicated.size.
For example, if replicated.size is 2, the minimal replica size is 1, and if replicated.size is 3, then the minimal replica size is 2. The replica size of 1 allows Ceph having PGs with only one Ceph OSD in the acting state, which may cause a PG_TOO_DEGRADED health warning that blocks Ceph OSD removal. We recommend setting replicated.size to 3 for each Ceph pool.
Warning
Modifying dataPools on a deployed CephFS has no effect. You can manually adjust the pool settings through the Ceph CLI. However, for any changes in dataPools, we recommend re-creating CephFS.
- metadataPool - CephFS metadata pool spec that should only contain replicated, deviceClass, and failureDomain parameters. Can use only replicated settings. For example:
spec:
  sharedFilesystem:
    cephFS:
    - name: cephfs-store
      metadataPool:
        deviceClass: nvme
        replicated:
          size: 3
        failureDomain: host
where replicated.size is the number of full copies of data on multiple nodes.
Warning
Modifying metadataPool on a deployed CephFS has no effect. You can manually adjust the pool settings through the Ceph CLI. However, for any changes in metadataPool, we recommend re-creating CephFS.
- preserveFilesystemOnDelete - Defines whether to keep the data and metadata pools if CephFS is deleted. Set to true to avoid occasional data loss in case of human error. However, for security reasons, we recommend setting preserveFilesystemOnDelete to false.
- metadataServer - Metadata Server settings that correspond to the Ceph MDS daemon settings. Contains the following fields:
  - activeCount - the number of active Ceph MDS instances. As load increases, CephFS automatically partitions the file system across the Ceph MDS instances. Rook creates double the number of Ceph MDS instances requested by activeCount; the extra instances stay in standby mode for failover. We recommend setting this parameter to 1 and increasing the MDS daemons count only in case of high load.
  - activeStandby - defines whether the extra Ceph MDS instances are in active standby mode and keep a warm cache of the file system metadata for faster failover. The instances are assigned by CephFS in failover pairs. If false, the extra Ceph MDS instances are all in passive standby mode and do not maintain a warm cache of the metadata. The default value is false.
  - resources - represents Kubernetes resource requirements for Ceph MDS pods. For details, see Kubernetes docs: Resource Management for Pods and Containers.
For example:
spec:
  sharedFilesystem:
    cephFS:
    - name: cephfs-store
      metadataServer:
        activeCount: 1
        activeStandby: false
        resources: # example, non-prod values
          requests:
            memory: 1Gi
            cpu: 1
          limits:
            memory: 2Gi
            cpu: 2
Example configuration of shared Filesystem specification
spec:
sharedFilesystem:
cephFS:
- name: cephfs-store
dataPools:
- name: cephfs-pool-1
deviceClass: hdd
replicated:
size: 3
failureDomain: host
metadataPool:
deviceClass: nvme
replicated:
size: 3
failureDomain: host
metadataServer:
activeCount: 1
activeStandby: false
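After the CephFS is reconciled, you can check its state from the Ceph CLI. This sketch assumes the Pelagia toolbox runs as the pelagia-lcm-toolbox Deployment in the pelagia namespace and uses the CephFS name from the example above:
kubectl -n pelagia exec deploy/pelagia-lcm-toolbox -- ceph fs status cephfs-store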
RookConfig parameters #
String key-value parameter that allows overriding Ceph configuration options.
Use the | delimiter to specify the section where a parameter must be placed, for example, mon or osd. If required, use the . delimiter to address a specific Ceph OSD or Ceph Monitor by its number and apply an option only to that mon or osd, overriding the configuration of the corresponding section.
Scoping an option to a section triggers a restart of only the daemons related to that section. If you do not specify a section, the parameter is set in the global section, which implies a restart of all Ceph daemons except Ceph OSDs.
spec:
rookConfig:
"osd_max_backfills": "64"
"mon|mon_health_to_clog": "true"
"osd|osd_journal_size": "8192"
"osd.14|osd_journal_size": "6250"
HealthCheck parameters #
- daemonHealth - Optional. Specifies health check settings for Ceph daemons. Contains the following parameters:
  - status - configures health check settings for Ceph health.
  - mon - configures health check settings for Ceph Monitors.
  - osd - configures health check settings for Ceph OSDs.
Each parameter allows defining the following settings:
  - disabled - a flag that disables the health check.
  - interval - an interval in seconds or minutes for the health check to run. For example, 60s for 60 seconds.
  - timeout - a timeout for the health check in seconds or minutes. For example, 60s for 60 seconds.
- livenessProbe - Optional. Key-value parameter with liveness probe settings for the defined daemon types. Can be one of the following: mgr, mon, osd, or mds. Includes the disabled flag and the probe parameter. The probe parameter accepts the following options:
  - initialDelaySeconds - the number of seconds after the container has started before the liveness probes are initiated. Integer.
  - timeoutSeconds - the number of seconds after which the probe times out. Integer.
  - periodSeconds - the frequency (in seconds) to perform the probe. Integer.
  - successThreshold - the minimum consecutive successful probes for the probe to be considered successful after a failure. Integer.
  - failureThreshold - the minimum consecutive failures for the probe to be considered failed after having succeeded. Integer.
Note
Pelagia Deployment Controller specifies the following livenessProbe defaults for mon, mgr, osd, and mds (if CephFS is enabled):
  - 5 for timeoutSeconds
  - 5 for failureThreshold
- startupProbe - Optional. Key-value parameter with startup probe settings for the defined daemon types. Can be one of the following: mgr, mon, osd, or mds. Includes the disabled flag and the probe parameter. The probe parameter accepts the following options:
  - timeoutSeconds - the number of seconds after which the probe times out. Integer.
  - periodSeconds - the frequency (in seconds) to perform the probe. Integer.
  - successThreshold - the minimum consecutive successful probes for the probe to be considered successful after a failure. Integer.
  - failureThreshold - the minimum consecutive failures for the probe to be considered failed after having succeeded. Integer.
Example configuration of health check specification
spec:
healthCheck:
daemonHealth:
mon:
disabled: false
interval: 45s
timeout: 600s
osd:
disabled: false
interval: 60s
status:
disabled: true
livenessProbe:
mon:
disabled: false
probe:
timeoutSeconds: 10
periodSeconds: 3
successThreshold: 3
mgr:
disabled: false
probe:
timeoutSeconds: 5
failureThreshold: 5
osd:
probe:
initialDelaySeconds: 5
timeoutSeconds: 10
failureThreshold: 7
startupProbe:
mon:
disabled: true
mgr:
probe:
successThreshold: 3
ExtraOpts parameters #
- deviceLabels - Optional. A key-value mapping used to assign a specification label to any available device on a specific node. These labels can then be used in the nodes section items that define nodeGroup or nodesByLabel, eliminating the need to specify different devices for each node individually. It also helps avoid using device names, facilitating the grouping of nodes with similar labels.
Example usage:
spec:
  extraOpts:
    deviceLabels:
      <node-name>:
        <dev-label>: /dev/disk/by-id/<unique_ID>
        ...
        <dev-label-n>: /dev/disk/by-id/<unique_ID>
      ...
      <node-name-n>:
        <dev-label>: /dev/disk/by-id/<unique_ID>
        ...
        <dev-label-n>: /dev/disk/by-id/<unique_ID>
  nodes:
  - name: <group-name>
    devices:
    - name: <dev_label>
    - name: <dev_label_n>
    nodes:
    - <node_name>
    - <node_name_n>
- customDeviceClasses - Optional. TechPreview. A list of custom device class names to use in the specification. Enables you to specify custom names different from the default ones (hdd, ssd, and nvme) and use them in the nodes and pools definitions.
Example usage:
spec:
  extraOpts:
    customDeviceClasses:
    - <custom_class_name>
  nodes:
  - name: kaas-node-5bgk6
    devices:
    - config: # existing item
        deviceClass: <custom_class_name>
      fullPath: /dev/disk/by-id/<unique_ID>
  pools:
  - default: false
    deviceClass: <custom_class_name>
    erasureCoded:
      codingChunks: 1
      dataChunks: 2
    failureDomain: host
Manager modules parameters #
The mgr section of the CephDeployment specification contains the mgrModules parameter, which includes the following parameters:
- name - Ceph Manager module name.
- enabled - Flag that defines whether the Ceph Manager module is enabled.
For example:
spec:
  mgr:
    mgrModules:
    - name: balancer
      enabled: true
    - name: pg_autoscaler
      enabled: true
The balancer and pg_autoscaler Ceph Manager modules are enabled by default and cannot be disabled.
Note
Most Ceph Manager modules require additional configuration that you can perform through the pelagia-lcm-toolbox pod.
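For example, to list the available and enabled modules before configuring them, you can run the Ceph CLI from the toolbox. The sketch assumes the toolbox runs as the pelagia-lcm-toolbox Deployment in the pelagia namespace:
kubectl -n pelagia exec deploy/pelagia-lcm-toolbox -- ceph mgr module ls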
RBD Mirroring parameters #
- daemonsCount - Count of rbd-mirror daemons to spawn. We recommend using one instance of the rbd-mirror daemon.
- peers - Optional. The list of mirroring peers of an external cluster to connect to. Only a single peer is supported. The peer section includes the following parameters:
  - site - the label of a remote Ceph cluster associated with the token.
  - token - the token that one site (Ceph cluster) uses to pull images from the other site. To obtain the token, use the rbd mirror pool peer bootstrap create command.
  - pools - optional, a list of pool names to mirror.
A minimal sketch combining these fields is shown below.
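A minimal sketch of the rbdMirror section using only the fields described above; the site label, token, and pool name are placeholders:
spec:
  rbdMirror:
    daemonsCount: 1
    peers:
    - site: remote-site            # placeholder label of the remote Ceph cluster
      token: <bootstrap_token>     # obtained with: rbd mirror pool peer bootstrap create
      pools:
      - kubernetes-ssd             # placeholder pool name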
Status fields #
Field | Description
---|---
phase | Current handling phase of the applied Ceph cluster spec. Can equal Creating, Deploying, Validation, Ready, Deleting, OnHold, or Failed.
message | Detailed description of the current phase or an error message if the phase is Failed.
lastRun | DateTime when the previous spec reconcile occurred.
clusterVersion | Current Ceph cluster version, for example, v19.2.3.
validation | Validation result (Succeed or Failed) of the spec, with a list of validation messages, if any.
objRefs | Pelagia API object references, such as CephDeploymentHealth and CephDeploymentSecret.