# Configure Ceph Shared File System (CephFS)
The Ceph Shared File System, or CephFS, provides the ability to create read/write shared file system Persistent Volumes (PVs). These PVs support the ReadWriteMany access mode with the Filesystem volume mode.
CephFS deploys its own daemons, the Metadata Servers (Ceph MDS). For details, see Ceph Documentation: Ceph File System.
Note
By design, CephFS data pool and metadata pool must be replicated only.
## CephFS specification parameters
The `CephDeployment` custom resource (CR) spec includes the `sharedFilesystem.cephFS` section with the following CephFS parameters:
- `name` - CephFS instance name.

- `dataPools` - A list of CephFS data pool specifications. Each spec contains the `name`, `replicated` or `erasureCoded`, `deviceClass`, and `failureDomain` parameters. The first pool in the list is treated as the default data pool for CephFS and must always be `replicated`. The `failureDomain` parameter may be set to `host`, `rack`, and so on, defining the failure domain across which the data will be spread. The number of data pools is unlimited, but the default pool must always be present. For example:

  ```yaml
  spec:
    sharedFilesystem:
      cephFS:
      - name: cephfs-store
        dataPools:
        - name: default-pool
          deviceClass: ssd
          replicated:
            size: 3
          failureDomain: host
        - name: second-pool
          deviceClass: hdd
          failureDomain: rack
          erasureCoded:
            dataChunks: 2
            codingChunks: 1
  ```

  where `replicated.size` is the number of full copies of data on multiple nodes.

  Warning

  When using the non-recommended Ceph pools `replicated.size` of less than `3`, Ceph OSD removal cannot be performed. The minimal replica size equals a rounded-up half of the specified `replicated.size`. For example, if `replicated.size` is `2`, the minimal replica size is `1`, and if `replicated.size` is `3`, then the minimal replica size is `2`. A replica size of `1` allows Ceph to have placement groups (PGs) with only one Ceph OSD in the `acting` state, which may cause a `PG_TOO_DEGRADED` health warning that blocks Ceph OSD removal. We recommend setting `replicated.size` to `3` for each Ceph pool.

  Warning

  Modifying `dataPools` on a deployed CephFS has no effect. You can manually adjust pool settings through the Ceph CLI, as shown in the sketch after this list. However, for any changes in `dataPools`, we recommend re-creating CephFS.
- `metadataPool` - CephFS metadata pool specification, which should only contain the `replicated`, `deviceClass`, and `failureDomain` parameters. The `failureDomain` parameter may be set to `host`, `rack`, and so on, defining the failure domain across which the data will be spread. Only `replicated` settings can be used. For example:

  ```yaml
  spec:
    sharedFilesystem:
      cephFS:
      - name: cephfs-store
        metadataPool:
          deviceClass: nvme
          replicated:
            size: 3
          failureDomain: host
  ```

  where `replicated.size` is the number of full copies of data on multiple nodes.

  Warning

  Modifying `metadataPool` on a deployed CephFS has no effect. You can manually adjust pool settings through the Ceph CLI. However, for any changes in `metadataPool`, we recommend re-creating CephFS.
- `preserveFilesystemOnDelete` - Defines whether to preserve the data and metadata pools when CephFS is deleted. Set to `true` to avoid accidental data loss in case of human error. However, for security reasons, we recommend setting `preserveFilesystemOnDelete` to `false`.
- `metadataServer` - Metadata Server settings that correspond to the Ceph MDS daemon settings. Contains the following fields:

  - `activeCount` - The number of active Ceph MDS instances. As the load increases, CephFS automatically partitions the file system across the Ceph MDS instances. Rook creates double the number of Ceph MDS instances requested by `activeCount`; the extra instances stay in standby mode for failover. We recommend setting this parameter to `1` and increasing the MDS daemon count only in case of a high load.
  - `activeStandby` - Defines whether the extra Ceph MDS instances are kept in active standby mode and maintain a warm cache of the file system metadata for faster failover. CephFS assigns the instances in failover pairs. If `false`, the extra Ceph MDS instances are all in passive standby mode and do not maintain a warm cache of the metadata. The default value is `false`.
  - `resources` - Kubernetes resource requirements for Ceph MDS pods.

  For example:

  ```yaml
  spec:
    sharedFilesystem:
      cephFS:
      - name: cephfs-store
        metadataServer:
          activeCount: 1
          activeStandby: false
          resources: # example, non-prod values
            requests:
              memory: 1Gi
              cpu: 1
            limits:
              memory: 2Gi
              cpu: 2
  ```
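As noted in the warnings above, pool settings of an already deployed CephFS can only be adjusted manually through the Ceph CLI. A minimal sketch of such an adjustment, assuming a Ceph toolbox pod is available to run `ceph` commands and assuming the Rook pool naming convention `<CephFS name>-<data pool name>` (verify both in your cluster before running anything):

```bash
# List existing pools with their replication and placement group settings.
ceph osd pool ls detail

# Example: change the replica count of a CephFS data pool
# (the pool name is an assumption, check the output of the previous command).
ceph osd pool set cephfs-store-default-pool size 3

# Show CephFS instances with their data and metadata pools.
ceph fs ls
```

Such manual changes bypass the `CephDeployment` CR, which is why re-creating CephFS remains the recommended way to apply `dataPools` or `metadataPool` changes.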
## Configure CephFS
1. Optional. Override the CSI CephFS gRPC and liveness metrics ports, for example, if an application already uses the default CephFS ports `9092` and `9082`, which may cause conflicts on the node. Upgrade the Pelagia Helm release values with the desired port numbers:

   ```bash
   helm upgrade --install pelagia-ceph oci://registry.mirantis.com/pelagia/pelagia-ceph --version 1.0.0 -n pelagia \
     --set rookConfig.csiCephFsGPCMetricsPort=<desiredPort>,rookConfig.csiCephFsLivenessMetricsPort=<desiredPort>
   ```

   Rook will enable the CephFS CSI plugin and provisioner.
2. Open the `CephDeployment` CR for editing:

   ```bash
   kubectl -n pelagia edit cephdpl
   ```
3. In the `sharedFilesystem` section, specify the parameters according to the CephFS specification above. For example:

   ```yaml
   spec:
     sharedFilesystem:
       cephFS:
       - name: cephfs-store
         dataPools:
         - name: cephfs-pool-1
           deviceClass: hdd
           replicated:
             size: 3
           failureDomain: host
         metadataPool:
           deviceClass: nvme
           replicated:
             size: 3
           failureDomain: host
         metadataServer:
           activeCount: 1
           activeStandby: false
   ```
4. Define the `mds` role for the corresponding nodes where Ceph MDS daemons should be deployed. We recommend labeling only one node with the `mds` role. For example:

   ```yaml
   spec:
     nodes:
       ...
       worker-1:
         roles:
         ...
         - mds
   ```
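After Rook creates CephFS, as described after this procedure, you can confirm that the Ceph MDS daemons were scheduled on the labeled node and that the CephFS CSI plugin pods are running. A minimal check, assuming the Rook components run in the `rook-ceph` namespace and carry the default Rook pod labels (both are assumptions, adjust them to your deployment):

```bash
# Ceph MDS pods; verify they run on the node labeled with the mds role.
kubectl -n rook-ceph get pods -l app=rook-ceph-mds -o wide

# CephFS CSI plugin and provisioner pods enabled by Rook.
kubectl -n rook-ceph get pods -l app=csi-cephfsplugin
kubectl -n rook-ceph get pods -l app=csi-cephfsplugin-provisioner
```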
Once CephFS is specified in the `CephDeployment` CR, Pelagia Deployment Controller validates it and requests Rook to create CephFS. Pelagia Deployment Controller then creates a Kubernetes StorageClass, required to start provisioning the storage, which uses the CephFS CSI driver to create Kubernetes PVs.
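For reference, a PersistentVolumeClaim that consumes the resulting CephFS-backed storage class might look as follows. The storage class name used below is an assumption; list the storage classes created by Pelagia with `kubectl get storageclass` and substitute the actual name:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-shared-pvc
spec:
  accessModes:
  - ReadWriteMany          # CephFS PVs support shared read/write access
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi
  storageClassName: cephfs-store   # assumption: replace with the actual CephFS StorageClass name
```

Once bound, such a PVC can be mounted by multiple pods simultaneously, which is the main use case for CephFS-backed volumes.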