rook-ceph Deployment and Usage Notes
Published: 2021-05-24  Author: ♂逸風★淩軒
1. Download the source code
git clone https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f common.yaml
kubectl create -f operator.yaml
kubectl create -f cluster.yaml
For Rook v1.7 and later, run the following instead:
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster.yaml
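Once cluster.yaml has been applied it can take a few minutes for all the components to come up. A quick way to watch progress (a standard kubectl check, not part of the original steps):

kubectl -n rook-ceph get pod -w   # wait until the mon, mgr and osd pods report Running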
2. Deleting the Ceph cluster
To delete a Ceph cluster that has already been created, run:
# kubectl delete -f cluster.yaml
After the cluster is deleted, the /var/lib/rook/ directory on every node that hosted Ceph components still contains the old cluster's configuration.
If you deploy a new Ceph cluster later, remove this leftover data first; otherwise the monitors will fail to start.
# cat clean-rook-dir.sh
hosts=( k8s-master k8s-node1 k8s-node2 )
for host in ${hosts[@]} ; do
  ssh $host "rm -rf /var/lib/rook/*"
done
3. Configuring the Ceph dashboard
The Ceph dashboard is already enabled by default in cluster.yaml. Check the dashboard service:
[centos@k8s-master ~]$ kubectl get service -n rook-ceph
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
rook-ceph-mgr             ClusterIP   10.107.77.188    <none>        9283/TCP   3h33m
rook-ceph-mgr-dashboard   ClusterIP   10.96.135.98     <none>        8443/TCP   3h33m
rook-ceph-mon-a           ClusterIP   10.105.153.93    <none>        6790/TCP   3h35m
rook-ceph-mon-b           ClusterIP   10.105.107.254   <none>        6790/TCP   3h34m
rook-ceph-mon-c           ClusterIP   10.104.1.238     <none>        6790/TCP   3h34m
rook-ceph-mgr-dashboard listens on port 8443. Create a NodePort service so the dashboard can be reached from outside the cluster:
kubectl apply -f rook/cluster/examples/kubernetes/ceph/dashboard-external-https.yaml
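The applied manifest is essentially a NodePort service pointing at the mgr Pods. A minimal sketch of what it contains, with field values assumed to match the upstream example file:

apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-https
  namespace: rook-ceph
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
spec:
  type: NodePort
  ports:
    - name: dashboard
      port: 8443
      protocol: TCP
      targetPort: 8443
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph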
Check the port exposed by the NodePort service; here it is 32483:
[centos@k8s-master ~]$ kubectl get service -n rook-ceph | grep dashboard
rook-ceph-mgr-dashboard                  ClusterIP   10.96.135.98    <none>   8443/TCP         3h37m
rook-ceph-mgr-dashboard-external-https   NodePort    10.97.181.103   <none>   8443:32483/TCP   3h29m
Get the admin password:
$ kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath='{.data.password}' | base64 --decode
4. Deploying the Rook toolbox and testing
Ceph ships a CLI for inspecting the cluster, but the cluster is started with `cephx` authentication enabled by default, so the CLI cannot be run directly inside the Pods of the Ceph components. Let's verify this:
$ kubectl -n rook-ceph get pod | grep rook-ceph-mon
rook-ceph-mon-a-56bfd58fdd-sql6w   1/1   Running   0   43m
rook-ceph-mon-b-68df678588-djj5v   1/1   Running   0   43m
rook-ceph-mon-c-65d8945f5-n4qsv    1/1   Running   0   42m
4.1 Log in to one of the Pods and run a Ceph command
$ kubectl -n rook-ceph exec -it rook-ceph-mon-a-56bfd58fdd-sql6w bash
[root@rook-ceph-mon-a-56bfd58fdd-sql6w /]# ceph status
2019-01-07 15:06:03.720 7f7df9d4b700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-01-07 15:06:03.720 7f7df9d4b700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-01-07 15:06:03.722 7f7df9d4b700 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
2019-01-07 15:06:03.722 7f7df9d4b700 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
[errno 95] error connecting to the cluster
The log line `auth: unable to find a keyring on /etc/ceph/` shows that the `cephx` keyring file is missing. To run CLI commands we deploy the `Ceph toolbox`, a container that runs inside Kubernetes and ships with the `cephx` keyring plus the various `Ceph clients` tools. It can be used for Ceph-related tests and debugging:
$ kubectl create -f toolbox.yaml
$ kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
[root@node2 /]# ceph status
  cluster:
    id:     6c117372-a462-447c-bfd4-a0378393f69e
    health: HEALTH_WARN
            clock skew detected on mon.a, mon.c

  services:
    mon: 3 daemons, quorum b,a,c
    mgr: a(active)
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   21 GiB used, 75 GiB / 96 GiB avail
    pgs:
[root@node2 /]# ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    96 GiB     75 GiB     21 GiB       21.52
POOLS:
    NAME     ID     USED     %USED     MAX AVAIL     OBJECTS
[root@node2 /]# ceph osd status
+----+----------------------------------+-------+-------+--------+---------+--------+---------+-----------+
| id | host                             | used  | avail | wr ops | wr data | rd ops | rd data | state     |
+----+----------------------------------+-------+-------+--------+---------+--------+---------+-----------+
| 0  | rook-ceph-osd-0-6ccbd4dc4d-2w9jt | 7893M | 24.2G |    0   |    0    |    0   |    0    | exists,up |
| 1  | rook-ceph-osd-1-647cbb4b84-65ttr | 5148M | 26.9G |    0   |    0    |    0   |    0    | exists,up |
| 2  | rook-ceph-osd-2-7b8ff9fc47-g8l6q | 8096M | 24.0G |    0   |    0    |    0   |    0    | exists,up |
+----+----------------------------------+-------+-------+--------+---------+--------+---------+-----------+
[root@node2 /]# rados df
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR

total_objects    0
total_used       21 GiB
total_avail      75 GiB
total_space      96 GiB
4.2 Creating a new pool
[root@node2 /]# ceph osd pool create test_pool 64
pool 'test_pool' created
[root@node2 /]# ceph osd pool get test_pool size
size: 1
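A size of 1 means the pool keeps only one copy of each object. If replication is wanted, the size can be raised with the usual Ceph command (not part of the original session):

ceph osd pool set test_pool size 3   # keep three replicas of every object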
Run ceph df again to check that the new pool is shown:
[root@node2 /]# ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    96 GiB     75 GiB     21 GiB       21.52
POOLS:
    NAME          ID     USED     %USED     MAX AVAIL     OBJECTS
    test_pool     1      0 B      0         67 GiB        0
As you can see, CLI commands work inside the `toolbox`, and the newly created pool is now visible on the Dashboard.
# The secret key can be obtained on the Ceph side with the following command:
ceph auth get-key client.admin
4.3 Creating the admin secret
kubectl create secret generic ceph-admin-secret --namespace=kube-system --type=kubernetes.io/rbd --from-literal=key=AQCtabcdKvXBORAA234AREkmsrmLdY67i8vxSQ==
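The key in the command above is only a placeholder. One way to feed the real admin key from the toolbox into the secret in a single step; this sketch assumes the same app=rook-ceph-tools label used earlier:

ADMIN_KEY=$(kubectl -n rook-ceph exec $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph auth get-key client.admin)
kubectl create secret generic ceph-admin-secret --namespace=kube-system --type=kubernetes.io/rbd --from-literal=key="$ADMIN_KEY"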
4.4 Basic commands
# Create a pool for CephFS data (see the community docs for the pg parameters; they deserve a post of their own)
ceph osd pool create cephfs_data 64 64
# Create a pool for CephFS metadata
ceph osd pool create cephfs_metadata 32 32
# Create a new filesystem named cephfs
ceph fs new cephfs cephfs_metadata cephfs_data
# List the filesystems currently in the cluster
ceph fs ls
# Set the maximum number of MDS daemons for cephfs
ceph fs set cephfs max_mds 4
# Deploy 4 MDS daemons
ceph orch apply mds cephfs --placement="4 manager01.xxx.com manager02.xxx.com worker01.xxx.com worker02.xxx.com"
# Verify that the MDS daemons were deployed
ceph orch ps --daemon-type mds
4.5 Warning: 1 filesystem is online with fewer MDS than max_mds
# Run on every node
systemctl restart ceph.target
# After the restart, check the relevant services
ceph orch ps --daemon-type mds
ceph osd lspools
ceph fs ls
5. Creating the default pool
Create a ceph-rbd-pool.yaml file:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool   # the operator watches this and creates the pool; it will also show up in the dashboard
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block   # a StorageClass; reference it in a PVC to provision PVs dynamically
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  # The value of "clusterNamespace" MUST be the same as the one in which your rook cluster exist
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs
# Optional, default reclaimPolicy is "Delete". Other options are: "Retain", "Recycle" as documented in https://kubernetes.io/docs/concepts/storage/storage-classes/
reclaimPolicy: Retain
# Allow dynamic volume expansion
allowVolumeExpansion: true
# kubectl apply -f ceph-rbd-pool.yaml
# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
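To check that dynamic provisioning works, a PVC can be created against this StorageClass. A minimal sketch; the claim name and size are made up for illustration:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-test-pvc        # hypothetical name, for illustration only
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block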
6. Creating the default storage classes
6.1 Installing the RBD CSI StorageClass
kubectl apply -f /opt/k8s-install-tool/rook-ceph/rook/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
6.2 Installing the CephFS filesystem, its metadata pool, and the CSI StorageClass
kubectl apply -f /opt/k8s-install-tool/rook-ceph/rook/cluster/examples/kubernetes/ceph/filesystem.yaml
kubectl apply -f /opt/k8s-install-tool/rook-ceph/rook/cluster/examples/kubernetes/ceph/csi/cephfs/storageclass.yaml
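A CephFS-backed PVC looks much like the RBD one, except that ReadWriteMany becomes possible. A sketch; the StorageClass name rook-cephfs is assumed to match the upstream storageclass.yaml, and the claim name is made up:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-test-pvc     # hypothetical name, for illustration only
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs   # assumed name from the upstream storageclass.yaml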
7. Deploying Prometheus monitoring for Ceph
Now that the services are up, we want to monitor them in real time. Prometheus is a good fit here, and it can be deployed with the `Prometheus Operator`.
7.1 Deploying the Prometheus Operator
First deploy the `Prometheus Operator`. Wait until the `prometheus-operator` Pod is `Running` before moving on.
$ kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/v0.26.0/bundle.yaml
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
serviceaccount/prometheus-operator created
# kubectl get pod
NAME                                READY   STATUS    RESTARTS   AGE
prometheus-operator-544f649-bhfxz   1/1     Running   0          13m
7.2 Deploying a Prometheus instance
Rook provides the yaml files needed to deploy a Prometheus instance.
$ cd cluster/examples/kubernetes/ceph/monitoring
$ kubectl create -f ./
service/rook-prometheus created
serviceaccount/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
prometheus.monitoring.coreos.com/rook-prometheus created
servicemonitor.monitoring.coreos.com/rook-ceph-mgr created
$ kubectl -n rook-ceph get pod |grep prometheus-rook
prometheus-rook-prometheus-0 3/3 Running 1 10m
Once `prometheus-rook-prometheus-0` is `Running`, Prometheus is ready to be accessed.
7.3 Accessing the Prometheus Dashboard
The Prometheus dashboard is exposed by default as a `NodePort` service on port `30900`, so it is reachable at `http://<Cluster_IP>:30900`; in this environment that is `http://10.222.78.63:30900`.
$ kubectl -n rook-ceph get svc |grep rook-prometheus
rook-prometheus NodePort 10.68.152.50 <none> 9090:30900/TCP 13m
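If the NodePort cannot be reached from your workstation, a standard kubectl port-forward works as an alternative (not part of the original post):

kubectl -n rook-ceph port-forward svc/rook-prometheus 9090:9090   # then open http://localhost:9090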
Verification commands:
ceph status
ceph osd status
ceph df
rados df
8. Key points
8.1 The three PV access modes
ReadWriteOnce (RWO): can be mounted read-write by a single node only
ReadOnlyMany (ROX): can be mounted read-only by multiple nodes simultaneously
ReadWriteMany (RWX): can be mounted read-write by multiple nodes simultaneously
8.2 PV reclaim policies
Retain: keep the volume untouched; the administrator reclaims it manually
Recycle: scrub the volume (delete all files); supported only by NFS and hostPath
Delete: delete the underlying storage volume; supported only by some cloud storage backends
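The reclaim policy of an existing PV can be changed in place with a standard patch; the PV name below is a placeholder:

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'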
8.3 Getting the replica count of replicapool
ceph osd pool get replicapool size
Change the replica count:
ceph osd pool set replicapool size 2
8.4 Deleting pools
# ceph osd pool delete replicapool replicapool --yes-i-really-really-mean-it
pool 'replicapool' removed
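If the monitors refuse the delete because pool deletion is disabled, it usually has to be enabled first; on recent Ceph releases this can be done from the toolbox with:

ceph config set mon mon_allow_pool_delete true   # allow pool deletion cluster-wide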
8.5 Physically wiping a node's disk
#dd if=/dev/zero of="/dev/sdb" bs=1M count=100 oflag=direct,dsync
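If the disk previously held OSD data, partition tables and LVM signatures may also need to be cleared before reuse; common standard utilities for this (not from the original post):

sgdisk --zap-all /dev/sdb   # wipe GPT/MBR partition tables
wipefs -a /dev/sdb          # remove remaining filesystem/LVM signatures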
