rook-ceph Deployment and Usage Notes
Published: 2021-05-24  Author: ♂逸風★淩軒
1. Download the source code
git clone https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f common.yaml
kubectl create -f operator.yaml
kubectl create -f cluster.yaml
For Rook v1.7 and later, run the following instead:
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster.yaml
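Once cluster.yaml has been applied it can take a few minutes for all the components to come up. A quick way to watch progress (a standard kubectl check, not part of the original steps):

kubectl -n rook-ceph get pod -w   # wait until the mon, mgr and osd pods report Running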
2. Deleting the Ceph cluster
To delete a Ceph cluster that has already been created, run:
# kubectl delete -f cluster.yaml
After the cluster is deleted, the /var/lib/rook/ directory on every node that hosted Ceph components still contains the old cluster's configuration.
If you deploy a new Ceph cluster later, remove this leftover data first; otherwise the monitors will fail to start.
# cat clean-rook-dir.sh
hosts=( k8s-master k8s-node1 k8s-node2 )
for host in ${hosts[@]} ; do
  ssh $host "rm -rf /var/lib/rook/*"
done
3. Configuring the Ceph dashboard
The Ceph dashboard is already enabled by default in cluster.yaml. Check the dashboard service:
[centos@k8s-master ~]$ kubectl get service -n rook-ceph
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
rook-ceph-mgr             ClusterIP   10.107.77.188    <none>        9283/TCP   3h33m
rook-ceph-mgr-dashboard   ClusterIP   10.96.135.98     <none>        8443/TCP   3h33m
rook-ceph-mon-a           ClusterIP   10.105.153.93    <none>        6790/TCP   3h35m
rook-ceph-mon-b           ClusterIP   10.105.107.254   <none>        6790/TCP   3h34m
rook-ceph-mon-c           ClusterIP   10.104.1.238     <none>        6790/TCP   3h34m
rook-ceph-mgr-dashboard listens on port 8443. Create a NodePort service so the dashboard can be reached from outside the cluster:
kubectl apply -f rook/cluster/examples/kubernetes/ceph/dashboard-external-https.yaml
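The applied manifest is essentially a NodePort service pointing at the mgr Pods. A minimal sketch of what it contains, with field values assumed to match the upstream example file:

apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-https
  namespace: rook-ceph
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
spec:
  type: NodePort
  ports:
    - name: dashboard
      port: 8443
      protocol: TCP
      targetPort: 8443
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph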
Check the port exposed by the NodePort service; here it is 32483:
[centos@k8s-master ~]$ kubectl get service -n rook-ceph | grep dashboard
rook-ceph-mgr-dashboard                  ClusterIP   10.96.135.98    <none>   8443/TCP         3h37m
rook-ceph-mgr-dashboard-external-https   NodePort    10.97.181.103   <none>   8443:32483/TCP   3h29m
Get the admin password:
$ kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath='{.data.password}' | base64 --decode
4. Deploying the Rook toolbox and testing
Ceph ships a CLI for inspecting the cluster, but the cluster is started with `cephx` authentication enabled by default, so the CLI cannot be run directly inside the Pods of the Ceph components. Let's verify this:
$ kubectl -n rook-ceph get pod | grep rook-ceph-mon
rook-ceph-mon-a-56bfd58fdd-sql6w   1/1   Running   0   43m
rook-ceph-mon-b-68df678588-djj5v   1/1   Running   0   43m
rook-ceph-mon-c-65d8945f5-n4qsv    1/1   Running   0   42m
4.1 Log in to one of the Pods and run a Ceph command
$ kubectl -n rook-ceph exec -it rook-ceph-mon-a-56bfd58fdd-sql6w bash
[root@rook-ceph-mon-a-56bfd58fdd-sql6w /]# ceph status
2019-01-07 15:06:03.720 7f7df9d4b700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-01-07 15:06:03.720 7f7df9d4b700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-01-07 15:06:03.722 7f7df9d4b700 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
2019-01-07 15:06:03.722 7f7df9d4b700 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
[errno 95] error connecting to the cluster
The log line `auth: unable to find a keyring on /etc/ceph/` shows that the `cephx` keyring file is missing. To run CLI commands we deploy the `Ceph toolbox`, a container that runs inside Kubernetes and ships with the `cephx` keyring plus the various `Ceph clients` tools. It can be used for Ceph-related tests and debugging:
$ kubectl create -f toolbox.yaml
$ kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
[root@node2 /]# ceph status
  cluster:
    id:     6c117372-a462-447c-bfd4-a0378393f69e
    health: HEALTH_WARN
            clock skew detected on mon.a, mon.c

  services:
    mon: 3 daemons, quorum b,a,c
    mgr: a(active)
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   21 GiB used, 75 GiB / 96 GiB avail
    pgs:
[root@node2 /]# ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    96 GiB     75 GiB     21 GiB       21.52
POOLS:
    NAME     ID     USED     %USED     MAX AVAIL     OBJECTS
[root@node2 /]# ceph osd status
+----+----------------------------------+-------+-------+--------+---------+--------+---------+-----------+
| id | host                             | used  | avail | wr ops | wr data | rd ops | rd data | state     |
+----+----------------------------------+-------+-------+--------+---------+--------+---------+-----------+
| 0  | rook-ceph-osd-0-6ccbd4dc4d-2w9jt | 7893M | 24.2G |    0   |    0    |    0   |    0    | exists,up |
| 1  | rook-ceph-osd-1-647cbb4b84-65ttr | 5148M | 26.9G |    0   |    0    |    0   |    0    | exists,up |
| 2  | rook-ceph-osd-2-7b8ff9fc47-g8l6q | 8096M | 24.0G |    0   |    0    |    0   |    0    | exists,up |
+----+----------------------------------+-------+-------+--------+---------+--------+---------+-----------+
[root@node2 /]# rados df
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR

total_objects    0
total_used       21 GiB
total_avail      75 GiB
total_space      96 GiB
4.2 Creating a new pool
[root@node2 /]# ceph osd pool create test_pool 64
pool 'test_pool' created
[root@node2 /]# ceph osd pool get test_pool size
size: 1
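A size of 1 means the pool keeps only one copy of each object. If replication is wanted, the size can be raised with the usual Ceph command (not part of the original session):

ceph osd pool set test_pool size 3   # keep three replicas of every object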
Run ceph df again to check that the new pool is shown:
[root@node2 /]# ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    96 GiB     75 GiB     21 GiB       21.52
POOLS:
    NAME          ID     USED     %USED     MAX AVAIL     OBJECTS
    test_pool     1      0 B      0         67 GiB        0
As you can see, CLI commands work inside the `toolbox`, and the newly created pool is now visible on the Dashboard.
# The secret key can be obtained on the Ceph side with the following command:
ceph auth get-key client.admin
4.3 Creating the admin secret
kubectl create secret generic ceph-admin-secret --namespace=kube-system --type=kubernetes.io/rbd --from-literal=key=AQCtabcdKvXBORAA234AREkmsrmLdY67i8vxSQ==
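The key in the command above is only a placeholder. One way to feed the real admin key from the toolbox into the secret in a single step; this sketch assumes the same app=rook-ceph-tools label used earlier:

ADMIN_KEY=$(kubectl -n rook-ceph exec $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph auth get-key client.admin)
kubectl create secret generic ceph-admin-secret --namespace=kube-system --type=kubernetes.io/rbd --from-literal=key="$ADMIN_KEY"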
4.4 Basic commands
# Create a pool for CephFS data (see the community docs for the pg parameters; they deserve a post of their own)
ceph osd pool create cephfs_data 64 64
# Create a pool for CephFS metadata
ceph osd pool create cephfs_metadata 32 32
# Create a new filesystem named cephfs
ceph fs new cephfs cephfs_metadata cephfs_data
# List the filesystems currently in the cluster
ceph fs ls
# Set the maximum number of MDS daemons for cephfs
ceph fs set cephfs max_mds 4
# Deploy 4 MDS daemons
ceph orch apply mds cephfs --placement="4 manager01.xxx.com manager02.xxx.com worker01.xxx.com worker02.xxx.com"
# Verify that the MDS daemons were deployed
ceph orch ps --daemon-type mds
4.5 Warning: 1 filesystem is online with fewer MDS than max_mds
# Run on every node
systemctl restart ceph.target
# After the restart, check the relevant services
ceph orch ps --daemon-type mds
ceph osd lspools
ceph fs ls
5. Creating the default pool
Create a ceph-rbd-pool.yaml file:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool   # the operator watches this and creates the pool; it will also show up in the dashboard
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block   # a StorageClass; reference it in a PVC to provision PVs dynamically
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  # The value of "clusterNamespace" MUST be the same as the one in which your rook cluster exist
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs
# Optional, default reclaimPolicy is "Delete". Other options are: "Retain", "Recycle" as documented in https://kubernetes.io/docs/concepts/storage/storage-classes/
reclaimPolicy: Retain
# Allow dynamic volume expansion
allowVolumeExpansion: true
# kubectl apply -f ceph-rbd-pool.yaml
# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
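To check that dynamic provisioning works, a PVC can be created against this StorageClass. A minimal sketch; the claim name and size are made up for illustration:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-test-pvc        # hypothetical name, for illustration only
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block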
6. Creating the default storage classes
6.1 Installing the RBD CSI StorageClass
kubectl apply -f /opt/k8s-install-tool/rook-ceph/rook/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
6.2 Installing the CephFS filesystem, its metadata pool, and the CSI StorageClass
kubectl apply -f /opt/k8s-install-tool/rook-ceph/rook/cluster/examples/kubernetes/ceph/filesystem.yaml
kubectl apply -f /opt/k8s-install-tool/rook-ceph/rook/cluster/examples/kubernetes/ceph/csi/cephfs/storageclass.yaml
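A CephFS-backed PVC looks much like the RBD one, except that ReadWriteMany becomes possible. A sketch; the StorageClass name rook-cephfs is assumed to match the upstream storageclass.yaml, and the claim name is made up:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-test-pvc     # hypothetical name, for illustration only
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs   # assumed name from the upstream storageclass.yaml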
7. Deploying Prometheus monitoring for Ceph
Now that the services are up, we want to monitor them in real time. Prometheus is a good fit here, and it can be deployed with the `Prometheus Operator`.
7.1 Deploying the Prometheus Operator
First deploy the `Prometheus Operator`. Wait until the `prometheus-operator` Pod is `Running` before moving on.
$ kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/v0.26.0/bundle.yaml
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
serviceaccount/prometheus-operator created
# kubectl get pod
NAME                                READY   STATUS    RESTARTS   AGE
prometheus-operator-544f649-bhfxz   1/1     Running   0          13m
7.2 Deploying a Prometheus instance
Rook provides the yaml files needed to deploy a Prometheus instance.
$ cd cluster/examples/kubernetes/ceph/monitoring
$ kubectl create -f ./
service/rook-prometheus created
serviceaccount/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
prometheus.monitoring.coreos.com/rook-prometheus created
servicemonitor.monitoring.coreos.com/rook-ceph-mgr created
$ kubectl -n rook-ceph get pod |grep prometheus-rook
prometheus-rook-prometheus-0 3/3 Running 1 10m
Once `prometheus-rook-prometheus-0` is `Running`, Prometheus is ready to be accessed.
7.3 Accessing the Prometheus Dashboard
The Prometheus dashboard is exposed by default as a `NodePort` service on port `30900`, so it is reachable at `http://<Cluster_IP>:30900`; in this environment that is `http://10.222.78.63:30900`.
$ kubectl -n rook-ceph get svc |grep rook-prometheus
rook-prometheus NodePort 10.68.152.50 <none> 9090:30900/TCP 13m
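If the NodePort cannot be reached from your workstation, a standard kubectl port-forward works as an alternative (not part of the original post):

kubectl -n rook-ceph port-forward svc/rook-prometheus 9090:9090   # then open http://localhost:9090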
Verification commands:
ceph status
ceph osd status
ceph df
rados df
8. Key points
8.1 The three PV access modes
ReadWriteOnce (RWO): can be mounted read-write by a single node only
ReadOnlyMany (ROX): can be mounted read-only by multiple nodes simultaneously
ReadWriteMany (RWX): can be mounted read-write by multiple nodes simultaneously
8.2 PV reclaim policies
Retain: keep the volume untouched; the administrator reclaims it manually
Recycle: scrub the volume (delete all files); supported only by NFS and hostPath
Delete: delete the underlying storage volume; supported only by some cloud storage backends
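The reclaim policy of an existing PV can be changed in place with a standard patch; the PV name below is a placeholder:

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'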
8.3 Getting the replica count of replicapool
ceph osd pool get replicapool size
Change the replica count:
ceph osd pool set replicapool size 2
8.4 Deleting pools
# ceph osd pool delete replicapool replicapool --yes-i-really-really-mean-it
pool 'replicapool' removed
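If the monitors refuse the delete because pool deletion is disabled, it usually has to be enabled first; on recent Ceph releases this can be done from the toolbox with:

ceph config set mon mon_allow_pool_delete true   # allow pool deletion cluster-wide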
8.5 Physically wiping a node's disk
#dd if=/dev/zero of="/dev/sdb" bs=1M count=100 oflag=direct,dsync
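If the disk previously held OSD data, partition tables and LVM signatures may also need to be cleared before reuse; common standard utilities for this (not from the original post):

sgdisk --zap-all /dev/sdb   # wipe GPT/MBR partition tables
wipefs -a /dev/sdb          # remove remaining filesystem/LVM signatures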
