Orchestrating Ceph with Rook

Orchestrating Ceph on Kubernetes is a trend for storage in the container ecosystem. It lets you build a storage cluster quickly and with very little effort, and it is especially well suited to serving stateful workloads. Separating compute from storage simplifies application management and decouples the business layer from the cloud operating system layer more cleanly.


The YAML files, Dockerfiles, and other assets used in this article are in this repository, which includes the rook operator, ceph cluster, and storage class configs, the mysql and wordpress examples, and the fio Dockerfile and YAML used for performance testing.

Installation


git clone https://github.com/rook/rook

cd rook/cluster/examples/kubernetes/ceph
kubectl create -f operator.yaml

Check that the operator came up successfully:


[root@dev-86-201 ~]# kubectl get pod -n rook-ceph-system
NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-agent-5z6p7                 1/1     Running   0          88m
rook-ceph-agent-6rj7l                 1/1     Running   0          88m
rook-ceph-agent-8qfpj                 1/1     Running   0          88m
rook-ceph-agent-xbhzh                 1/1     Running   0          88m
rook-ceph-operator-67f4b8f67d-tsnf2   1/1     Running   0          88m
rook-discover-5wghx                   1/1     Running   0          88m
rook-discover-lhwvf                   1/1     Running   0          88m
rook-discover-nl5m2                   1/1     Running   0          88m
rook-discover-qmbx7                   1/1     Running   0          88m

Then create the Ceph cluster:

kubectl create -f cluster.yaml

Check the Ceph cluster:


[root@dev-86-201 ~]# kubectl get pod -n rook-ceph
NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-mgr-a-8649f78d9b-jklbv   1/1     Running   0          64m
rook-ceph-mon-a-5d7fcfb6ff-2wq9l   1/1     Running   0          81m
rook-ceph-mon-b-7cfcd567d8-lkqff   1/1     Running   0          80m
rook-ceph-mon-d-65cd79df44-66rgz   1/1     Running   0          79m
rook-ceph-osd-0-56bd7545bd-5k9xk   1/1     Running   0          63m
rook-ceph-osd-1-77f56cd549-7rm4l   1/1     Running   0          63m
rook-ceph-osd-2-6cf58ddb6f-wkwp6   1/1     Running   0          63m
rook-ceph-osd-3-6f8b78c647-8xjzv   1/1     Running   0          63m

Parameter notes:


apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # For the latest ceph images, see https://hub.docker.com/r/ceph/ceph/tags
    image: ceph/ceph:v13.2.2-20181023
  dataDirHostPath: /var/lib/rook # host path where Ceph data is stored
  mon:
    count: 3
    allowMultiplePerNode: true
  dashboard:
    enabled: true
  storage:
    useAllNodes: true
    useAllDevices: false
    config:
      databaseSizeMB: "1024"
      journalSizeMB: "1024"
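
If you also want to check cluster health from the command line, you can deploy the Rook toolbox pod and run ceph status inside it. A minimal sketch, assuming the toolbox.yaml shipped in the same examples directory and the app=rook-ceph-tools label (both can differ between Rook versions):

kubectl create -f toolbox.yaml
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph status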

Access the Ceph dashboard:


[root@dev-86-201 ~]# kubectl get svc -n rook-ceph
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
rook-ceph-mgr             ClusterIP   10.98.183.33     <none>        9283/TCP         66m
rook-ceph-mgr-dashboard   NodePort    10.103.84.48     <none>        8443:31631/TCP   66m  # change this service to NodePort
rook-ceph-mon-a           ClusterIP   10.99.71.227     <none>        6790/TCP         83m
rook-ceph-mon-b           ClusterIP   10.110.245.119   <none>        6790/TCP         82m
rook-ceph-mon-d           ClusterIP   10.101.79.159    <none>        6790/TCP         81m
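
To switch the dashboard service to NodePort (as noted in the listing above), kubectl edit works, or a one-line patch; a minimal sketch assuming the service name shown above:

kubectl -n rook-ceph patch svc rook-ceph-mgr-dashboard -p '{"spec": {"type": "NodePort"}}'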

Then visit https://10.1.86.201:31631.

The admin account is admin; retrieve its login password with:

kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o yaml | grep "password:" | awk '{print $2}' | base64 --decode

Usage

Creating a pool


apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool   # the operator watches this and creates the pool; once applied, it also shows up in the dashboard
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: rook-ceph-block    # this creates a StorageClass; reference it in a PVC to get dynamically provisioned PVs
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  # The value of "clusterNamespace" MUST be the same as the one in which your rook cluster exist
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs
# Optional, default reclaimPolicy is "Delete". Other options are: "Retain", "Recycle" as documented in https://kubernetes.io/docs/concepts/storage/storage-classes/
reclaimPolicy: Retain
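
To confirm that the operator actually created the pool, you can list pools from inside the Rook toolbox pod (a sketch, assuming the rook-ceph-tools pod from toolbox.yaml is running):

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph osd lspools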

Creating a PVC

In the cluster/examples/kubernetes directory, the project ships a WordPress example that you can run directly:


kubectl create -f mysql.yaml
kubectl create -f wordpress.yaml

Check the PVs and PVCs:


[root@dev-86-201 ~]# kubectl get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
mysql-pv-claim   Bound    pvc-a910f8c2-1ee9-11e9-84fc-becbfc415cde   20Gi       RWO            rook-ceph-block   144m
wp-pv-claim      Bound    pvc-af2dfbd4-1ee9-11e9-84fc-becbfc415cde   20Gi       RWO            rook-ceph-block   144m

[root@dev-86-201 ~]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                    STORAGECLASS      REASON   AGE
pvc-a910f8c2-1ee9-11e9-84fc-becbfc415cde   20Gi       RWO            Retain           Bound    default/mysql-pv-claim   rook-ceph-block            145m
pvc-af2dfbd4-1ee9-11e9-84fc-becbfc415cde   20Gi       RWO            Retain           Bound    default/wp-pv-claim      rook-ceph-block            145m

Take a look at the YAML:


apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
  labels:
    app: wordpress
spec:
  storageClassName: rook-ceph-block   # specify the StorageClass
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi  # request a 20G volume

...

        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-persistent-storage
        persistentVolumeClaim:
          claimName: mysql-pv-claim  # reference the PVC defined above

It really is that simple.

To access WordPress, change its service to NodePort; the upstream example uses LoadBalancer:


kubectl edit svc wordpress

[root@dev-86-201 kubernetes]# kubectl get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
wordpress         NodePort    10.109.30.99   <none>        80:30130/TCP   148m
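
A quick way to confirm WordPress is reachable is an HTTP probe against a node IP and the NodePort shown above; a sketch using the node address from earlier in this article (adjust to your own environment):

curl -I http://10.1.86.201:30130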

Accessing the Ceph cluster from outside

cluster.yaml has a setting for this: it can share the host network so that clients outside the cluster can connect to Ceph directly:


network:
    # toggle to use hostNetwork
    hostNetwork: false

Monitoring the Ceph cluster

With the prometheus operator and rook you can quickly set up monitoring for the Ceph cluster. The sealyun installer already ships the prometheus operator, so you can get straight to it.

Starting Prometheus for Ceph

Note that a dedicated Prometheus instance is started just for Ceph here. That is a reasonable design, since it relieves the load on a single shared Prometheus.


cd cluster/examples/kubernetes/ceph/monitoring
kubectl create -f service-monitor.yaml
kubectl create -f prometheus.yaml
kubectl create -f prometheus-service.yaml

Our Grafana runs on port 30000; first add a data source in Grafana.

Configure the data source as:

http://rook-prometheus.rook-ceph.svc.cluster.local:9090
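
If you prefer to script this instead of using the UI, the data source can also be added through the Grafana HTTP API; a sketch assuming Grafana's default admin:admin credentials and the node address used earlier (both are assumptions, adjust as needed):

curl -X POST http://admin:admin@10.1.86.201:30000/api/datasources \
  -H 'Content-Type: application/json' \
  -d '{"name": "rook-prometheus", "type": "prometheus", "access": "proxy", "url": "http://rook-prometheus.rook-ceph.svc.cluster.local:9090"}'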

Importing the dashboard

  

There are a few other dashboards you can import as well: Ceph - Cluster, Ceph - OSD, and Ceph - Pools.

Once again, the strength of this ecosystem is impressive.

Adding and removing nodes

kubectl edit cephcluster rook-ceph -n rook-ceph

Set useAllNodes to false and add the node names to the nodes list; after you save and exit, the Ceph nodes are added automatically.


nodes:
    - config: null
      name: izj6c3afuzdjhtkj156zt0z
      resources: {}
    - config: null
      name: izj6cg4wnagly61eny0hy9z
      resources: {}
    - config: null
      name: izj6cg4wnagly61eny0hyaz
      resources: {}
    useAllDevices: false

Removal works the same way: just edit the resource and delete the entry. Very powerful.
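
After saving, you can watch the new OSD pods come up on the added node; a sketch assuming the app=rook-ceph-osd label that Rook applies to its OSD pods:

kubectl -n rook-ceph get pod -l app=rook-ceph-osd -o wide -w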

Performance testing

This section focuses on the test method and gives the results for my particular setup; you should run your own tests against your own scenario.

Test environment

I used an Alibaba Cloud server with a 1000G disk attached, although the tests only touch a small part of it (testing against a large volume is too slow).


[root@izj6c3afuzdjhtkj156zt0z ~]# df -h|grep dev
devtmpfs        3.9G     0  3.9G    0% /dev
tmpfs           3.9G     0  3.9G    0% /dev/shm
/dev/vda1        40G   17G   21G   44% /
/dev/vdb1       985G   41G  895G    5% /data   
/dev/rbd0       296G  2.1G  294G    1% /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph-system/mounts/pvc-692e2be3-2434-11e9-aef7-00163e01e813

You can see one local disk and one disk mounted from Ceph.

First, build an fio image:


[root@izj6c3afuzdjhtkj156zt0z fio]# cat Dockerfile
FROM docker.io/centos:7.6.1810
RUN yum install -y fio
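
Since the Deployment below references fio:latest with imagePullPolicy: IfNotPresent, the image has to exist locally on whichever node the pod lands on (or be pushed to a registry your cluster can pull from). A minimal sketch:

docker build -t fio:latest .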

For the test container, slightly modify the mysql YAML:


apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
  labels:
    app: wordpress
spec:
  storageClassName: rook-ceph-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 300Gi
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: wordpress-mysql
  labels:
    app: wordpress
spec:
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: wordpress
        tier: mysql
    spec:
      containers:
      - image: fio:latest
        imagePullPolicy: IfNotPresent
        command: ["sleep", "10000000"]
        name: mysql
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-persistent-storage
        persistentVolumeClaim:
          claimName: mysql-pv-claim

Test procedure

Exec into the container and run the test tool:


$ kubectl exec -it wordpress-mysql-775c44887c-5vhx9 bash

# touch /var/lib/mysql/test # create the test file
# fio -filename=/var/lib/mysql/test -direct=1 -iodepth=128 -rw=randrw -ioengine=libaio -bs=4k -size=2G -numjobs=2 -runtime=100 -group_reporting -name=Rand_Write_Testing

Results:


Rand_Write_Testing: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.1
Starting 2 processes
Rand_Write_Testing: Laying out IO file (1 file / 2048MiB)
Jobs: 2 (f=2): [m(2)][100.0%][r=16.6MiB/s,w=17.2MiB/s][r=4240,w=4415 IOPS][eta 00m:00s]
Rand_Write_Testing: (groupid=0, jobs=2): err= 0: pid=34: Wed Jan 30 02:44:18 2019
   read: IOPS=3693, BW=14.4MiB/s (15.1MB/s)(1443MiB/100013msec)
    slat (usec): min=2, max=203952, avg=262.47, stdev=2107.16
    clat (msec): min=3, max=702, avg=30.85, stdev=30.97
     lat (msec): min=3, max=702, avg=31.11, stdev=31.21
    clat percentiles (msec):
     |  1.00th=[   12],  5.00th=[   14], 10.00th=[   16], 20.00th=[   18],
     | 30.00th=[   20], 40.00th=[   22], 50.00th=[   24], 60.00th=[   27],
     | 70.00th=[   30], 80.00th=[   36], 90.00th=[   46], 95.00th=[   64],
     | 99.00th=[  194], 99.50th=[  213], 99.90th=[  334], 99.95th=[  397],
     | 99.99th=[  502]
   bw (  KiB/s): min=  440, max=12800, per=49.98%, avg=7383.83, stdev=3004.90, samples=400
   iops        : min=  110, max= 3200, avg=1845.92, stdev=751.22, samples=400
  write: IOPS=3700, BW=14.5MiB/s (15.2MB/s)(1446MiB/100013msec)
    slat (usec): min=2, max=172409, avg=266.11, stdev=2004.53
    clat (msec): min=7, max=711, avg=37.85, stdev=37.52
     lat (msec): min=7, max=711, avg=38.12, stdev=37.72
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   19], 10.00th=[   21], 20.00th=[   23],
     | 30.00th=[   25], 40.00th=[   27], 50.00th=[   29], 60.00th=[   32],
     | 70.00th=[   36], 80.00th=[   42], 90.00th=[   53], 95.00th=[   73],
     | 99.00th=[  213], 99.50th=[  292], 99.90th=[  397], 99.95th=[  472],
     | 99.99th=[  600]
   bw (  KiB/s): min=  536, max=12800, per=49.98%, avg=7397.37, stdev=3000.10, samples=400
   iops        : min=  134, max= 3200, avg=1849.32, stdev=750.02, samples=400
  lat (msec)   : 4=0.01%, 10=0.22%, 20=22.18%, 50=68.05%, 100=5.90%
  lat (msec)   : 250=3.09%, 500=0.54%, 750=0.02%
  cpu          : usr=1.63%, sys=4.68%, ctx=311249, majf=0, minf=18
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=369395,370084,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=14.4MiB/s (15.1MB/s), 14.4MiB/s-14.4MiB/s (15.1MB/s-15.1MB/s), io=1443MiB (1513MB), run=100013-100013msec
  WRITE: bw=14.5MiB/s (15.2MB/s), 14.5MiB/s-14.5MiB/s (15.2MB/s-15.2MB/s), io=1446MiB (1516MB), run=100013-100013msec

Disk stats (read/write):
  rbd0: ios=369133/369841, merge=0/35, ticks=4821508/7373587, in_queue=12172273, util=99.93%

Test on the host:


$ touch /data/test
$ fio -filename=/data/test -direct=1 -iodepth=128 -rw=randrw -ioengine=libaio -bs=4k -size=2G -numjobs=2 -runtime=100 -group_reporting -name=Rand_Write_Testing

Rand_Write_Testing: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.1
Starting 2 processes
Rand_Write_Testing: Laying out IO file (1 file / 2048MiB)
Jobs: 2 (f=2): [m(2)][100.0%][r=19.7MiB/s,w=19.8MiB/s][r=5043,w=5056 IOPS][eta 00m:00s]
Rand_Write_Testing: (groupid=0, jobs=2): err= 0: pid=13588: Wed Jan 30 10:41:25 2019
   read: IOPS=5024, BW=19.6MiB/s (20.6MB/s)(1963MiB/100019msec)
    slat (usec): min=2, max=80213, avg=191.32, stdev=2491.48
    clat (usec): min=1022, max=176786, avg=19281.58, stdev=23666.08
     lat (usec): min=1031, max=176791, avg=19473.50, stdev=23757.08
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    4], 10.00th=[    5], 20.00th=[    6],
     | 30.00th=[    7], 40.00th=[    8], 50.00th=[    9], 60.00th=[   10],
     | 70.00th=[   12], 80.00th=[   25], 90.00th=[   67], 95.00th=[   73],
     | 99.00th=[   81], 99.50th=[   85], 99.90th=[   93], 99.95th=[   96],
     | 99.99th=[  104]
   bw (  KiB/s): min= 9304, max=10706, per=49.99%, avg=10046.66, stdev=243.04, samples=400
   iops        : min= 2326, max= 2676, avg=2511.62, stdev=60.73, samples=400
  write: IOPS=5035, BW=19.7MiB/s (20.6MB/s)(1967MiB/100019msec)
    slat (usec): min=3, max=76025, avg=197.61, stdev=2504.40
    clat (msec): min=2, max=155, avg=31.21, stdev=27.12
     lat (msec): min=2, max=155, avg=31.40, stdev=27.16
    clat percentiles (msec)
     |  1.00th=[    7],  5.00th=[    9], 10.00th=[   10], 20.00th=[   11],
     | 30.00th=[   13], 40.00th=[   14], 50.00th=[   15], 60.00th=[   18],
     | 70.00th=[   48], 80.00th=[   68], 90.00th=[   75], 95.00th=[   80],
     | 99.00th=[   88], 99.50th=[   93], 99.90th=[  102], 99.95th=[  104],
     | 99.99th=[  112]
   bw (  KiB/s): min= 9208, max=10784, per=49.98%, avg=10066.14, stdev=214.26, samples=400
   iops        : min= 2302, max= 2696, avg=2516.50, stdev=53.54, samples=400
  lat (msec)   : 2=0.04%, 4=3.09%, 10=34.76%, 20=33.31%, 50=5.05%
  lat (msec)   : 100=23.67%, 250=0.08%
  cpu          : usr=1.54%, sys=5.80%, ctx=286367, majf=0, minf=20
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=502523,503598,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=19.6MiB/s (20.6MB/s), 19.6MiB/s-19.6MiB/s (20.6MB/s-20.6MB/s), io=1963MiB (2058MB), run=100019-100019msec
  WRITE: bw=19.7MiB/s (20.6MB/s), 19.7MiB/s-19.7MiB/s (20.6MB/s-20.6MB/s), io=1967MiB (2063MB), run=100019-100019msec

Disk stats (read/write):
  vdb: ios=502513/504275, merge=0/658, ticks=4271124/7962372, in_queue=11283349, util=92.48%

Here the random read/write performance loss is a bit over 27%. This number has limited general value; you should test against your own real scenario.

Switching Ceph to host networking and re-running the test gave similar results, with no performance improvement.

To rule out any effect of testing inside a container, you can also test the block device directly on the host. The approach is simple: mount the device somewhere else, for example:


umount /dev/rbd0
mkdir /data1
mount /dev/rbd0 /data1
touch /data1/test  # then run the test against this file; my results were about the same as inside the container

BlueStore mode

Deploy Ceph on raw disks directly, without partitions or filesystems:


storage:
  useAllNodes: false
  useAllDevices: false
  deviceFilter:
  location:
  config:
    storeType: bluestore
  nodes:
  - name: "ke-dev1-worker1"
    devices:
    - name: "vde"
  - name: "ke-dev1-worker3"
    devices:
    - name: "vde"
  - name: "ke-dev1-worker4"
    devices:
    - name: "vdf"

BlueStore mode can noticeably improve Ceph performance; in my tests, random read/write performance improved by about 8%:


Rand_Write_Testing: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.1
Starting 2 processes
Rand_Write_Testing: Laying out IO file (1 file / 2048MiB)
Jobs: 2 (f=2): [m(2)][100.0%][r=16.4MiB/s,w=16.7MiB/s][r=4189,w=4284 IOPS][eta 00m:00s]]
Rand_Write_Testing: (groupid=0, jobs=2): err= 0: pid=25: Thu Jan 31 11:37:39 2019
   read: IOPS=3990, BW=15.6MiB/s (16.3MB/s)(1566MiB/100464msec)
    slat (usec): min=2, max=246625, avg=239.16, stdev=1067.33
    clat (msec): min=2, max=493, avg=27.68, stdev=16.71
     lat (msec): min=2, max=493, avg=27.92, stdev=16.75
    clat percentiles (msec):
     |  1.00th=[   12],  5.00th=[   15], 10.00th=[   16], 20.00th=[   18],
     | 30.00th=[   20], 40.00th=[   22], 50.00th=[   24], 60.00th=[   27],
     | 70.00th=[   30], 80.00th=[   35], 90.00th=[   45], 95.00th=[   54],
     | 99.00th=[   74], 99.50th=[   84], 99.90th=[  199], 99.95th=[  334],
     | 99.99th=[  485]
   bw (  KiB/s): min= 2118, max=10456, per=50.20%, avg=8013.40, stdev=1255.78, samples=400
   iops        : min=  529, max= 2614, avg=2003.31, stdev=313.95, samples=400
  write: IOPS=3997, BW=15.6MiB/s (16.4MB/s)(1569MiB/100464msec)
    slat (usec): min=3, max=273211, avg=246.87, stdev=1026.98
    clat (msec): min=11, max=499, avg=35.90, stdev=18.04
     lat (msec): min=12, max=506, avg=36.15, stdev=18.08
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   22], 10.00th=[   23], 20.00th=[   26],
     | 30.00th=[   28], 40.00th=[   30], 50.00th=[   33], 60.00th=[   35],
     | 70.00th=[   39], 80.00th=[   44], 90.00th=[   54], 95.00th=[   63],
     | 99.00th=[   85], 99.50th=[   95], 99.90th=[  309], 99.95th=[  351],
     | 99.99th=[  489]
   bw (  KiB/s): min= 2141, max=10163, per=50.20%, avg=8026.23, stdev=1251.78, samples=400
   iops        : min=  535, max= 2540, avg=2006.51, stdev=312.94, samples=400
  lat (msec)   : 4=0.01%, 10=0.10%, 20=16.63%, 50=73.71%, 100=9.25%
  lat (msec)   : 250=0.22%, 500=0.09%
  cpu          : usr=1.85%, sys=5.60%, ctx=366744, majf=0, minf=17
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwt: total=400928,401597,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=15.6MiB/s (16.3MB/s), 15.6MiB/s-15.6MiB/s (16.3MB/s-16.3MB/s), io=1566MiB (1642MB), run=100464-100464msec
  WRITE: bw=15.6MiB/s (16.4MB/s), 15.6MiB/s-15.6MiB/s (16.4MB/s-16.4MB/s), io=1569MiB (1645MB), run=100464-100464msec

Disk stats (read/write):
  rbd0: ios=400921/401817, merge=0/50, ticks=4341605/7883816, in_queue=12217335, util=99.96%

Summary

Distributed storage plays a very important role in a container cluster. A core idea of running a container cluster is to treat the cluster as a single whole; if you still care about individual hosts, such as scheduling to a specific node or mounting a particular node's directory, you will never get the full power of the cloud. Once compute and storage are separated, workloads can truly float anywhere, which is a huge win for cluster maintenance.

For example, when machines go out of warranty and need to be decommissioned, a cloud-style architecture with no single point of dependence only needs the node to be drained and taken offline, without caring what business runs on it; both stateful and stateless workloads recover automatically. The biggest remaining challenge is probably the performance of distributed storage. For scenarios without strict performance requirements, I strongly recommend this compute/storage-separated architecture.

Common issues

Make sure the hosts' clocks are synchronized:

ntpdate 0.asia.pool.ntp.org

The Ceph cluster fails to start

It reports this error:


$ kubectl logs rook-ceph-mon-a-c5f54799f-rd7s4 -n rook-ceph
2019-01-27 11:04:59.985 7f0a34a4f140 -1 rocksdb: Invalid argument: /var/lib/rook/mon-a/data/store.db: does not exist (create_if_missing is false)
2019-01-27 11:04:59.985 7f0a34a4f140 -1 rocksdb: Invalid argument: /var/lib/rook/mon-a/data/store.db: does not exist (create_if_missing is false)
2019-01-27 11:04:59.985 7f0a34a4f140 -1 error opening mon data directory at '/var/lib/rook/mon-a/data': (22) Invalid argument
2019-01-27 11:04:59.985 7f0a34a4f140 -1 error opening mon data directory at '/var/lib/rook/mon-a/data': (22) Invalid argument

Delete the store.db file on the host and then delete the pod. Be careful not to point at the wrong directory if you have changed the data path.

rm -rf  /var/lib/rook/mon-a/data/store.db
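
After removing the file, delete the affected mon pod so it gets recreated; a sketch assuming the standard app=rook-ceph-mon label (you can also delete just the single pod from the log above):

kubectl -n rook-ceph delete pod -l app=rook-ceph-mon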

The PV still exists after deleting the PVC created by Rook

This is caused by the StorageClass reclaim policy:

reclaimPolicy: Retain  # set this to Delete
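
Changing the StorageClass only affects newly provisioned volumes. For a PV that already exists with Retain, its reclaim policy can be patched directly; a sketch using one of the PV names listed earlier (substitute your own):

kubectl patch pv pvc-a910f8c2-1ee9-11e9-84fc-becbfc415cde -p '{"spec": {"persistentVolumeReclaimPolicy": "Delete"}}'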

The rook-ceph namespace and the cephclusters.ceph.rook.io CRD cannot be deleted


[root@izj6c3afuzdjhtkj156zt0z ~]# kubectl get ns
NAME          STATUS        AGE
default       Active        18h
kube-public   Active        18h
kube-system   Active        18h
monitoring    Active        18h
rook-ceph     Terminating   17h

Delete the entries under finalizers in the CRD metadata and the CRD will be removed automatically; once nothing is left in the rook-ceph namespace, it is cleaned up automatically as well.

$ kubectl edit crd cephclusters.ceph.rook.io
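
The same edit can be done non-interactively by emptying the finalizer list with a patch; a sketch equivalent to the kubectl edit above:

kubectl patch crd cephclusters.ceph.rook.io --type merge -p '{"metadata": {"finalizers": []}}'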

The cluster does not start properly when using host networking

On a single-node cluster, set the mon count to 1:


mon:
    count: 1
    allowMultiplePerNode: true

  network:
    # toggle to use hostNetwork
    hostNetwork: true

For discussion, you can join QQ group 98488045.


