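Deploying rook-ceph on Kubernetes

Reposted from: https://blog.51cto.com/bigboss/2320016?source=drha

Introduction

Rook website: https://rook.io
Rook is an incubating project of the Cloud Native Computing Foundation (CNCF).
Rook is an open-source cloud-native storage orchestrator for Kubernetes. It provides the platform, framework, and support that let a variety of storage solutions integrate natively with cloud-native environments.
As for Ceph, its website is https://ceph.com/
I have never managed to get the Helm deployment that Ceph officially provides to work, so I switched to the approach that Rook provides.

Original on Youdao Notes: http://note.youdao.com/noteshare?id=281719f1f0374f787effc90067e0d5ad&sub=0B59EA339D4A4769B55F008D72C1A4C0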

Environment

CentOS 7.5
kernel 4.18.7-1.el7.elrepo.x86_64

docker 18.06
kubernetes v1.12.2
    Deployed with kubeadm:
        Network: canal
        DNS: coredns
    Cluster members:
    192.168.1.1 kube-master
    192.168.1.2 kube-node1
    192.168.1.3 kube-node2
    192.168.1.4 kube-node3
    192.168.1.5 kube-node4

Prepare a 200 GB disk on every worker node: /dev/sdb

Preparation

Enable ip_forward on all nodes

cat <<EOF > /etc/sysctl.d/ceph.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
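
Before moving on, it can help to confirm that the file really contains all three settings. The following is a minimal sketch, not part of the original procedure: `check_forwarding_conf` is a hypothetical helper name, and it only inspects the file written above rather than the live kernel values. Note that the net.bridge.* keys exist in the kernel only while the br_netfilter module is loaded.

```shell
# check_forwarding_conf is a hypothetical helper (not part of Rook or sysctl).
# It reports every required key that is not set to 1 in the given file
# and returns non-zero if any key is missing.
check_forwarding_conf() {
  local conf="$1" key missing=0
  for key in net.ipv4.ip_forward \
             net.bridge.bridge-nf-call-ip6tables \
             net.bridge.bridge-nf-call-iptables; do
    # Split each line on "=", strip blanks, and look for "key = 1".
    if ! awk -F= -v k="$key" '
           { name = $1; value = $2
             gsub(/[ \t]/, "", name); gsub(/[ \t]/, "", value) }
           name == k && value == "1" { found = 1 }
           END { exit !found }
         ' "$conf"; then
      echo "missing or not 1: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on a node:
#   modprobe br_netfilter
#   check_forwarding_conf /etc/sysctl.d/ceph.conf && sysctl net.ipv4.ip_forward
```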

Deploying the Operator

Deploy the Rook Operator

# Unless stated otherwise, every command is run on the master
cd $HOME
git clone https://github.com/rook/rook.git
cd rook
cd cluster/examples/kubernetes/ceph
kubectl apply -f operator.yaml

Check the Operator's status

# Wait a moment after running apply.
# The operator creates two pods on every host in the cluster: rook-discover and rook-ceph-agent
kubectl -n rook-ceph-system get pod -o wide
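
Rather than eyeballing the list, the number of pods that are not yet Running can be counted. This is a small sketch under the assumption that the input is the default `kubectl get pod --no-headers` table, where the third column is STATUS; `count_not_running` is a hypothetical helper name.

```shell
# count_not_running is a hypothetical helper: it reads the output of
# `kubectl get pod --no-headers` on stdin and prints how many pods are
# not in the Running state (column 3 is STATUS).
count_not_running() {
  awk '$3 != "Running" { n++ } END { print n + 0 }'
}

# Usage against the cluster:
#   kubectl -n rook-ceph-system get pod --no-headers | count_not_running
```

A result of 0 means every operator pod has reached Running.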

Labeling the nodes

Label the nodes that will run ceph-mon with ceph-mon=enabled

kubectl label nodes {kube-node1,kube-node2,kube-node3} ceph-mon=enabled

Label the nodes that will run ceph-osd, that is, the storage nodes, with ceph-osd=enabled

kubectl label nodes {kube-node1,kube-node2,kube-node3} ceph-osd=enabled

Label the node that will run ceph-mgr with ceph-mgr=enabled

# mgr can only run on a single node; this is a limitation of running Ceph on k8s
kubectl label nodes kube-node1 ceph-mgr=enabled

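Configuring the cluster.yaml file

Official reference for the configuration file: https://rook.io/docs/rook/v0.8/ceph-cluster-crd.html
A few settings in the file need attention:
- dataDirHostPath: this path is created on the host and holds the Ceph configuration files. When re-creating the cluster, make sure this directory is empty, otherwise the mons will fail to start.
- useAllDevices: use every device. Setting it to false is recommended, otherwise Rook will take over every available disk on the host.
- useAllNodes: use every node. Setting it to false is recommended, since you would hardly build Ceph on every node in the k8s cluster.
- databaseSizeMB and journalSizeMB: when the disks are larger than 100 GB, just comment out these two settings.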
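
Since a leftover dataDirHostPath keeps the mons from starting, a quick check on each storage host before re-creating the cluster may save some debugging. This is only a sketch: `dir_is_empty` is a hypothetical helper name, and /var/lib/rook is the dataDirHostPath value used in this article.

```shell
# dir_is_empty is a hypothetical helper: it succeeds when the directory
# does not exist or contains no entries (dotfiles included).
dir_is_empty() {
  [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# Usage on each storage host before re-creating the cluster:
#   dir_is_empty /var/lib/rook || echo "/var/lib/rook still has data"
```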
The cluster.yaml used in this experiment is as follows:

apiVersion: v1
kind: Namespace
metadata:
  name: rook-ceph
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: [ "get", "list", "watch", "create", "update", "delete" ]
---
# Allow the operator to create resources in this cluster's namespace
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster-mgmt
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-ceph-cluster-mgmt
subjects:
- kind: ServiceAccount
  name: rook-ceph-system
  namespace: rook-ceph-system
---
# Allow the pods in this namespace to work with configmaps
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rook-ceph-cluster
subjects:
- kind: ServiceAccount
  name: rook-ceph-cluster
  namespace: rook-ceph
---
apiVersion: ceph.rook.io/v1beta1
kind: Cluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw).
    # v12 is luminous, v13 is mimic, and v14 is nautilus.
    # RECOMMENDATION: In production, use a specific version tag instead of the general v13 flag, which pulls the latest release and could result in different
    # versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/.
    image: ceph/ceph:v13
    # Whether to allow unsupported versions of Ceph. Currently only luminous and mimic are supported.
    # After nautilus is released, Rook will be updated to support nautilus.
    # Do not set to true in production.
    allowUnsupported: false
  # The path on the host where configuration files will be persisted. If not specified, a kubernetes emptyDir will be created (not recommended).
  # Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster.
  # In Minikube, the '/data' directory is configured to persist across reboots. Use "/data/rook" in Minikube environment.
  dataDirHostPath: /var/lib/rook
  # The service account under which to run the daemon pods in this cluster if the default account is not sufficient (OSDs)
  serviceAccount: rook-ceph-cluster
  # set the amount of mons to be started
  # count sets how many ceph-mon instances run; the default of three is fine here
  mon:
    count: 3
    allowMultiplePerNode: true
  # enable the ceph dashboard for viewing cluster status
  dashboard:
    enabled: true
    # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
    # urlPrefix: /ceph-dashboard
  network:
    # toggle to use hostNetwork
    # Use the host's network for communication.
    # Using the host network apparently lets hosts outside the cluster mount Ceph,
    # but I have not tried it; if you are interested, try changing this to true.
    # Ceph is only used inside the cluster here, so I leave it unchanged.
    hostNetwork: false
  # To control where various services will be scheduled by kubernetes, use the placement configuration sections below.
  # The example under 'all' would have all services scheduled on kubernetes nodes labeled with 'role=storage-node' and
  # tolerate taints with a key of 'storage-node'.
  placement:
#    all:
#      nodeAffinity:
#        requiredDuringSchedulingIgnoredDuringExecution:
#          nodeSelectorTerms:
#          - matchExpressions:
#            - key: role
#              operator: In
#              values:
#              - storage-node
#      podAffinity:
#      podAntiAffinity:
#      tolerations:
#      - key: storage-node
#        operator: Exists
# The above placement information can also be specified for mon, osd, and mgr components
#    mon:
#    osd:
#    mgr:
# nodeAffinity: restricts pods to specific nodes by selecting on node labels
# Restricting placement is recommended so that these pods do not wander between nodes
    mon:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-mon
              operator: In
              values:
              - enabled
    osd:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-osd
              operator: In
              values:
              - enabled
    mgr:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-mgr
              operator: In
              values:
              - enabled
  resources:
# The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory
#    mgr:
#      limits:
#        cpu: "500m"
#        memory: "1024Mi"
#      requests:
#        cpu: "500m"
#        memory: "1024Mi"
# The above example requests/limits can also be added to the mon and osd components
#    mon:
#    osd:
  storage: # cluster level storage configuration and selection
    useAllNodes: false
    useAllDevices: false
    deviceFilter:
    location:
    config:
      # The default and recommended storeType is dynamically set to bluestore for devices and filestore for directories.
      # Set the storeType explicitly only if it is required not to use the default.
      # storeType: bluestore
      # databaseSizeMB: "1024" # this value can be removed for environments with normal sized disks (100 GB or larger)
      # journalSizeMB: "1024"  # this value can be removed for environments with normal sized disks (20 GB or larger)
# Cluster level list of directories to use for storage. These values will be set for all nodes that have no `directories` set.
#    directories:
#    - path: /rook/storage-dir
# Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
# nodes below will be used as storage resources.  Each node's 'name' field should match their 'kubernetes.io/hostname' label.
# Recommended disk configuration:
# name: selects a node; the value is the node's kubernetes.io/hostname label, i.e. the name shown by kubectl get nodes
# devices: selects the disks to use as OSDs
# - name: "sdb": use /dev/sdb as an OSD
    nodes:
    - name: "kube-node1"
      devices:
      - name: "sdb"
    - name: "kube-node2"
      devices:
      - name: "sdb"
    - name: "kube-node3"
      devices:
      - name: "sdb"
#      directories: # specific directories to use for storage can be specified for each node
#      - path: "/rook/storage-dir"
#      resources:
#        limits:
#          cpu: "500m"
#          memory: "1024Mi"
#        requests:
#          cpu: "500m"
#          memory: "1024Mi"
#    - name: "172.17.4.201"
#      devices: # specific devices to use for storage can be specified for each node
#      - name: "sdb"
#      - name: "sdc"
#      config: # configuration can be specified at the node level which overrides the cluster level config
#        storeType: filestore
#    - name: "172.17.4.301"
#      deviceFilter: "^sd."

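Deploying Ceph

Deploy Ceph

kubectl apply -f cluster.yaml
# The cluster creates its resources in the rook-ceph namespace
# Watch the pods in this namespace and you will see them being created in order
kubectl -n rook-ceph get pod -o wide -w
# Once every pod is Running, the deployment is done
# Note which hosts the pods land on: they match the hosts we labeled
kubectl -n rook-ceph get pod -o wide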

Switch to the other hosts and check the disks
- Switch to kube-node3
```
lsblk
```
- Switch to kube-node1
```
lsblk
```

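Configuring the Ceph dashboard

Check which service the dashboard is exposed on

kubectl -n rook-ceph get service
# You can see that the dashboard listens on port 8443

Create a NodePort service so the dashboard can be reached from outside the cluster

kubectl apply -f dashboard-external-https.yaml
# Check which port the NodePort was assigned
ss -tanl
kubectl -n rook-ceph get service

Find the Dashboard login account and password

MGR_POD=$(kubectl get pod -n rook-ceph | grep mgr | awk '{print $1}')
kubectl -n rook-ceph logs $MGR_POD | grep password

Open a browser and enter the IP of any node plus the NodePort
In my case that is: https://192.168.1.2:30290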

Configuring Ceph as a StorageClass

The project provides a sample file: storageclass.yaml
This file uses RBD block storage
Pool creation reference: https://rook.io/docs/rook/v0.8/ceph-pool-crd.html

apiVersion: ceph.rook.io/v1beta1
kind: Pool
metadata:
  # This name becomes the name of the Ceph pool once it is created
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 1  # number of copies of the data in the pool; 1 means no extra replicas are kept
  failureDomain: osd
  #  failureDomain: the failure domain for data chunks
  #  with the value host, the copies of each chunk are placed on different hosts
  #  with the value osd, the copies of each chunk are placed on different OSDs
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph   # name of the StorageClass; PVCs reference this name
provisioner: ceph.rook.io/block
parameters:
  pool: replicapool
  # Specify the namespace of the rook cluster from which to create volumes.
  # If not specified, it will use `rook` as the default namespace of the cluster.
  # This is also the namespace where the cluster will be
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs
# Set the reclaim policy to Retain (a StorageClass defaults to Delete)
reclaimPolicy: Retain

Create the StorageClass

kubectl apply -f storageclass.yaml
kubectl get storageclasses.storage.k8s.io -n rook-ceph
kubectl describe storageclasses.storage.k8s.io -n rook-ceph

Create an nginx pod and try mounting the volume

cat << EOF > nginx.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-pvc
spec:
  accessModes:
    # an RBD block volume is mounted read-write by one node at a time
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph

---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    name: nginx-port
    targetPort: 80
    protocol: TCP

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: /html
          name: http-file
      volumes:
      - name: http-file
        persistentVolumeClaim:
          claimName: nginx-pvc
EOF

kubectl apply -f nginx.yaml

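Check whether the PV and PVC were created

kubectl get pv,pvc
# Check that the nginx pod is running as well
kubectl get pod

Delete the pod and see whether the PV still exists

kubectl delete -f nginx.yaml

kubectl get pv,pvc
# The pod and the PVC have been deleted, but the PV is still there, because the StorageClass sets reclaimPolicy: Retain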
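
Retained volumes accumulate until someone removes them, so it can be handy to list the ones whose claim is gone. The sketch below assumes the default `kubectl get pv --no-headers` table, where STATUS is the fifth column; `released_pvs` is a hypothetical helper name. Keep in mind that with the Retain policy, deleting the PV object does not clean up the backing storage, which has to be reclaimed by hand.

```shell
# released_pvs is a hypothetical helper: it reads `kubectl get pv --no-headers`
# output on stdin and prints the names of volumes whose STATUS (column 5)
# is Released, i.e. kept after their claim was deleted.
released_pvs() {
  awk '$5 == "Released" { print $1 }'
}

# Usage:
#   kubectl get pv --no-headers | released_pvs
# Each listed volume can then be removed with: kubectl delete pv <name>
```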

Adding a new OSD to the cluster

This time we add node4 to the cluster. Label it first:

kubectl label nodes kube-node4 ceph-osd=enabled

Edit cluster.yaml again

# Add node4's entry to the existing file
cd $HOME/rook/cluster/examples/kubernetes/ceph/
vi cluster.yaml
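
The entry for node4 mirrors the existing ones under storage.nodes. As a sketch, assuming node4's data disk is also /dev/sdb as stated in the environment section, the stanza can be written to a scratch file and pasted in (node4-osd.yaml is a hypothetical file name; the indentation matches the other node entries):

```shell
# Write the stanza for node4 to a scratch file so it can be pasted
# under storage.nodes in cluster.yaml, after the kube-node3 entry.
cat <<'EOF' > node4-osd.yaml
    - name: "kube-node4"
      devices:
      - name: "sdb"
EOF
cat node4-osd.yaml
```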

Apply the cluster.yaml file

kubectl apply -f cluster.yaml
# Watch the rook-ceph namespace; the cluster adds node4 automatically
kubectl -n rook-ceph get pod -o wide -w
kubectl -n rook-ceph get pod -o wide

Go to node4 and check the disk

lsblk

Open the dashboard again and take a look

Removing a node

Remove node3's label

kubectl label nodes kube-node3 ceph-osd-

Edit cluster.yaml again

# Remove node3's entry
cd $HOME/rook/cluster/examples/kubernetes/ceph/
vi cluster.yaml

Apply the cluster.yaml file

kubectl apply -f cluster.yaml
# Watch the rook-ceph namespace
kubectl -n rook-ceph get pod -o wide -w
kubectl -n rook-ceph get pod -o wide
# Finally, remember to delete the /var/lib/rook directory on that host

Common issues

Official troubleshooting guide: https://rook.io/docs/rook/v0.8/common-issues.html
After a machine reboots, the OSD pod cannot reach Running and restarts endlessly

# Fix:
# Drain the node
kubectl drain <node-name> --ignore-daemonsets --delete-local-data
# Then bring it back
kubectl uncordon <node-name>
