K8S Internals: Storage Principles and Practice

Outline

  • K8S Storage: Overall Framework and Principles
  • K8S FlexVolume Storage Extension Mechanism
  • K8S CSI Storage Extension Mechanism

K8S Storage: Overall Framework and Principles

Docker Plugin Mechanism: Architecture and Assessment

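Advantages:

  • 1) Extends Docker with authorization, network, volume, and other functionality without recompiling Docker;
  • 2) Docker interacts with plugins through an HTTP-based, JSON-RPC-style interface;
  • 3) Supports multiple deployment forms, both containerized and non-containerized;
  • 4) Supports plugin lifecycle management: 1.52+ uses the docker plugin command and its API; <1.52 uses docker volume/network;
  • 5) Supports TLS-based security hardening.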

Constraints:

  • Adding certain plugins (such as authorization plugins) requires restarting the Docker daemon
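
To make the HTTP/JSON exchange concrete, here is a minimal sketch of the request handling inside a volume plugin. The endpoint names (/Plugin.Activate, /VolumeDriver.Create, /VolumeDriver.Mount) and the response fields follow the Docker volume plugin protocol as I understand it. The in-memory volume table and the /mnt/demo mount root are assumptions made for illustration, and the HTTP server that would call handle() is left out.

```python
import json

# Hypothetical in-memory volume table; a real plugin would call a storage backend.
VOLUMES = {}


def handle(path, body):
    """Handle one plugin request: `path` is the HTTP path, `body` the JSON request text."""
    req = json.loads(body) if body else {}
    if path == "/Plugin.Activate":
        # Handshake: the plugin announces which plugin APIs it implements.
        resp = {"Implements": ["VolumeDriver"]}
    elif path == "/VolumeDriver.Create":
        # Assumed mount root, for illustration only.
        VOLUMES[req["Name"]] = "/mnt/demo/" + req["Name"]
        resp = {"Err": ""}
    elif path == "/VolumeDriver.Mount":
        name = req["Name"]
        if name in VOLUMES:
            resp = {"Mountpoint": VOLUMES[name], "Err": ""}
        else:
            resp = {"Err": "no such volume: " + name}
    else:
        resp = {"Err": "unsupported endpoint: " + path}
    return json.dumps(resp)
```

Because every call is a JSON document over HTTP, the plugin can live in a separate process or container, which is why Docker does not need to be recompiled to add one.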

Docker Volume Plugin List

| Name | Description | URL |
|---|---|---|
| Azure File Storage plugin | Lets you mount Microsoft Azure File Storage shares to Docker containers as volumes using the SMB 3.0 protocol. | https://github.com/Azure/azurefile-dockervolumedriver |
| BeeGFS Volume Plugin | An open source volume plugin to create persistent volumes in a BeeGFS parallel file system. | https://github.com/RedCoolBeans/docker-volume-beegfs |
| Blockbridge plugin | A volume plugin that provides access to an extensible set of container-based persistent storage options. It supports single and multi-host Docker environments with features that include tenant isolation, automated provisioning, encryption, secure deletion, snapshots and QoS. | https://github.com/blockbridge/blockbridge-docker-volume |
| Contiv Volume Plugin | An open source volume plugin that provides multi-tenant, persistent, distributed storage with intent based consumption. It has support for Ceph and NFS. | https://github.com/contiv/volplugin |
| DigitalOcean Block Storage plugin | Integrates DigitalOcean's block storage solution into the Docker ecosystem by automatically attaching a given block storage volume to a DigitalOcean droplet and making the contents of the volume available to Docker containers running on that droplet. | https://github.com/omallo/docker-volume-plugin-dostorage |
| DRBD plugin | A volume plugin that provides highly available storage replicated by DRBD. Data written to the docker volume is replicated in a cluster of DRBD nodes. | https://www.drbd.org/en/supported-projects/docker |
| Flocker plugin | A volume plugin that provides multi-host portable volumes for Docker, enabling you to run databases and other stateful containers and move them around across a cluster of machines. | https://clusterhq.com/docker-plugin/ |
| Fuxi Volume Plugin | A volume plugin that is developed as part of the OpenStack Kuryr project and implements the Docker volume plugin API by utilizing Cinder, the OpenStack block storage service. | https://github.com/openstack/fuxi |
| gce-docker plugin | A volume plugin able to attach, format and mount Google Compute persistent-disks. | https://github.com/mcuadros/gce-docker |
| GlusterFS plugin | A volume plugin that provides multi-host volumes management for Docker using GlusterFS. | https://github.com/calavera/docker-volume-glusterfs |
| Horcrux Volume Plugin | A volume plugin that allows on-demand, version controlled access to your data. Horcrux is an open-source plugin, written in Go, and supports SCP, Minio and Amazon S3. | https://github.com/muthu-r/horcrux |
| HPE 3Par Volume Plugin | A volume plugin that supports HPE 3Par and StoreVirtual iSCSI storage arrays. | https://github.com/hpe-storage/python-hpedockerplugin/ |
| Infinit volume plugin | A volume plugin that makes it easy to mount and manage Infinit volumes using Docker. | https://infinit.sh/documentation/docker/volume-plugin |
| IPFS Volume Plugin | An open source volume plugin that allows using an ipfs filesystem as a volume. | http://github.com/vdemeester/docker-volume-ipfs |
| Keywhiz plugin | A plugin that provides credentials and secret management using Keywhiz as a central repository. | https://github.com/calavera/docker-volume-keywhiz |
| Local Persist Plugin | A volume plugin that extends the default local driver's functionality by allowing you to specify a mountpoint anywhere on the host, which enables the files to always persist, even if the volume is removed via docker volume rm. | https://github.com/CWSpear/local-persist |
| NetApp Plugin (nDVP) | A volume plugin that provides direct integration with the Docker ecosystem for the NetApp storage portfolio. The nDVP package supports the provisioning and management of storage resources from the storage platform to Docker hosts, with a robust framework for adding additional platforms in the future. | https://github.com/NetApp/netappdvp |
| Netshare plugin | A volume plugin that provides volume management for NFS 3/4, AWS EFS and CIFS file systems. | https://github.com/ContainX/docker-volume-netshare |
| Nimble Storage Volume Plugin | A volume plug-in that integrates with Nimble Storage Unified Flash Fabric arrays. The plug-in abstracts array volume capabilities to the Docker administrator to allow self-provisioning of secure multi-tenant volumes and clones. | https://connect.nimblestorage.com/community/app-integration/docker |
| OpenStorage Plugin | A cluster-aware volume plugin that provides volume management for file and block storage solutions. It implements a vendor neutral specification for implementing extensions such as CoS, encryption, and snapshots. It has example drivers based on FUSE, NFS, NBD and EBS to name a few. | https://github.com/libopenstorage/openstorage |
| Portworx Volume Plugin | A volume plugin that turns any server into a scale-out converged compute/storage node, providing container granular storage and highly available volumes across any node, using a shared-nothing storage backend that works with any docker scheduler. | https://github.com/portworx/px-dev |
| Quobyte Volume Plugin | A volume plugin that connects Docker to Quobyte's data center file system, a general-purpose scalable and fault-tolerant storage platform. | https://github.com/quobyte/docker-volume |
| REX-Ray plugin | A volume plugin which is written in Go and provides advanced storage functionality for many platforms including VirtualBox, EC2, Google Compute Engine, OpenStack, and EMC. | https://github.com/emccode/rexray |
| Virtuozzo Storage and Ploop plugin | A volume plugin with support for Virtuozzo Storage distributed cloud file system as well as ploop devices. | https://github.com/virtuozzo/docker-volume-ploop |
| VMware vSphere Storage Plugin | Docker Volume Driver for vSphere enables customers to address persistent storage requirements for Docker containers in vSphere environments. | https://github.com/vmware/docker-volume-vsphere |

Platforms and Storage Services That Support Kubernetes

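K8S Storage Capabilities: Volume Overview

  • A plain Volume in K8S provides the ability to mount a volume into a container. It is not an independent K8S resource object and cannot be managed (created, deleted, and so on) through K8S; it can only be referenced when a Pod is created.
  • A Pod can use a Volume only after both the volume source (spec.volumes) and the mount point (spec.containers[].volumeMounts) have been set.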
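
As a sketch of how the two settings fit together, the manifest below is written as a Python dictionary (the same structure would normally be YAML). The Pod name, container name, image, and volume name are made up for illustration. The helper lists any volumeMounts entry whose name has no matching entry in spec.volumes, which is the pairing described above.

```python
# A Pod manifest as a plain dictionary; all names are illustrative.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo"},
    "spec": {
        # Volume source: what backs the volume.
        "volumes": [{"name": "cache", "emptyDir": {}}],
        "containers": [
            {
                "name": "app",
                "image": "busybox",
                # Mount point: where the volume appears inside the container.
                "volumeMounts": [{"name": "cache", "mountPath": "/cache"}],
            }
        ],
    },
}


def unresolved_mounts(pod):
    """Return (container, volume) pairs whose volume is not declared in spec.volumes."""
    declared = {v["name"] for v in pod["spec"].get("volumes", [])}
    return [
        (c["name"], m["name"])
        for c in pod["spec"]["containers"]
        for m in c.get("volumeMounts", [])
        if m["name"] not in declared
    ]
```

An empty result means every mount point refers to a declared volume source.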

K8S Storage Capabilities: In-Tree Volume Plugins

The K8S VolumePlugin mechanism provides pluggable storage extension. Plugins are divided into built-in (in-tree) plugins and external (out-of-tree) plugins.

| Name | Description |
|---|---|
| awsElasticBlockStore | Mounts an Amazon Web Services (AWS) EBS Volume (Elastic Block Store). |
| azureDisk | Used to mount a Microsoft Azure Data Disk into a Pod. |
| azureFile | Used to mount a Microsoft Azure File Volume (SMB 2.1 and 3.0) into a Pod. |
| cephfs | Allows an existing CephFS volume to be mounted into your pod. |
| cinder | Used to mount OpenStack Block Storage into a pod. |
| configMap | The data stored in a ConfigMap object can be referenced in a volume of type configMap and then consumed by containerized applications running in a Pod. |
| downwardAPI | Used to make downward API data available to applications. It mounts a directory and writes the requested data in plain text files. |
| emptyDir | First created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node. When a Pod is removed from a node for any reason, the data in the emptyDir is deleted forever. |
| fc (fibre channel) | Allows an existing fibre channel volume to be mounted in a pod. |
| flocker | Allows a Flocker dataset to be mounted into a pod. |
| gcePersistentDisk | Mounts a Google Compute Engine (GCE) Persistent Disk into your pod. |
| gitRepo | Mounts an empty directory and clones a git repository into it for your pod to use. |
| glusterfs | Allows a Glusterfs (an open source networked filesystem) volume to be mounted into your pod. |
| hostPath | Mounts a file or directory from the host node's filesystem into your pod. |
| iscsi | Allows an existing iSCSI (SCSI over IP) volume to be mounted into your pod. |
| local | Represents a mounted local storage device such as a disk, partition or directory. Can only be used as a statically created PersistentVolume. |
| nfs | Allows an existing NFS (Network File System) share to be mounted into your pod. |
| persistentVolumeClaim | Used to mount a PersistentVolume into a pod. |
| projected | Maps several existing volume sources into the same directory. |
| portworxVolume | Can be dynamically created through Kubernetes or it can also be pre-provisioned and referenced inside a Kubernetes pod. |
| quobyte | Allows an existing Quobyte volume to be mounted into your pod. |
| rbd | Allows a Rados Block Device volume to be mounted into your pod. |
| scaleIO | ScaleIO is a software-based storage platform that can use existing hardware to create clusters of scalable shared block networked storage. The ScaleIO volume plugin allows deployed pods to access existing ScaleIO volumes. |
| secret | Used to pass sensitive information, such as passwords, to pods. |
| storageos | Allows an existing StorageOS volume to be mounted into your pod. StorageOS provides block storage to containers, accessible via a file system. |
| vsphereVolume | Used to mount a vSphere VMDK Volume into your Pod. |

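K8S Storage Capabilities: PersistentVolume

Through the PersistentVolume subsystem API, Kubernetes gives administrators and users an abstraction for creating and consuming storage resources.

  • FlexVolume: this volume driver lets vendors develop their own drivers for mounting volumes on compute nodes
  • PersistentVolumeClaim: a resource-abstraction volume driver provided by K8S, so that users do not need to care about the implementation details of the underlying Volume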
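
The abstraction can be illustrated with a deliberately simplified binding sketch: a claim states only a size and access modes, and the system picks a matching volume. The PV and PVC classes and the bind() helper are hypothetical names for this example. The real K8S controller also weighs storage classes, selectors, and volume modes, none of which are modelled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PV:
    name: str
    capacity_gi: int
    access_modes: List[str]
    claimed_by: Optional[str] = None


@dataclass
class PVC:
    name: str
    request_gi: int
    access_modes: List[str]


def bind(claim, volumes):
    """Bind the claim to the smallest unclaimed volume that satisfies it, or return None."""
    candidates = [
        pv
        for pv in volumes
        if pv.claimed_by is None
        and pv.capacity_gi >= claim.request_gi
        and set(claim.access_modes) <= set(pv.access_modes)
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda pv: pv.capacity_gi)
    best.claimed_by = claim.name
    return best
```

The user who writes the claim never names a storage backend; which volume ends up behind the claim is decided on the cluster side.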

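K8S FlexVolume Storage Extension Mechanism

An out-of-tree volume plugin introduced in K8S 1.5 and GA in 1.8:

  • The driver implements the FlexVolume API as a command-line binary that the Controller-Manager and Kubelet invoke, which makes the external interface easy to implement;
  • Deploying it as a DaemonSet ensures the driver is installed into the plugin directory on both Master and Node machines;
  • It is delivered as a Docker image plus YAML configuration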

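FlexVolume Driver Deployment Script and Configuration

The deployment script assumes that the required driver binary has already been placed at /$DRIVER in the deployment image. The startup script copies the driver binary into <plugindir>/<vendor~driver>/ under the temporary name .driver, then uses mv to rename it to driver (so that the installation is atomic), and finally enters an infinite loop to keep the container alive.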
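
The copy-then-rename step can be sketched as follows. This is an illustrative Python version of the idea rather than the actual deployment script; install_driver and its parameters are names chosen for this example.

```python
import os
import shutil
import stat


def install_driver(src, plugin_dir, vendor, driver):
    """Install `src` as <plugin_dir>/<vendor>~<driver>/<driver> via a hidden temporary name."""
    target_dir = os.path.join(plugin_dir, f"{vendor}~{driver}")
    os.makedirs(target_dir, exist_ok=True)
    tmp = os.path.join(target_dir, "." + driver)
    final = os.path.join(target_dir, driver)
    # The copy may take time, so it goes to the hidden name first.
    shutil.copy(src, tmp)
    os.chmod(tmp, os.stat(tmp).st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    # A rename within one directory is atomic, so a half-copied file is never
    # visible under the final name.
    os.replace(tmp, final)
    return final
```

The rename matters because the Kubelet may look into the plugin directory at any time, so the driver must appear there fully written or not at all.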

FlexVolume CLI API

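| Step | Command | Description |
|---|---|---|
| Init | <driver executable> init | Initializes the driver. Called when the Kubelet and the Controller-Manager initialize. On success it must return a map describing which FlexVolume capabilities the driver supports. The map currently contains a single required field, attach, which states whether the driver needs attach and detach operations. For backward compatibility this field generally defaults to true. |
| Attach | <driver executable> attach <json options> <node name> | Attaches the volume described by the given spec to the given host. On success it returns the path at which the storage device was attached on that host. Called by both the Kubelet and the Controller-Manager. |
| Detach | <driver executable> detach <mount device> <node name> | Detaches the given volume from the given host. Called by both the Kubelet and the Controller-Manager. |
| Wait for attach | <driver executable> waitforattach <mount device> <json options> | Waits for the volume to be attached to the remote node. On success it returns the device path. Called by both the Kubelet and the Controller-Manager. |
| Volume is Attached | <driver executable> isattached <json options> <node name> | Checks whether the volume is attached to the node. Called by both the Kubelet and the Controller-Manager. |
| Mount device | <driver executable> mountdevice <mount dir> <mount device> <json options> | Mounts the storage device on a global path that pods will then use. Called by the Kubelet. |
| Unmount device | <driver executable> unmountdevice <mount device> | Unmounts the storage device. This is called once all bind mounts have been unmounted. Called by the Kubelet. |
| Mount | <driver executable> mount <mount dir> <json options> | Mounts the volume at the given directory. Called by the Kubelet. |
| Unmount | <driver executable> unmount <mount dir> | Unmounts the volume. Called by the Kubelet. |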
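
A driver that follows this CLI contract can be very small. The sketch below is a hypothetical driver that supports only init, mount, and unmount and reports everything else as unsupported. The JSON reply shape (status, message, capabilities) follows the FlexVolume documentation as I understand it; the share option is invented for the example, and no real mount is performed.

```python
import json
import sys


def run(argv):
    """Return the JSON-serializable reply for one FlexVolume call; argv[0] is the operation."""
    op = argv[0] if argv else ""
    if op == "init":
        # attach=False tells K8S this driver needs no attach/detach calls.
        return {"status": "Success", "capabilities": {"attach": False}}
    if op == "mount":
        mount_dir, options = argv[1], json.loads(argv[2])
        # A real driver would mount the backend storage at mount_dir here.
        # "share" is a made-up option name for this sketch.
        share = options.get("share", "?")
        return {"status": "Success", "message": "mounted %s at %s" % (share, mount_dir)}
    if op == "unmount":
        # A real driver would unmount argv[1] here.
        return {"status": "Success"}
    return {"status": "Not supported"}


if __name__ == "__main__":
    # The caller reads the reply as JSON on standard output.
    print(json.dumps(run(sys.argv[1:])))
```

Saved as an executable named after the driver, this would be invoked as, for example, `<driver executable> mount /some/dir '{"share": "demo"}'`.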

K8S CSI Storage Extension Mechanism

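| Term | Meaning |
|---|---|
| CO | Container Orchestrator; communicates with plugins through the CSI gRPC services |
| RPC | Remote Procedure Call |
| Plugin | A plugin implementation: the gRPC endpoint that implements the CSI services |
| SP | Storage Provider; the vendor that supplies a CSI plugin implementation |
| Volume | A unit of storage that CO-managed containers can use |
| Block Volume | A volume exposed as a block device |
| Mounted Volume | A volume mounted into a container with a specified file system, appearing as a directory inside the container |
| Workload | The basic unit of task scheduling in a CO; can be one container or a group of containers |
| Node | A host on which the user runs workloads; from the plugin's point of view it is uniquely identified by a node ID |
| In-Tree | Built in; code that lives inside the core K8S code repository |
| Out-Of-Tree | External; code that lives outside the core K8S code repository |
| CSI Volume Plugin | A new in-tree volume plugin that acts as an adapter so that out-of-tree third-party CSI volume drivers can be used by K8S |
| CSI Volume Driver | An out-of-tree, CSI-compatible volume plugin driver that K8S can use through the K8S volume plugin |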

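CSI General Architecture

The CO interacts with plugins over gRPC. Each SP must implement the following two plugins:
• Node Plugin: must run on the Node where the Volume is used; mainly responsible for Volume Mount/Unmount and similar operations
• Controller Plugin: can run on any node; mainly responsible for Volume Creation/Deletion, Attach/Detach, and similar operations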
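
The split of responsibilities can be written down as a small routing table. The operation labels below are the informal ones used in the two bullets, not CSI RPC names, and target_plugin is a hypothetical helper for illustration.

```python
# Which plugin handles which operation, per the two bullets above.
PLUGIN_FOR_OPERATION = {
    "Creation": "controller",
    "Deletion": "controller",
    "Attach": "controller",
    "Detach": "controller",
    "Mount": "node",
    "Unmount": "node",
}


def target_plugin(operation, workload_node):
    """Return (plugin role, node it must run on); None means any node will do."""
    role = PLUGIN_FOR_OPERATION[operation]
    if role == "node":
        # Node operations must run on the node that uses the volume.
        return ("node", workload_node)
    return ("controller", None)
```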

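CO-Plugin Interaction: 01. Volume Creation and Attach

• A volume's Mount operations are triggered when a Workload starts
• Example format of a volume's global mount path (the storage mount point) in K8S:
/var/lib/kubelet/plugins/kubernetes.io/$volume_plugin/mounts/$volume_name
• Example format of a volume's workload mount path (a bind mount of the global path) in K8S:
/var/lib/kubelet/pods/$pod_id/volumes/$volume_plugin/$volume_name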
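
The two path formats can be expressed as small helper functions. The function names are made up for this sketch, and the placeholders are filled in exactly as given in the formats above, without any escaping the Kubelet itself may apply to plugin names.

```python
import posixpath

KUBELET_ROOT = "/var/lib/kubelet"


def global_mount_path(volume_plugin, volume_name):
    """Node-wide path where the storage device is mounted once."""
    return posixpath.join(
        KUBELET_ROOT, "plugins", "kubernetes.io", volume_plugin, "mounts", volume_name
    )


def pod_volume_path(pod_id, volume_plugin, volume_name):
    """Per-pod path through which one workload reaches the volume."""
    return posixpath.join(
        KUBELET_ROOT, "pods", pod_id, "volumes", volume_plugin, volume_name
    )
```

One global path can back several pod paths, which is why the device is mounted once per node and then exposed to each pod separately.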

Volume Lifecycle

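RPC Interface Sets: Identity

The CSI specification defines three sets of RPCs:
• Identity Service: the RPC set that both the Node Plugin and the Controller Plugin must implement
• Controller Service: the RPC set that the Controller Plugin must implement
• Node Service: the RPC set that the Node Plugin must implement
Identity Service RPC: the identity service RPCs let the CO query a plugin's capabilities, health, and other metadata.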
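
The sketch below restates these rules in code. ToyIdentity is a plain-Python stand-in with no gRPC involved; its three methods mirror the Identity RPCs named in the CSI specification (GetPluginInfo, GetPluginCapabilities, Probe), while the field names and the missing_services helper are simplified assumptions for illustration.

```python
# Which CSI services each kind of plugin must serve.
REQUIRED_SERVICES = {
    "node": {"Identity", "Node"},
    "controller": {"Identity", "Controller"},
}


class ToyIdentity:
    """Stand-in for the Identity service; a real plugin serves these calls over gRPC."""

    def __init__(self, name, vendor_version, capabilities, ready=True):
        self.name = name
        self.vendor_version = vendor_version
        self.capabilities = set(capabilities)
        self.ready = ready

    def get_plugin_info(self):
        # Metadata query (GetPluginInfo in the CSI spec).
        return {"name": self.name, "vendor_version": self.vendor_version}

    def get_plugin_capabilities(self):
        # Capability query (GetPluginCapabilities in the CSI spec).
        return {"capabilities": sorted(self.capabilities)}

    def probe(self):
        # Health query (Probe in the CSI spec).
        return {"ready": self.ready}


def missing_services(plugin_kind, served):
    """Return the required services that a plugin of this kind does not serve."""
    return REQUIRED_SERVICES[plugin_kind] - set(served)
```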

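RPC Interface Sets: Controller

Controller Service RPC: the controller service RPCs provide volume creation, deletion, attach, detach, and query, as well as volume snapshot creation, deletion, and query.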
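
A toy in-memory controller shows the shape of the volume operations. It is a sketch under stated assumptions: the class and method names are invented, no gRPC or real storage is involved, and snapshots are omitted. The idempotent behaviour (repeating a create with the same name returns the same volume; deleting an unknown volume succeeds) follows what the CSI specification asks of CreateVolume and DeleteVolume.

```python
import uuid


class ToyController:
    def __init__(self):
        self.volumes = {}       # volume name -> volume record
        self.published = set()  # (volume_id, node_id) pairs, i.e. attachments

    def create_volume(self, name, capacity_bytes):
        vol = self.volumes.get(name)
        if vol is None:
            vol = {"volume_id": str(uuid.uuid4()), "capacity_bytes": capacity_bytes}
            self.volumes[name] = vol
        elif vol["capacity_bytes"] != capacity_bytes:
            # Same name, incompatible request.
            raise ValueError("ALREADY_EXISTS: %s has a different capacity" % name)
        return vol

    def delete_volume(self, volume_id):
        # Deleting an unknown volume is treated as success.
        for name, vol in list(self.volumes.items()):
            if vol["volume_id"] == volume_id:
                del self.volumes[name]

    def publish(self, volume_id, node_id):
        """Attach the volume to a node."""
        self.published.add((volume_id, node_id))

    def unpublish(self, volume_id, node_id):
        """Detach the volume from a node."""
        self.published.discard((volume_id, node_id))

    def list_volumes(self):
        return sorted(self.volumes)
```

Idempotency matters because the CO may retry a call after a timeout without knowing whether the first attempt succeeded.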

RPC Interface Sets: Node

K8S CSI Architecture

K8S 1.9 implemented the alpha version of the CSI plugin, and by 1.11 it had been promoted to Beta.

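To deploy a containerized third-party CSI volume driver, the storage provider needs to do the following:
1. Create a "CSI volume driver" container that implements the plugin functionality described by the CSI specification and exposes its gRPC interface through a Unix socket;
2. Deploy the CSI volume driver together with the helper containers provided by the K8S team, which requires creating the following two kinds of K8S objects:

  • 1) StatefulSet: used to interact with the K8S controllers; one replica; contains three containers (CSI volume driver, external-attacher, external-provisioner); needs to mount an emptyDir volume at /var/lib/csi/sockets/pluginproxy/
  • 2) DaemonSet: contains two containers (CSI volume driver, K8S CSI Helper) and mounts three hostPath volumes

3. The cluster administrator deploys the StatefulSet and DaemonSet above into the K8S cluster for the storage system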
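
The StatefulSet described in step 2 could be assembled roughly as follows, again as a Python dictionary rather than YAML. The object name, labels, volume name, and image strings are placeholders chosen for this sketch; only the replica count, the three container roles, and the emptyDir mount path come from the description above.

```python
SOCKET_DIR = "/var/lib/csi/sockets/pluginproxy/"


def controller_statefulset(driver_image, attacher_image, provisioner_image):
    """Build a StatefulSet manifest with the three containers sharing one socket directory."""
    socket_mount = {"name": "socket-dir", "mountPath": SOCKET_DIR}
    containers = [
        {"name": name, "image": image, "volumeMounts": [dict(socket_mount)]}
        for name, image in [
            ("csi-volume-driver", driver_image),
            ("external-attacher", attacher_image),
            ("external-provisioner", provisioner_image),
        ]
    ]
    labels = {"app": "csi-controller"}  # placeholder label
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": "csi-controller"},
        "spec": {
            "serviceName": "csi-controller",
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": containers,
                    # The emptyDir holds the Unix socket shared by the containers.
                    "volumes": [{"name": "socket-dir", "emptyDir": {}}],
                },
            },
        },
    }
```

Sharing the emptyDir is what lets the helper containers reach the driver's Unix socket without any network exposure.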
