Outline
- K8S storage: overall framework and principles
- K8S FlexVolume storage extension mechanism
- K8S CSI storage extension mechanism
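K8S storage: overall framework and principles
Docker plugin mechanism: architecture and evaluation
Advantages:
- 1) Authorization, network, volume, and other features can be extended without recompiling Docker;
- 2) Docker talks to plugins through an HTTP-based, JSON-RPC-style interface;
- 3) Plugins can be deployed in several forms, both containerized and non-containerized;
- 4) Plugin lifecycle management is supported: 1.52+ uses the docker plugin command and its API; <1.52 uses docker volume/network;
- 5) TLS-based security hardening is supported.
Constraints:
- Adding certain plugins (for example, authorization plugins) requires restarting the Docker daemon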
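As a rough illustration of the HTTP/JSON exchange described above, the sketch below models how a volume plugin might answer the daemon's requests. The endpoint names (`/Plugin.Activate`, `/VolumeDriver.Create`, `/VolumeDriver.Mount`) follow Docker's volume plugin protocol; the in-memory `VOLUMES` store, the `handle` function, and the `/mnt/demo` root are assumptions made for this example, and no real HTTP server or mount is involved.

```python
import json

# Hypothetical in-memory state; a real plugin would talk to a storage backend.
VOLUMES = {}
MOUNT_ROOT = "/mnt/demo"  # assumed mount root, for illustration only


def handle(path, body):
    """Dispatch one plugin request (JSON text in, JSON text out)."""
    req = json.loads(body) if body else {}
    if path == "/Plugin.Activate":
        # Handshake: the plugin declares which plugin types it implements.
        resp = {"Implements": ["VolumeDriver"]}
    elif path == "/VolumeDriver.Create":
        VOLUMES[req["Name"]] = req.get("Opts") or {}
        resp = {"Err": ""}
    elif path == "/VolumeDriver.Mount":
        if req["Name"] in VOLUMES:
            resp = {"Mountpoint": f"{MOUNT_ROOT}/{req['Name']}", "Err": ""}
        else:
            resp = {"Err": "no such volume"}
    else:
        resp = {"Err": f"unsupported endpoint {path}"}
    return json.dumps(resp)
```

In a deployed plugin these handlers would sit behind an HTTP listener (typically a Unix socket) that the daemon discovers; the dispatch logic would stay much the same.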
Docker Volume Plugin list
Name | Description | URL |
Azure File Storage plugin | Lets you mount Microsoft Azure File Storage shares to Docker containers as volumes using the SMB 3.0 protocol. | https://github.com/Azure/azurefile-dockervolumedriver |
BeeGFS Volume Plugin | An open source volume plugin to create persistent volumes in a BeeGFS parallel file system. | https://github.com/RedCoolBeans/docker-volume-beegfs |
Blockbridge plugin | A volume plugin that provides access to an extensible set of container-based persistent storage options. It supports single and multi-host Docker environments with features that include tenant isolation, automated provisioning, encryption, secure deletion, snapshots and QoS. | https://github.com/blockbridge/blockbridge-docker-volume |
Contiv Volume Plugin | An open source volume plugin that provides multi-tenant, persistent, distributed storage with intent based consumption. It has support for Ceph and NFS. | https://github.com/rancher/convoy |
DigitalOcean Block Storage plugin | Integrates DigitalOcean’s block storage solution into the Docker ecosystem by automatically attaching a given block storage volume to a DigitalOcean droplet and making the contents of the volume available to Docker containers running on that droplet. | https://github.com/omallo/docker-volume-plugin-dostorage |
DRBD plugin | A volume plugin that provides highly available storage replicated by DRBD. Data written to the docker volume is replicated in a cluster of DRBD nodes. | https://www.drbd.org/en/supported-projects/docker |
Flocker plugin | A volume plugin that provides multi-host portable volumes for Docker, enabling you to run databases and other stateful containers and move them around across a cluster of machines. | https://clusterhq.com/docker-plugin/ |
Fuxi Volume Plugin | A volume plugin that is developed as part of the OpenStack Kuryr project and implements the Docker volume plugin API by utilizing Cinder, the OpenStack block storage service. | https://github.com/openstack/fuxi |
gce-docker plugin | A volume plugin able to attach, format and mount Google Compute persistent-disks. | https://github.com/mcuadros/gce-docker |
GlusterFS plugin | A volume plugin that provides multi-host volumes management for Docker using GlusterFS. | https://github.com/calavera/docker-volume-glusterfs |
Horcrux Volume Plugin | A volume plugin that allows on-demand, version controlled access to your data. Horcrux is an open-source plugin, written in Go, and supports SCP, Minio and Amazon S3. | https://github.com/muthu-r/horcrux |
HPE 3Par Volume Plugin | A volume plugin that supports HPE 3Par and StoreVirtual iSCSI storage arrays. | https://github.com/hpe-storage/python-hpedockerplugin/ |
Infinit volume plugin | A volume plugin that makes it easy to mount and manage Infinit volumes using Docker. | https://infinit.sh/documentation/docker/volume-plugin |
IPFS Volume Plugin | An open source volume plugin that allows using an ipfs filesystem as a volume. | http://github.com/vdemeester/docker-volume-ipfs |
Keywhiz plugin | A plugin that provides credentials and secret management using Keywhiz as a central repository. | https://github.com/calavera/docker-volume-keywhiz |
Local Persist Plugin | A volume plugin that extends the default local driver’s functionality by allowing you specify a mountpoint anywhere on the host, which enables the files to always persist, even if the volume is removed via docker volume rm. | https://github.com/CWSpear/local-persist |
NetApp Plugin (nDVP) | A volume plugin that provides direct integration with the Docker ecosystem for the NetApp storage portfolio. The nDVP package supports the provisioning and management of storage resources from the storage platform to Docker hosts, with a robust framework for adding additional platforms in the future. | https://github.com/NetApp/netappdvp |
Netshare plugin | A volume plugin that provides volume management for NFS 3/4, AWS EFS and CIFS file systems. | https://github.com/ContainX/docker-volume-netshare |
Nimble Storage Volume Plugin | A volume plug-in that integrates with Nimble Storage Unified Flash Fabric arrays. The plug-in abstracts array volume capabilities to the Docker administrator to allow self-provisioning of secure multi-tenant volumes and clones. | https://connect.nimblestorage.com/community/app-integration/docker |
OpenStorage Plugin | A cluster-aware volume plugin that provides volume management for file and block storage solutions. It implements a vendor neutral specification for implementing extensions such as CoS, encryption, and snapshots. It has example drivers based on FUSE, NFS, NBD and EBS to name a few. | https://github.com/libopenstorage/openstorage |
Portworx Volume Plugin | A volume plugin that turns any server into a scale-out converged compute/storage node, providing container granular storage and highly available volumes across any node, using a shared-nothing storage backend that works with any docker scheduler. | https://github.com/portworx/px-dev |
Quobyte Volume Plugin | A volume plugin that connects Docker to Quobyte’s data center file system, a general-purpose scalable and fault-tolerant storage platform. | https://github.com/quobyte/docker-volume |
REX-Ray plugin | A volume plugin which is written in Go and provides advanced storage functionality for many platforms including VirtualBox, EC2, Google Compute Engine, OpenStack, and EMC. | https://github.com/emccode/rexray |
Virtuozzo Storage and Ploop plugin | A volume plugin with support for Virtuozzo Storage distributed cloud file system as well as ploop devices. | https://github.com/virtuozzo/docker-volume-ploop |
VMware vSphere Storage Plugin | Docker Volume Driver for vSphere enables customers to address persistent storage requirements for Docker containers in vSphere environments. | https://github.com/vmware/docker-volume-vsphere |
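Platforms and storage services that support Kubernetes
K8S storage capabilities: Volume overview
- A regular Volume in K8S lets containers mount a volume. It is not a standalone K8S resource object and cannot be managed (created, deleted, and so on) through K8S; it can only be referenced when a Pod is created.
- A Pod can use a Volume only after two pieces of information are set: the volume source (spec.volumes) and the mount point (spec.containers[].volumeMounts)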
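To make the two required fields concrete, here is a minimal sketch of a Pod manifest written as a Python dict (the same shape as the YAML form), together with a small check that every mount refers to a declared volume. The names `demo-pod`, `app`, `cache`, the `busybox` image, and the helper `undeclared_mounts` are illustrative assumptions, not part of the original material.

```python
# A Pod manifest as a Python dict; names and image are illustrative only.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-pod"},
    "spec": {
        # Volume source: declared once at Pod level.
        "volumes": [{"name": "cache", "emptyDir": {}}],
        "containers": [{
            "name": "app",
            "image": "busybox",
            # Mount point: each container chooses where the volume appears.
            "volumeMounts": [{"name": "cache", "mountPath": "/cache"}],
        }],
    },
}


def undeclared_mounts(pod):
    """Return volumeMount names that have no matching entry in spec.volumes."""
    declared = {v["name"] for v in pod["spec"].get("volumes", [])}
    return [
        m["name"]
        for c in pod["spec"]["containers"]
        for m in c.get("volumeMounts", [])
        if m["name"] not in declared
    ]
```

The two lists are linked only by the volume name, so a mount that names a volume missing from `spec.volumes` is the kind of mistake this helper would surface.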
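K8S storage capabilities: In-Tree Volume Plugins
The K8S VolumePlugin provides a pluggable mechanism for extending storage, split into two kinds: built-in plugins (In-Tree Plugins) and external plugins (Out-of-Tree)
Name | Description |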
awsElasticBlockStore | mounts an Amazon Web Services (AWS) EBS Volume (Elastic Block Store) |
azureDisk | is used to mount a Microsoft Azure Data Disk into a Pod. |
azureFile | is used to mount a Microsoft Azure File Volume (SMB 2.1 and 3.0) into a Pod. |
cephfs | allows an existing CephFS volume to be mounted into your pod. |
cinder | is used to mount OpenStack Block Storage into a pod. |
configMap | The data stored in a ConfigMap object can be referenced in a volume of type configMap and then consumed by containerized applications running in a Pod. |
downwardAPI | is used to make downward API data available to applications. It mounts a directory and writes the requested data in plain text files |
emptyDir | is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node. When a Pod is removed from a node for any reason, the data in the emptyDir is deleted forever. |
fc (fibre channel) | allows an existing fibre channel volume to be mounted in a pod |
flocker | allows a Flocker dataset to be mounted into a pod. |
gcePersistentDisk | mounts a Google Compute Engine (GCE) Persistent Disk into your pod. |
gitRepo | mounts an empty directory and clones a git repository into it for your pod to use. |
glusterfs | allows a Glusterfs (an open source networked filesystem) volume to be mounted into your pod |
hostPath | mounts a file or directory from the host node’s filesystem into your pod. |
iscsi | allows an existing iSCSI (SCSI over IP) volume to be mounted into your pod |
local | represents a mounted local storage device such as a disk, partition or directory. It can only be used as a statically created PersistentVolume. |
nfs | allows an existing NFS (Network File System) share to be mounted into your pod |
persistentVolumeClaim | is used to mount a PersistentVolume into a pod. |
projected | maps several existing volume sources into the same directory. |
portworxVolume | can be dynamically created through Kubernetes or it can also be pre-provisioned and referenced inside a Kubernetes pod. |
quobyte | allows an existing Quobyte volume to be mounted into your pod. |
rbd | allows a Rados Block Device volume to be mounted into your pod. |
scaleIO | ScaleIO is a software-based storage platform that can use existing hardware to create clusters of scalable shared block networked storage. The ScaleIO volume plugin allows deployed pods to access existing ScaleIO volumes |
secret | is used to pass sensitive information, such as passwords, to pods |
storageos | allows an existing StorageOS volume to be mounted into your pod. StorageOS provides block storage to containers, accessible via a file system. |
vsphereVolume | used to mount a vSphere VMDK Volume into your Pod. |
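K8S storage capabilities: PersistentVolume
Through the PersistentVolume subsystem API, Kubernetes gives administrators and users an abstraction for creating and consuming storage resources
- FlexVolume: this Volume Driver lets vendors develop their own drivers to mount volumes on compute nodes
- PersistentVolumeClaim: a resource-abstraction Volume Driver provided by K8S, so users do not need to care about how a specific Volume is implemented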
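The split between the two can be sketched as follows: an administrator describes a PersistentVolume backed by a FlexVolume driver, a user writes a PersistentVolumeClaim that only states size and access mode, and a Pod refers to the claim by name. The driver name `example.com/demo`, the object names, and the `can_bind` helper are assumptions for illustration; the real binder also considers storage classes, selectors, and volume modes, which this simplified check ignores.

```python
def gi(quantity):
    """Parse a quantity such as '10Gi' into an int; only the Gi suffix is handled."""
    if not quantity.endswith("Gi"):
        raise ValueError(f"unsupported quantity: {quantity}")
    return int(quantity[:-2])


# Administrator side: a volume backed by a (hypothetical) FlexVolume driver.
pv = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "pv-demo"},
    "spec": {
        "capacity": {"storage": "10Gi"},
        "accessModes": ["ReadWriteOnce"],
        "flexVolume": {"driver": "example.com/demo", "fsType": "ext4"},
    },
}

# User side: only the requirements, no backend details.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "claim-demo"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "5Gi"}},
    },
}

# Pod side: the volume source is just a reference to the claim.
pod_volume = {"name": "data", "persistentVolumeClaim": {"claimName": "claim-demo"}}


def can_bind(pv, pvc):
    """Simplified check: enough capacity and all requested access modes offered."""
    requested = gi(pvc["spec"]["resources"]["requests"]["storage"])
    offered = gi(pv["spec"]["capacity"]["storage"])
    modes_ok = set(pvc["spec"]["accessModes"]) <= set(pv["spec"]["accessModes"])
    return offered >= requested and modes_ok
```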
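K8S FlexVolume storage extension mechanism
An Out-Of-Tree Volume Plugin introduced in K8S 1.5 and GA in 1.8:
- The driver implements the FlexVolume API as a command-line binary that Controller-Manager and Kubelet invoke, so the external interface is easy to implement;
- Deploying with a DaemonSet ensures the driver is installed into the plugin directory on both Master and Node machines;
- Delivered as a Docker image plus YAML configuration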
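Flex Volume Driver deployment script and configuration
The deployment script assumes that the required driver binary has already been placed in the /$DRIVER directory of the deployment image. The startup script renames the binary driver to .driver, copies it to <plugindir>/<vendor~driver>/.driver, then uses mv to rename it to driver (which makes the installation atomic), and finally enters an infinite loop to keep the container alive.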
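The copy-then-rename step can be sketched in Python as below. This is a stand-in for the shell deployment script, not the script itself; the function name `install_driver` and the `example.com`/`demo` vendor and driver names are assumptions. The point it illustrates is that the slow copy targets a hidden name, and only the final rename makes the driver visible under its real name.

```python
import os
import shutil
import tempfile


def install_driver(src, plugin_dir, vendor, driver):
    """Copy src into <plugin_dir>/<vendor~driver>/ and publish it atomically."""
    target_dir = os.path.join(plugin_dir, f"{vendor}~{driver}")
    os.makedirs(target_dir, exist_ok=True)
    hidden = os.path.join(target_dir, f".{driver}")
    final = os.path.join(target_dir, driver)
    shutil.copy(src, hidden)    # slow step; a reader could see a partial file
    os.chmod(hidden, 0o755)     # the driver must be executable
    os.replace(hidden, final)   # rename within one directory is atomic on POSIX
    return final


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "driver-binary")
    with open(src, "w") as f:
        f.write("#!/bin/sh\n")
    installed = install_driver(src, os.path.join(tmp, "plugins"), "example.com", "demo")
    listing = os.listdir(os.path.dirname(installed))
```

Because the rename either has happened or has not, a Kubelet scanning the plugin directory would never pick up a half-copied file under the final name.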
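Flex Volume CLI API
Step | Command | Description |
Init | <driver executable> init | Initializes the driver. Called when Kubelet and Controller-Manager initialize. On success it must return a map describing the FlexVolume capabilities the driver supports. The map currently contains a single required field, attach, which indicates whether the driver needs attach and detach operations. For backward compatibility this field generally defaults to true. |
Attach | <driver executable> attach <json options> <node name> | Attaches the volume described by the given spec to the given host. On success, returns the path of the storage device attached to that host. Called by both Kubelet and Controller-Manager. |
Detach | <driver executable> detach <mount device> <node name> | Detaches the specified volume from the given host. Called by both Kubelet and Controller-Manager. |
Wait for attach | <driver executable> waitforattach <mount device> <json options> | Waits for the volume to be attached to the remote node. On success, returns the device path. Called by both Kubelet and Controller-Manager. |
Volume is Attached | <driver executable> isattached <json options> <node name> | Checks whether the volume is attached to the node. Called by both Kubelet and Controller-Manager. |
Mount device | <driver executable> mountdevice <mount dir> <mount device> <json options> | Mounts the storage device on a global path that pods will then use. Called by Kubelet. |
Unmount device | <driver executable> unmountdevice <mount device> | Unmounts the storage device. This is called once all bind mounts have been unmounted. Called by Kubelet. |
Mount | <driver executable> mount <mount dir> <json options> | Mounts the volume at the specified directory. Called by Kubelet. |
Unmount | <driver executable> unmount <mount dir> | Unmounts the volume. Called by Kubelet. |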
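A minimal sketch of a driver that answers a few of the calls in the table might look like the following. It assumes the usual FlexVolume convention of printing a JSON object with a `status` field (`Success`, `Failure`, or `Not supported`); the function name `run_driver` is hypothetical, the exit codes are an assumption of this sketch, and the mount itself is left as a comment rather than performed.

```python
import json


def run_driver(argv):
    """Handle one FlexVolume call; argv excludes the program name.

    Returns (exit_code, json_text) so the logic can be exercised without a shell.
    """
    if not argv:
        return 1, json.dumps({"status": "Failure", "message": "missing operation"})
    op, args = argv[0], argv[1:]
    if op == "init":
        # attach=False: this sketch has no attach/detach phase.
        return 0, json.dumps({"status": "Success", "capabilities": {"attach": False}})
    if op == "mount" and len(args) == 2:
        mount_dir = args[0]
        json.loads(args[1])  # the options argument must be valid JSON
        # A real driver would mount the backing storage at mount_dir here.
        return 0, json.dumps({"status": "Success", "message": f"mounted at {mount_dir}"})
    if op == "unmount" and len(args) == 1:
        # A real driver would unmount args[0] here.
        return 0, json.dumps({"status": "Success"})
    return 1, json.dumps({"status": "Not supported", "message": f"unsupported: {op}"})
```

An executable wrapper would call `run_driver(sys.argv[1:])`, print the JSON text, and exit with the returned code. Reporting `attach` as false tells the callers to skip the attach, detach, waitforattach, and isattached steps.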
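K8S CSI storage extension mechanism
Term | Meaning |
CO | Container Orchestrator; communicates with plugins through CSI gRPC services |
RPC | Remote Procedure Call |
Plugin | Plugin implementation; the gRPC endpoint that implements the CSI services |
SP | Storage Provider; responsible for supplying the CSI plugin implementation |
Volume | A unit of storage that containers managed by the CO can use |
Block Volume | A volume exposed as a block device |
Mounted Volume | A volume mounted into the container with a specified file system, appearing as a directory inside the container |
Workload | The basic unit of task scheduling in a CO; may be one container or a group of containers |
Node | A host on which user workloads run; from the plugin's perspective it is uniquely identified by a node ID |
In-Tree | Built-in; code that lives inside the K8S core code repository |
Out-Of-Tree | External; code that lives outside the K8S core code repository |
CSI Volume Plugin | A new in-tree volume plugin that acts as an adapter so that external third-party CSI volume drivers can be used by K8S |
CSI Volume Driver | An out-of-tree, CSI-compatible volume plugin driver that K8S can use through the K8S volume plugin |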
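CSI general architecture
The CO interacts with plugins over gRPC. Every SP must implement the following two plugins:
• Node Plugin: must run on the Node that uses the Volume; mainly responsible for operations such as Volume Mount/Unmount
• Controller Plugin: can run on any node; mainly responsible for operations such as Volume Creation/Deletion and Attach/Detach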
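CO-Plugin interaction: 01. Volume creation and Attach
• The Mount series of volume operations is triggered by Workload startup
• Example format of a volume's global mount path in K8S (the storage mount point):
/var/lib/kubelet/plugins/kubernetes.io/$volume_plugin/mounts/$volume_name
• Example format of a volume's workload mount path in K8S (a soft link):
/var/lib/kubelet/pods/$pod_id/volumes/$volume_plugin/$volume_name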
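The two path templates above can be filled in with a couple of small helpers. This is only a formatting sketch: the function names are hypothetical, and the plugin, volume, and pod identifiers passed in are whatever the caller supplies.

```python
import posixpath

KUBELET_ROOT = "/var/lib/kubelet"


def global_mount_path(volume_plugin, volume_name):
    """Global (per-node) mount path for a volume, following the first template."""
    return posixpath.join(
        KUBELET_ROOT, "plugins", "kubernetes.io", volume_plugin, "mounts", volume_name
    )


def pod_volume_path(pod_id, volume_plugin, volume_name):
    """Per-workload path for the same volume, following the second template."""
    return posixpath.join(
        KUBELET_ROOT, "pods", pod_id, "volumes", volume_plugin, volume_name
    )
```

For example, `global_mount_path("rbd", "vol1")` yields `/var/lib/kubelet/plugins/kubernetes.io/rbd/mounts/vol1`, and several pods on the same node would each get their own `pod_volume_path` pointing back at that one global location.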
Volume lifecycle
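One way to picture the lifecycle is as a small state machine driven by CSI RPCs. The sketch below assumes the variant of the CSI specification's lifecycle that includes the stage/unstage step, with states named CREATED, NODE_READY, VOL_READY, and PUBLISHED; the table and the `step` function are illustrative and not taken from the original slides.

```python
# (current state, RPC) -> next state; None means the volume does not exist.
TRANSITIONS = {
    (None, "CreateVolume"): "CREATED",
    ("CREATED", "ControllerPublishVolume"): "NODE_READY",
    ("NODE_READY", "NodeStageVolume"): "VOL_READY",
    ("VOL_READY", "NodePublishVolume"): "PUBLISHED",
    ("PUBLISHED", "NodeUnpublishVolume"): "VOL_READY",
    ("VOL_READY", "NodeUnstageVolume"): "NODE_READY",
    ("NODE_READY", "ControllerUnpublishVolume"): "CREATED",
    ("CREATED", "DeleteVolume"): None,
}


def step(state, rpc):
    """Return the state reached by calling rpc in state, or raise if not allowed."""
    try:
        return TRANSITIONS[(state, rpc)]
    except KeyError:
        raise ValueError(f"{rpc} is not valid in state {state}") from None
```

Walking the table forward and then backward shows that teardown mirrors setup: each Node or Controller call has an inverse that returns the volume to the previous state.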
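RPC interface set: Identity
The CSI specification defines 3 sets of RPCs:
• Identity Service: the RPC set that both the Node Plugin and the Controller Plugin must implement
• Controller Service: the RPC set that the Controller Plugin must implement
• Node Service: the RPC set that the Node Plugin must implement
Identity Service RPC: the identity service RPCs let the CO query a plugin's capabilities, health status, and other metadata.
RPC interface set: Controller
Controller Service RPC: the controller service RPCs provide volume creation, deletion, Attach, Detach, and query functions, as well as volume snapshot creation, deletion, and query functions
RPC interface set: Node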
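The division of the three RPC sets between the two plugins can be summarized in a small lookup. The RPC names below are taken from the CSI specification as commonly published, but the list is not exhaustive and the exact set varies between specification versions, so treat it as an assumption-laden sketch; `PLUGIN_SERVICES` and `rpcs_for` are hypothetical names.

```python
# RPC names per service; not exhaustive and version-dependent.
SERVICES = {
    "Identity": ["GetPluginInfo", "GetPluginCapabilities", "Probe"],
    "Controller": [
        "CreateVolume", "DeleteVolume",
        "ControllerPublishVolume", "ControllerUnpublishVolume",
        "ValidateVolumeCapabilities", "ListVolumes", "GetCapacity",
        "ControllerGetCapabilities",
        "CreateSnapshot", "DeleteSnapshot", "ListSnapshots",
    ],
    "Node": [
        "NodeStageVolume", "NodeUnstageVolume",
        "NodePublishVolume", "NodeUnpublishVolume",
        "NodeGetCapabilities",
    ],
}

# Which services each plugin must serve: Identity is shared by both.
PLUGIN_SERVICES = {
    "controller": ["Identity", "Controller"],
    "node": ["Identity", "Node"],
}


def rpcs_for(plugin_kind):
    """Return every RPC name a plugin of the given kind is expected to serve."""
    return [rpc for svc in PLUGIN_SERVICES[plugin_kind] for rpc in SERVICES[svc]]
```

In this view the Node service carries the mount-related calls (stage, publish, and their inverses), while attach and detach appear on the Controller side as ControllerPublishVolume and ControllerUnpublishVolume.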
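K8S CSI architecture
K8S 1.9 implemented the alpha version of the CSI plugin; as of 1.11 it has been promoted to Beta
To deploy a containerized third-party CSI volume driver, the storage provider needs to do the following:
1. Create a "CSI volume driver" container that implements the plugin functionality described in the CSI specification and exposes its gRPC interface through a Unix socket;
2. Deploy the CSI volume driver together with the helper containers provided by the K8S team, which requires creating the following two kinds of K8S objects:
- 1) StatefulSet: interacts with the K8S controllers; 1 replica; contains 3 containers (CSI volume driver, external-attacher, external-provisioner); must mount an emptyDir volume with the mount point /var/lib/csi/sockets/pluginproxy/
- 2) DaemonSet: contains 2 containers (CSI volume driver, K8S CSI Helper) and mounts 3 hostPath volumes
3. The cluster administrator deploys the above StatefulSet and DaemonSet in the K8S cluster for the storage system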
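The StatefulSet described in step 2 can be sketched as a manifest built in Python. Only the replica count, the three container roles, and the emptyDir mount path come from the text above; the object name, the container names, the volume name `socket-dir`, and the image references are placeholders chosen for this example. The sketch covers only the StatefulSet.

```python
SOCKET_DIR = "/var/lib/csi/sockets/pluginproxy/"
CONTAINER_NAMES = ("csi-volume-driver", "external-attacher", "external-provisioner")


def container(name, image):
    """One container that shares the socket directory with its siblings."""
    return {
        "name": name,
        "image": image,
        "volumeMounts": [{"name": "socket-dir", "mountPath": SOCKET_DIR}],
    }


def csi_controller_statefulset(name, images):
    """Build the StatefulSet manifest; images maps container name to image."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [container(n, images[n]) for n in CONTAINER_NAMES],
                    # emptyDir shared by all three containers for the Unix socket
                    "volumes": [{"name": "socket-dir", "emptyDir": {}}],
                },
            },
        },
    }


# Placeholder images; substitute the real ones for a given driver.
manifest = csi_controller_statefulset(
    "csi-demo", {n: f"example.com/{n}:latest" for n in CONTAINER_NAMES}
)
```

Sharing one emptyDir lets the helper containers reach the driver's Unix socket without exposing it outside the Pod.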