Recently, I was assigned to implement a task management system with an etcd storage backend. Some of its design and implementation are inspired by Kubernetes.

As a Go developer, dealing with Kubernetes may be an everyday routine, and you may already know a bit about it: the basic concepts, the architecture overview, the API conventions and the request processing flow. But when you dig into its source code, you will find that some of its designs are genuinely interesting, and they provide a lot of insight for Go development, especially for NoSQL-based systems as well as microservice development.

This post focuses on the design and implementation of Kubernetes, with links to the corresponding source code.
Table of Contents
Overview
At first glance, the overall Kubernetes project layout shares some common points with the Standard Go Project Layout, which helps improve the scalability of the project and keeps the code from getting messy even with hundreds of developers involved. git clone followed by go mod vendor is a good starting point for navigating the Kubernetes source code.

The following are some coding styles and conventions that appear a lot throughout the project. To some extent, they are good practices.
Project Structure
Apart from kubernetes/kubernetes, there are more than 60 pinned repositories within https://github.com/kubernetes, and each of them plays an important role in making Kubernetes an integrated whole. They end up in the vendor directory after running go mod vendor.

To be more specific, the repository kubernetes/apiserver is the source code of the apiserver, located at vendor/k8s.io/apiserver, since kubernetes/kubernetes imports its packages (check go.mod). The same goes for the other repositories, like kubernetes/apimachinery and kubernetes/client-go.

We can see that treating kubernetes/kubernetes as the root repository, with the rest of the pinned repositories as imported ones, makes the relationships between packages much clearer. Also notice that the imported repositories share the same directory layer, as you can see under vendor/k8s.io/xxxx.
Also, each repository has a well-defined feature set that makes it reasonable to be a separate repository, which helps each of them strictly follow the import direction. For instance, kubernetes/apiserver imports kubernetes/client-go, and both of them import kubernetes/apimachinery. A clear project structure helps avoid import cycles and keeps the code layout from getting messy. Additionally, once an issue or feature has been posted, it is easy to locate the related repository, and developers can simply extend its code base. And when one of the repositories becomes too large, or contains common code that is useful to others, a new repository can simply be created. This is where the scalability comes from.
Apart from Kubernetes contributors, a typical scenario where ordinary Go developers use these repositories is introducing a CRD to a Kubernetes cluster and defining their own Controller. The dependencies you have to import are usually kubernetes/apimachinery and kubernetes/client-go, for the object scheme and for interaction with kubernetes/apiserver respectively. Check this customized Spark Controller for an implementation of a CRD.

The way Kubernetes manages packages breaks a big open-source project into small components and connects them in a well-managed manner. It is a good guideline for gophers to think about how to structure their Go projects in the first place. Good design pays off.
Design Pattern
- Builder Pattern
Kubernetes uses the Builder Pattern a lot to let callers build clients, configs, interfaces and various other kinds of objects. Here is one example of constructing an object using a builder. Note, however, that there is no constraint preventing developers from creating the object directly.
- Factory Pattern
// todo
The Functional Option
Functional options can be understood as passing functions as parameters to a function. This post illustrates the benefits. Kubernetes has widely adopted functional options.
// todo
Coding Tools
// todo
// deepcopy object
// generated code
API Server
The Kubernetes API server is a metadata service that only deals with CRUD of Objects. It uses etcd as its persistent storage. It is worth noticing that the other components in the Kubernetes architecture interact with the API server to persist metadata.

The runtime.Object

This runtime does not refer to the Go runtime, but to the Kubernetes runtime defined in kubernetes/apimachinery, which declares the scheme, typing, encoding, decoding, and conversion packages for Kubernetes and Kubernetes-like API objects.

It is a good convention for a software project to define its own runtime as a package, which the other components in the project then import and whose interfaces they follow.
- Every Object follows the same schema in its outer layer, which is:

```json
{
  "kind": ...,
  "apiVersion": ...,
  "spec": ...,
  "status": ...
}
```
- Every Object embeds TypeMeta, and TypeMeta implements runtime.Object, so every Object, like Deployment, Pod and so on, implements the runtime.Object interface.
- kubernetes/apiserver/pkg/storage defines the interface for interacting with etcd storage, and the parameters it uses to serialize or deserialize etcd values are runtime.Objects.
- kubernetes/apiserver/pkg/endpoints/handlers defines the handlers for all Objects, in which runtime.Object is used. The business logic to create an Object is easy to understand here, so there is no need to implement a different handler for each Object implementation.

kubernetes/apimachinery/pkg/runtime (the Kubernetes runtime) introduces a GroupVersionKind struct used to describe a concrete Object:
```go
type GroupVersionKind struct {
	Group   string
	Version string
	Kind    string
}
```
in which Group indicates the group an Object belongs to; e.g. Pod has an empty Group, while a CRD like cert-manager has the Group cert-manager.io. It can be used to identify an organization. Version is covered here. The Kind refers to the reflect.Type of an Object, which is extremely useful for a handler to know which type of Object it is going to handle. More details about this will be covered later in this post.
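The pieces above can be sketched with plain Go. In this toy version (all names are simplified stand-ins for the apimachinery types, not the real ones), every object embeds a TypeMeta-like struct to satisfy a runtime.Object-like interface, and a scheme maps a GroupVersionKind to a reflect.Type so a kind string can be turned into a fresh concrete value:

```go
package main

import (
	"fmt"
	"reflect"
)

// TypeMeta carries the common outer fields; embedding it is enough for a
// struct to satisfy the Object interface below.
type TypeMeta struct {
	Kind       string `json:"kind"`
	APIVersion string `json:"apiVersion"`
}

func (t TypeMeta) GetObjectKind() TypeMeta { return t }

// Object is a stand-in for runtime.Object.
type Object interface {
	GetObjectKind() TypeMeta
}

// Pod and Deployment satisfy Object purely by embedding TypeMeta.
type Pod struct {
	TypeMeta
	Spec string `json:"spec"`
}

type Deployment struct {
	TypeMeta
	Replicas int `json:"replicas"`
}

type GroupVersionKind struct {
	Group, Version, Kind string
}

// Scheme maps a GVK to a reflect.Type, mirroring how the API server resolves
// a kind string into a concrete type before decoding a request body.
type Scheme struct {
	types map[GroupVersionKind]reflect.Type
}

func NewScheme() *Scheme {
	return &Scheme{types: map[GroupVersionKind]reflect.Type{}}
}

func (s *Scheme) Register(gvk GroupVersionKind, obj Object) {
	s.types[gvk] = reflect.TypeOf(obj)
}

// New instantiates a fresh, empty object for the given GVK.
func (s *Scheme) New(gvk GroupVersionKind) (Object, error) {
	t, ok := s.types[gvk]
	if !ok {
		return nil, fmt.Errorf("unknown kind %q", gvk.Kind)
	}
	return reflect.New(t).Elem().Interface().(Object), nil
}

func main() {
	s := NewScheme()
	s.Register(GroupVersionKind{Version: "v1", Kind: "Pod"}, Pod{})
	s.Register(GroupVersionKind{Group: "apps", Version: "v1", Kind: "Deployment"}, Deployment{})

	obj, err := s.New(GroupVersionKind{Version: "v1", Kind: "Pod"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T\n", obj) // main.Pod
}
```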
To summarize the above points, letting every Object implement the runtime.Object interface makes the API server much simpler:

- CRUD of all Objects follows the same business logic, and for the API server, all Objects are basically the same.
- Drawbacks cannot be avoided, such as the use of reflection, which may bring performance issues. The API server keeps a map with kind strings as keys and Types as values to register the different Objects, because the API server must know which kind of Object it is going to create/update/delete, and which schema to use for serialization/deserialization. Introducing a universal runtime.Object interface inevitably leads to this kind of issue. You can see how the map of Objects is registered here, and how an Object is converted into a concrete type here.
- Also note that the RESTful API endpoints are dynamic: each consists of the Object kind, and the routes registered depend on the registered Object map.
- The advantage is also obvious: you can introduce CRDs to the API server, defining your own Object schema and customizing its behaviour.
Designing the API server in such a concise way cuts down the development overhead and brings some of its most fascinating features to Kubernetes.
The API Installer
The API that kube-apiserver exposes is generated by an install function, which is quite different from our experience of relational-database-backed API server development. The installer installs a collection of Route values that bind an HTTP method and URL for each Kubernetes resource.
It is easy to follow the installer.go source code, using func (a *APIInstaller) Install() as the skeleton, to see how it uses the registered system Objects to generate their RESTful API, with the corresponding etcd storage interface used to perform CRUD actions on a given object.
Also note that each served object is added to a list exposed by the discovery handler, letting the other components know which API Groups the kube-apiserver is serving. For instance, one of the most critical things the Controller Manager does is check the API groups registered in kubernetes/apiserver and kubernetes/apiextensions-apiserver; check here.
Although the handlers for all resources are the same, as mentioned, it is important for a handler function to know which kind of Object it is handling. kubernetes/apiserver introduces an efficient type, RequestScope, to decouple the type-to-type variety from the handler function, especially the Kind attribute, which is of type GroupVersionKind. It is thus easy to understand that there is a different RequestScope for each kind of Object's handler.
To put it simply, the installer code takes the registered objects/scheme as input and generates the API paths. Then, for each API path, it registers the get, create, update, etc. handlers with the RequestScope of the given Object.
The benefit of the dynamic API installer is not only avoiding hardcoded HTTP routes. More importantly, it enables a customized API registration mechanism that allows user-defined resources to be registered into kube-apiserver, which then serves your CRD with a variety of predefined handlers; see kubernetes-apiextension here. In other words, it enables extension of the Kubernetes API. Note that the handlers for your CRDs and for system resources are the same, as mentioned above. The only difference between them is the Controller part, covered later in this post.
The API Extension
// how a CRD is added; helm install implementation.
// todo
The useful client
kubernetes/client-go is a component separate from the API server, but its main purpose is to wrap the interaction with the API server; it also comes with cache, auth and sync features.

Components like the Controller, the Scheduler and kubectl interact with the API server through this client, saving a lot of development overhead and making collaborative development more consistent.
Additionally, it provides fascinating features under the hood, like the Reflector, Informer and Indexer, which are sophisticated designs and mechanisms. // todo
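The Indexer idea, stripped to its essence, is a local, thread-safe cache of objects with secondary indexes, so repeated reads and lookups don't hit the apiserver (and ultimately etcd) at all. A stdlib-only sketch, with a single hardcoded "phase" index (the real Indexer takes arbitrary index functions):

```go
package main

import (
	"fmt"
	"sync"
)

type Pod struct {
	Name      string
	Namespace string
	Phase     string
}

// Store caches objects by key and maintains a secondary index by phase,
// so "all Pending pods" is a map lookup instead of an apiserver round trip.
type Store struct {
	mu      sync.RWMutex
	objects map[string]Pod             // "namespace/name" -> object
	byPhase map[string]map[string]bool // phase -> set of keys
}

func NewStore() *Store {
	return &Store{
		objects: map[string]Pod{},
		byPhase: map[string]map[string]bool{},
	}
}

// Add inserts or updates a pod, keeping the phase index consistent.
func (s *Store) Add(p Pod) {
	s.mu.Lock()
	defer s.mu.Unlock()
	key := p.Namespace + "/" + p.Name
	if old, ok := s.objects[key]; ok {
		delete(s.byPhase[old.Phase], key) // drop the stale index entry
	}
	s.objects[key] = p
	if s.byPhase[p.Phase] == nil {
		s.byPhase[p.Phase] = map[string]bool{}
	}
	s.byPhase[p.Phase][key] = true
}

// ByPhase answers an indexed query entirely from the local cache.
func (s *Store) ByPhase(phase string) []Pod {
	s.mu.RLock()
	defer s.mu.RUnlock()
	var out []Pod
	for key := range s.byPhase[phase] {
		out = append(out, s.objects[key])
	}
	return out
}

func main() {
	s := NewStore()
	s.Add(Pod{Name: "a", Namespace: "default", Phase: "Pending"})
	s.Add(Pod{Name: "b", Namespace: "default", Phase: "Running"})
	fmt.Println(len(s.ByPhase("Pending"))) // 1
}
```

In client-go, the Reflector's watch loop is what keeps such a store up to date; here the Add calls stand in for those watch events.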
As we know, performance is not etcd's strength, and it may become a bottleneck for a whole system that relies on etcd as its metadata storage. With the design above, the number of requests can be decreased and the burden on etcd storage eased, thus increasing Kubernetes' horizontal scalability.

It is a good convention to provide a client/sdk for an API server. It makes implementing the other components in Kubernetes much easier, especially the Controllers.
The Subresource
It is a common problem for developers to implement partial updates of objects on top of a Key/Value database. In terms of RESTful service conventions, we may prefer PATCH over PUT. Kubernetes introduces a concept called the subresource; for instance, status is a subresource of Namespace. You can look at the URL parsing function here to get a better understanding of a Kubernetes API call.
Here is the kubernetes/client-go implementation of updating the status of a given namespace. An API call PUT api/namespace/{$namespaceName}/status will be sent with the full namespace as the request body. Additionally, a PATCH method has also been implemented, which is similar. As for the kube-apiserver server-side implementation, // todo
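One way the server side of a PATCH can work is a recursive merge of the patch document into the stored object, in the spirit of a JSON merge patch. This is a simplified, stdlib-only sketch (real Kubernetes supports several patch strategies, and RFC 7386 merge patch also treats null as a delete, which this skips):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// merge overlays patch onto dst: nested objects are merged recursively,
// everything else is replaced. Keys absent from the patch are untouched,
// which is what makes the update partial.
func merge(dst, patch map[string]interface{}) {
	for k, v := range patch {
		if pm, ok := v.(map[string]interface{}); ok {
			if dm, ok := dst[k].(map[string]interface{}); ok {
				merge(dm, pm)
				continue
			}
		}
		dst[k] = v
	}
}

func main() {
	// The full object as stored in etcd.
	stored := []byte(`{"kind":"Namespace","spec":{"finalizers":["kubernetes"]},"status":{"phase":"Active"}}`)
	// The PATCH body: only the status subresource changes.
	patch := []byte(`{"status":{"phase":"Terminating"}}`)

	var obj, p map[string]interface{}
	json.Unmarshal(stored, &obj)
	json.Unmarshal(patch, &p)

	merge(obj, p)

	out, _ := json.Marshal(obj)
	fmt.Println(string(out))
	// {"kind":"Namespace","spec":{"finalizers":["kubernetes"]},"status":{"phase":"Terminating"}}
}
```

The merged object is then written back to etcd as a whole, since the Key/Value store itself has no notion of a partial write.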
The Admission Controller & Admission Webhooks
An admission controller/plugin acts as a middleware that processes requests to the Kubernetes API server (prior to persistence of the object, but after the request is authenticated and authorized), and a variety of compiled-in admission plugins are provided by the Kubernetes system. One typical example of an admission controller is the LimitRanger, which:

Observes the incoming request and ensures that it does not violate any of the constraints enumerated in the LimitRange object in a Namespace

Its corresponding source code is here and is easy to follow, as its workflow is similar to that of a normal Kubernetes Controller.

Besides, the design of admission controllers provides a flexible way for k8s developers to extend customized admission plugins that run as admission webhooks at runtime. It is one of the fascinating features that improve Kubernetes' extensibility. Kubernetes defines a protocol for admission webhooks, e.g. the request and response formats, the webhook configuration, and so on.

One of the most famous implementations may be istio's Automatic Sidecar Injection, which creates a sidecar proxy in the same pod as the deployed service within a given namespace. Check its source code here.
Controller Manager
A Kubernetes Controller deals with a particular kind of Object and performs the logic needed to bring those Objects into the desired state. Decoupling the Controller logic from the API server makes development much more manageable.
Dealing with Object Relationship
There is an issue that often bothers developers: how to design a system that relies only on a NoSQL database as the backend storage, given object-relationship questions (e.g. a Pod belongs to a Deployment) such as:

- How to get all Pods of a Deployment?
- How to get all Pending Pods?
- What happens if I delete a Pod of a Deployment?
Since a NoSQL database is not as good at handling object relationships as a relational database, introducing Controllers helps take good care of them.
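For the first question, a Controller answers "all Pods of a Deployment" by label selection over its cached objects rather than by a relational join. A stdlib sketch of the idea (the matchLabels-style selector semantics are simplified to exact key/value equality):

```go
package main

import "fmt"

type Pod struct {
	Name   string
	Labels map[string]string
	Phase  string
}

// matches reports whether the pod's labels satisfy every key/value pair in
// the selector, which is how a Deployment "owns" its Pods.
func matches(labels, selector map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// selectPods answers a relationship query with a scan plus label match,
// typically over a local informer cache rather than the database itself.
func selectPods(pods []Pod, selector map[string]string) []Pod {
	var out []Pod
	for _, p := range pods {
		if matches(p.Labels, selector) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pods := []Pod{
		{Name: "web-1", Labels: map[string]string{"app": "web"}, Phase: "Running"},
		{Name: "web-2", Labels: map[string]string{"app": "web"}, Phase: "Pending"},
		{Name: "db-1", Labels: map[string]string{"app": "db"}, Phase: "Running"},
	}
	// The Deployment's selector, e.g. spec.selector.matchLabels: {app: web}.
	for _, p := range selectPods(pods, map[string]string{"app": "web"}) {
		fmt.Println(p.Name)
	}
	// web-1
	// web-2
}
```

Deletion questions are handled similarly without joins: owner references and the Controller's reconcile loop recreate or garbage-collect children to converge on the desired state.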
// todo.