[Repost] How controller-runtime Implements a Controller

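Introduction

The controller-runtime framework is a controller framework packaged by the community. Its core mechanics are the same as those of a hand-written custom controller; it only adds a few concepts on top, which makes developing a controller with it simpler and more convenient.

Frameworks such as kubebuilder and operator-sdk are themselves thin layers on top of controller-runtime whose main purpose is to generate project scaffolding quickly.

Let's look at how controller-runtime implements controller processing.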

Controller implementation

First, let's look at how a controller is defined and how it is started. The controller struct is defined as follows:

// pkg/internal/controller/controller.go

// Controller implements controller.Controller.
type Controller struct {
    // Name is used to uniquely identify a Controller in tracing, logging and monitoring.  Name is required.
    Name string

    // MaxConcurrentReconciles is the maximum number of concurrent Reconciles which can be run. Defaults to 1.
    MaxConcurrentReconciles int

    // Reconciler is a function that can be called at any time with the Name / Namespace of an object and
    // ensures that the state of the system matches the state specified in the object.
    // Defaults to the DefaultReconcileFunc.
    Do reconcile.Reconciler

    // MakeQueue constructs the queue for this controller once the controller is ready to start.
    // This exists because the standard Kubernetes workqueues start themselves immediately, which
    // leads to goroutine leaks if something calls controller.New repeatedly.
    MakeQueue func() workqueue.RateLimitingInterface

    // Queue is an listeningQueue that listens for events from Informers and adds object keys to
    // the Queue for processing
    // The MakeQueue field above is what constructs this work queue.
    Queue workqueue.RateLimitingInterface

    // SetFields is used to inject dependencies into other objects such as Sources, EventHandlers and Predicates
    // Deprecated: the caller should handle injected fields itself.
    SetFields func(i interface{}) error

    // mu is used to synchronize Controller setup
    mu sync.Mutex

    // Started is true if the Controller has been Started
    Started bool

    // ctx is the context that was passed to Start() and used when starting watches.
    //
    // According to the docs, contexts should not be stored in a struct: https://golang.org/pkg/context,
    // while we usually always strive to follow best practices, we consider this a legacy case and it should
    // undergo a major refactoring and redesign to allow for context to not be stored in a struct.
    ctx context.Context

    // CacheSyncTimeout refers to the time limit set on waiting for cache to sync
    // Defaults to 2 minutes if not set.
    CacheSyncTimeout time.Duration

    // startWatches maintains a list of sources, handlers, and predicates to start when the controller is started.
    startWatches []watchDescription

    // LogConstructor is used to construct a logger to then log messages to users during reconciliation,
    // or for example when a watch is started.
    // Note: LogConstructor has to be able to handle nil requests as we are also using it
    // outside the context of a reconciliation.
    LogConstructor func(request *reconcile.Request) logr.Logger

    // RecoverPanic indicates whether the panic caused by reconcile should be recovered.
    RecoverPanic bool
}

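The struct above is the controller defined in controller-runtime. It still holds a rate-limited work queue, but at first glance it carries no Informer or Indexer for the resource objects. These are in fact wrapped behind the startWatches field, a slice of watchDescription values, each of which holds everything needed to start a watch: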

// pkg/internal/controller/controller.go

// watchDescription contains all the information necessary to start a watch.
type watchDescription struct {
    src        source.Source
    handler    handler.EventHandler
    predicates []predicate.Predicate
}

The two most important functions of the controller are Watch and Start. Let's look at how each is implemented.

Implementation of Watch

// pkg/internal/controller/controller.go

// Watch implements controller.Controller.
func (c *Controller) Watch(src source.Source, evthdler handler.EventHandler, prct ...predicate.Predicate) error {
    c.mu.Lock()
    defer c.mu.Unlock()

    // Inject Cache into arguments
    if err := c.SetFields(src); err != nil {
        return err
    }
    if err := c.SetFields(evthdler); err != nil {
        return err
    }
    for _, pr := range prct {
        if err := c.SetFields(pr); err != nil {
            return err
        }
    }

    // Controller hasn't started yet, store the watches locally and return.
    //
    // These watches are going to be held on the controller struct until the manager or user calls Start(...).
    if !c.Started {
        c.startWatches = append(c.startWatches, watchDescription{src: src, handler: evthdler, predicates: prct})
        return nil
    }

    c.LogConstructor(nil).Info("Starting EventSource", "source", src)
    // Call the Start function of src
    return src.Start(c.ctx, evthdler, c.Queue, prct...)
}

As the Watch function above shows, it ultimately calls the Start function of its Source argument.
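
The "store the watch now, start it later" behaviour of Watch can be sketched in isolation. The following is a minimal, stdlib-only sketch and not the controller-runtime code: the names source, recordingSource and miniController are hypothetical stand-ins, and the real Watch also injects dependencies and passes the handler, queue and predicates to the source.

```go
package main

import (
	"fmt"
	"sync"
)

// source is a hypothetical stand-in for source.Source: something that can be started.
type source interface {
	Start() error
}

// recordingSource only records whether Start has been called.
type recordingSource struct {
	name    string
	started bool
}

func (s *recordingSource) Start() error {
	s.started = true
	return nil
}

// miniController models only the "buffer watches until started" part of Controller.
type miniController struct {
	mu      sync.Mutex
	started bool
	pending []source
}

// Watch buffers the source while the controller is not running,
// and starts it immediately once the controller has been started.
func (c *miniController) Watch(src source) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.started {
		c.pending = append(c.pending, src)
		return nil
	}
	return src.Start()
}

// Start starts every buffered source, then drops the buffer.
func (c *miniController) Start() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, src := range c.pending {
		if err := src.Start(); err != nil {
			return err
		}
	}
	c.pending = nil
	c.started = true
	return nil
}

func main() {
	c := &miniController{}
	early := &recordingSource{name: "early"}
	_ = c.Watch(early)
	fmt.Println("before Start:", early.started) // before Start: false

	_ = c.Start()
	late := &recordingSource{name: "late"}
	_ = c.Watch(late)
	fmt.Println("after Start:", early.started, late.started) // after Start: true true
}
```

A watch registered before Start is only recorded, while one registered afterwards is started on the spot, which mirrors the two branches of Watch shown above.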

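A Source is a source of events, such as Create, Update, and Delete operations on resource objects. It is used together with an event handler (the handler.EventHandler in the code below) to enqueue reconcile.Requests for processing.

  • Use Kind for events originating inside the cluster (e.g. Pod created, Pod updated, Deployment updated)
  • Use Channel for events originating outside the cluster (e.g. a GitHub webhook callback, polling an external URL)

// pkg/source/source.go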

type Source interface {
	// Start is internal and should be called only by the Controller to register an EventHandler with the Informer
	// to enqueue reconcile.Requests.
	Start(context.Context, handler.EventHandler, workqueue.RateLimitingInterface, ...predicate.Predicate) error
}

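source.Source is an interface and a parameter of Controller.Watch, so to see how Source.Start is actually implemented we need to look at what gets passed to Controller.Watch. In controller-runtime, the entry point that calls the controller's Watch function is the doWatch() function in pkg/builder/controller.go: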

// pkg/builder/controller.go

func (blder *Builder) doWatch() error {
	// Reconcile type
	src := &source.Kind{Type: blder.forInput.object}
	hdler := &handler.EnqueueRequestForObject{}
	allPredicates := append(blder.globalPredicates, blder.forInput.predicates...)
	err := blder.ctrl.Watch(src, hdler, allPredicates...)
	if err != nil {
		return err
	}
  ......
	return nil
}

The first argument to Watch is a source.Kind, a struct that implements the source.Source interface shown above:

// pkg/source/source.go

// Kind is used to provide a source of events originating inside the cluster from Watches (e.g. Pod Create).
type Kind struct {
    // Type is the type of object to watch.  e.g. &v1.Pod{}
    Type client.Object

    // cache used to watch APIs
    cache cache.Cache

    // started may contain an error if one was encountered during startup. If its closed and does not
    // contain an error, startup and syncing finished.
    started     chan error
    startCancel func()
}

// Start is internal and should be called only by the Controller to register an EventHandler with the Informer
// to enqueue reconcile.Requests.
func (ks *Kind) Start(ctx context.Context, handler handler.EventHandler, queue workqueue.RateLimitingInterface,
    prct ...predicate.Predicate) error {
    // Type should have been specified by the user.
    if ks.Type == nil {
        return fmt.Errorf("must specify Kind.Type")
    }

    // cache should have been injected before Start was called
    if ks.cache == nil {
        return fmt.Errorf("must call CacheInto on Kind before calling Start")
    }

    // cache.GetInformer will block until its context is cancelled if the cache was already started and it can not
    // sync that informer (most commonly due to RBAC issues).
    ctx, ks.startCancel = context.WithCancel(ctx)
    ks.started = make(chan error)
    go func() {
        var (
            i       cache.Informer
            lastErr error
        )

        // Tries to get an informer until it returns true,
        // an error or the specified context is cancelled or expired.
        if err := wait.PollImmediateUntilWithContext(ctx, 10*time.Second, func(ctx context.Context) (bool, error) {
            // Lookup the Informer from the Cache and add an EventHandler which populates the Queue
            i, lastErr = ks.cache.GetInformer(ctx, ks.Type)
            if lastErr != nil {
                kindMatchErr := &meta.NoKindMatchError{}
                switch {
                case errors.As(lastErr, &kindMatchErr):
                    log.Error(lastErr, "if kind is a CRD, it should be installed before calling Start",
                        "kind", kindMatchErr.GroupKind)
                case runtime.IsNotRegisteredError(lastErr):
                    log.Error(lastErr, "kind must be registered to the Scheme")
                default:
                    log.Error(lastErr, "failed to get informer from cache")
                }
                return false, nil // Retry.
            }
            return true, nil
        }); err != nil {
            if lastErr != nil {
                ks.started <- fmt.Errorf("failed to get informer from cache: %w", lastErr)
                return
            }
            ks.started <- err
            return
        }

        i.AddEventHandler(internal.EventHandler{Queue: queue, EventHandler: handler, Predicates: prct})
        if !ks.cache.WaitForCacheSync(ctx) {
            // Would be great to return something more informative here
            ks.started <- errors.New("cache did not sync")
        }
        close(ks.started)
    }()

    return nil
}

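From the implementation above we can see that Controller.Watch obtains the Informer for the resource object and registers the event handlers on it.

The Informer is obtained from the cache, which is injected before Start is called; we don't need to go into that here.

The argument passed to AddEventHandler is an internal.EventHandler struct, which necessarily implements the ResourceEventHandler interface from client-go, that is, the familiar OnAdd, OnUpdate, and OnDelete functions: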

// pkg/source/internal/eventsource.go

// EventHandler adapts a handler.EventHandler interface to a cache.ResourceEventHandler interface.
type EventHandler struct {
    EventHandler handler.EventHandler
    Queue        workqueue.RateLimitingInterface
    Predicates   []predicate.Predicate
}

// OnAdd creates CreateEvent and calls Create on EventHandler.
func (e EventHandler) OnAdd(obj interface{}) {
    // Event for a Kubernetes object being created
    c := event.CreateEvent{}

    // Pull Object out of the object (type assertion to client.Object)
    if o, ok := obj.(client.Object); ok {
        c.Object = o
    } else {
        log.Error(nil, "OnAdd missing Object",
            "object", obj, "type", fmt.Sprintf("%T", obj))
        return
    }

    // Predicates filter events: call Create on each Predicate in turn
    for _, p := range e.Predicates {
        if !p.Create(c) {
            return
        }
    }

    // Invoke create handler
    e.EventHandler.Create(c, e.Queue)
}

// OnUpdate creates UpdateEvent and calls Update on EventHandler.
func (e EventHandler) OnUpdate(oldObj, newObj interface{}) {
    // Update event
    u := event.UpdateEvent{}

    if o, ok := oldObj.(client.Object); ok {
        u.ObjectOld = o
    } else {
        log.Error(nil, "OnUpdate missing ObjectOld",
            "object", oldObj, "type", fmt.Sprintf("%T", oldObj))
        return
    }

    // Pull Object out of the object
    if o, ok := newObj.(client.Object); ok {
        u.ObjectNew = o
    } else {
        log.Error(nil, "OnUpdate missing ObjectNew",
            "object", newObj, "type", fmt.Sprintf("%T", newObj))
        return
    }

    for _, p := range e.Predicates {
        if !p.Update(u) {
            return
        }
    }

    // Invoke update handler
    e.EventHandler.Update(u, e.Queue)
}

// OnDelete creates DeleteEvent and calls Delete on EventHandler.
func (e EventHandler) OnDelete(obj interface{}) {
    d := event.DeleteEvent{}

    // Deal with tombstone events by pulling the object out.  Tombstone events wrap the object in a
    // DeleteFinalStateUnknown struct, so the object needs to be pulled out.
    // Copied from sample-controller
    // This should never happen if we aren't missing events, which we have concluded that we are not
    // and made decisions off of this belief.  Maybe this shouldn't be here?
    var ok bool
    if _, ok = obj.(client.Object); !ok {
        // If the object doesn't have Metadata, assume it is a tombstone object of type DeletedFinalStateUnknown
        tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
        if !ok {
            log.Error(nil, "Error decoding objects.  Expected cache.DeletedFinalStateUnknown",
                "type", fmt.Sprintf("%T", obj),
                "object", obj)
            return
        }

        // Set obj to the tombstone obj
        obj = tombstone.Obj
    }

    // Pull Object out of the object
    if o, ok := obj.(client.Object); ok {
        d.Object = o
    } else {
        log.Error(nil, "OnDelete missing Object",
            "object", obj, "type", fmt.Sprintf("%T", obj))
        return
    }

    for _, p := range e.Predicates {
        if !p.Delete(d) {
            return
        }
    }

    // Invoke delete handler
    e.EventHandler.Delete(d, e.Queue)
}

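The EventHandler struct above implements client-go's ResourceEventHandler interface. In each method the Predicates are invoked first to filter the event, and only events that pass are handled. The actual handling is not implemented here either: it is delegated to the handler.EventHandler passed in through Controller.Watch. From the doWatch() function shown earlier, that handler is a &handler.EnqueueRequestForObject{} object, so the real event-handling logic lives in that type: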

// pkg/handler/enqueue.go

// EnqueueRequestForObject enqueues a Request containing the Name and Namespace of the object that is the source of the Event.
// (e.g. the created / deleted / updated objects Name and Namespace).  handler.EnqueueRequestForObject is used by almost all
// Controllers that have associated Resources (e.g. CRDs) to reconcile the associated Resource.
type EnqueueRequestForObject struct{}


// Create implements EventHandler.
func (e *EnqueueRequestForObject) Create(evt event.CreateEvent, q workqueue.RateLimitingInterface) {
    if evt.Object == nil {
        enqueueLog.Error(nil, "CreateEvent received with no metadata", "event", evt)
        return
    }
    // Add a Request object to the work queue
    q.Add(reconcile.Request{NamespacedName: types.NamespacedName{
        Name:      evt.Object.GetName(),
        Namespace: evt.Object.GetNamespace(),
    }})
}

// Update implements EventHandler.
func (e *EnqueueRequestForObject) Update(evt event.UpdateEvent, q workqueue.RateLimitingInterface) {
    switch {
    // If the new object is not nil, add it to the work queue
    case evt.ObjectNew != nil:
        q.Add(reconcile.Request{NamespacedName: types.NamespacedName{
            Name:      evt.ObjectNew.GetName(),
            Namespace: evt.ObjectNew.GetNamespace(),
        }})
    // Otherwise, if the old object is not nil, add it to the work queue
    case evt.ObjectOld != nil:
        q.Add(reconcile.Request{NamespacedName: types.NamespacedName{
            Name:      evt.ObjectOld.GetName(),
            Namespace: evt.ObjectOld.GetNamespace(),
        }})
    default:
        enqueueLog.Error(nil, "UpdateEvent received with no metadata", "event", evt)
    }
}

// Delete implements EventHandler.
func (e *EnqueueRequestForObject) Delete(evt event.DeleteEvent, q workqueue.RateLimitingInterface) {
    if evt.Object == nil {
        enqueueLog.Error(nil, "DeleteEvent received with no metadata", "event", evt)
        return
    }
    // The deleted (tombstone) state was already handled upstream, so just enqueue the request here
    q.Add(reconcile.Request{NamespacedName: types.NamespacedName{
        Name:      evt.Object.GetName(),
        Namespace: evt.Object.GetNamespace(),
    }})
}

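As the Create/Update/Delete implementations of EnqueueRequestForObject show, the items placed on the work queue are no longer the plain unique key strings used in a hand-written controller, but reconcile.Request objects that wrap the namespace and name. The object's unique key can still be derived easily from such a request.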
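
To make the difference between a string key and a wrapped request concrete, here is a minimal stdlib-only sketch. The types namespacedName and request are hypothetical stand-ins for types.NamespacedName and reconcile.Request, and the key() helper is an illustration, not a controller-runtime method. It follows the usual "namespace/name" convention, with a bare name for cluster-scoped objects.

```go
package main

import "fmt"

// namespacedName is a stand-in for types.NamespacedName.
type namespacedName struct {
	Namespace string
	Name      string
}

// request is a stand-in for reconcile.Request, which embeds the namespaced name.
type request struct {
	namespacedName
}

// key renders the request as "<namespace>/<name>",
// or just "<name>" when the namespace is empty (cluster-scoped objects).
func (r request) key() string {
	if r.Namespace == "" {
		return r.Name
	}
	return r.Namespace + "/" + r.Name
}

func main() {
	r := request{namespacedName{Namespace: "default", Name: "web"}}
	fmt.Println(r.key()) // default/web

	// The struct is comparable, so two requests for the same object are equal
	// and collapse into a single map entry.
	seen := map[request]bool{}
	seen[r] = true
	seen[request{namespacedName{Namespace: "default", Name: "web"}}] = true
	fmt.Println(len(seen)) // 1
}
```

Because the request is a small comparable value, several events for the same object yield equal items, which is the property a deduplicating queue relies on.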


In summary, Controller.Watch does what Informer initialization and event-handler registration did in our earlier custom controller.
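
The whole path from an informer callback to the work queue (filter with predicates, then let the handler enqueue) can be sketched as follows. This is a simplified stdlib-only model under stated assumptions: createEvent, predicate, queue and adapter are hypothetical names, only the Create path is modelled, and the queue is a plain slice rather than a rate-limited work queue.

```go
package main

import "fmt"

// createEvent is a stand-in for event.CreateEvent, reduced to the object's identity.
type createEvent struct {
	Namespace string
	Name      string
}

// predicate is a stand-in for the Create method of predicate.Predicate.
type predicate func(createEvent) bool

// queue is a minimal stand-in for the work queue.
type queue struct {
	items []string
}

func (q *queue) Add(item string) { q.items = append(q.items, item) }

// adapter plays the role of internal.EventHandler:
// it filters an event through the predicates and then calls the enqueue handler.
type adapter struct {
	predicates []predicate
	enqueue    func(createEvent, *queue)
	q          *queue
}

// OnAdd drops the event as soon as one predicate rejects it.
func (a adapter) OnAdd(evt createEvent) {
	for _, p := range a.predicates {
		if !p(evt) {
			return
		}
	}
	a.enqueue(evt, a.q)
}

func main() {
	q := &queue{}
	a := adapter{
		// Hypothetical filter: ignore everything in kube-system.
		predicates: []predicate{
			func(e createEvent) bool { return e.Namespace != "kube-system" },
		},
		// Enqueue "namespace/name", mirroring what EnqueueRequestForObject does with a Request.
		enqueue: func(e createEvent, q *queue) { q.Add(e.Namespace + "/" + e.Name) },
		q:       q,
	}

	a.OnAdd(createEvent{Namespace: "default", Name: "web"})
	a.OnAdd(createEvent{Namespace: "kube-system", Name: "coredns"})
	fmt.Println(q.items) // [default/web]
}
```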

Implementation of Start

Having covered the controller's Watch function, let's turn to the other important function, Controller.Start.

// pkg/internal/controller/controller.go

// Start implements controller.Controller.
func (c *Controller) Start(ctx context.Context) error {
    // use an IIFE to get proper lock handling
    // but lock outside to get proper handling of the queue shutdown
    c.mu.Lock()
    // Return an error right away if the controller has already been started
    if c.Started {
        return errors.New("controller was started more than once. This is likely to be caused by being added to a manager multiple times")
    }

    c.initMetrics()

    // Set the internal context.
    c.ctx = ctx

    // Call MakeQueue() to build the work queue
    c.Queue = c.MakeQueue()
    go func() {
        <-ctx.Done()
        c.Queue.ShutDown()
    }()

    wg := &sync.WaitGroup{}
    err := func() error {
        defer c.mu.Unlock()

        // TODO(pwittrock): Reconsider HandleCrash
        defer utilruntime.HandleCrash()

        // NB(directxman12): launch the sources *before* trying to wait for the
        // caches to sync so that they have a chance to register their intendeded
        // caches.
        for _, watch := range c.startWatches {
            c.LogConstructor(nil).Info("Starting EventSource", "source", fmt.Sprintf("%s", watch.src))

            if err := watch.src.Start(ctx, watch.handler, c.Queue, watch.predicates...); err != nil {
                return err
            }
        }

        // Start the SharedIndexInformer factories to begin populating the SharedIndexInformer caches
        c.LogConstructor(nil).Info("Starting Controller")

        for _, watch := range c.startWatches {
            syncingSource, ok := watch.src.(source.SyncingSource)
            if !ok {
                continue
            }

            if err := func() error {
                // use a context with timeout for launching sources and syncing caches.
                sourceStartCtx, cancel := context.WithTimeout(ctx, c.CacheSyncTimeout)
                defer cancel()

                // WaitForSync waits for a definitive timeout, and returns if there
                // is an error or a timeout
                // (in practice: wait for the Informer caches to finish syncing)
                if err := syncingSource.WaitForSync(sourceStartCtx); err != nil {
                    err := fmt.Errorf("failed to wait for %s caches to sync: %w", c.Name, err)
                    c.LogConstructor(nil).Error(err, "Could not wait for Cache to sync")
                    return err
                }

                return nil
            }(); err != nil {
                return err
            }
        }

        // All the watches have been started, we can reset the local slice.
        //
        // We should never hold watches more than necessary, each watch source can hold a backing cache,
        // which won't be garbage collected if we hold a reference to it.
        c.startWatches = nil

        // Launch workers to process resources
        c.LogConstructor(nil).Info("Starting workers", "worker count", c.MaxConcurrentReconciles)
        wg.Add(c.MaxConcurrentReconciles)
        for i := 0; i < c.MaxConcurrentReconciles; i++ {
            go func() {
                defer wg.Done()
                // Run a worker thread that just dequeues items, processes them, and marks them done.
                // It enforces that the reconcileHandler is never invoked concurrently with the same object.
                for c.processNextWorkItem(ctx) {
                }
            }()
        }

        c.Started = true
        return nil
    }()
    if err != nil {
        return err
    }

    <-ctx.Done()
    c.LogConstructor(nil).Info("Shutdown signal received, waiting for all workers to finish")
    wg.Wait()
    c.LogConstructor(nil).Info("All workers finished")
    return nil
}

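Start resembles the run loop of a custom controller: it first waits for the informers of the resource objects to sync, then launches workers to process items, and the worker function is implemented in the same way: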

// pkg/internal/controller/controller.go

// processNextWorkItem will read a single work item off the workqueue and
// attempt to process it, by calling the reconcileHandler.
func (c *Controller) processNextWorkItem(ctx context.Context) bool {
    // Pop an item off the queue
    obj, shutdown := c.Queue.Get()
    if shutdown {
        // Stop working: the queue has been shut down
        return false
    }

    // We call Done here so the workqueue knows we have finished
    // processing this item. We also must remember to call Forget if we
    // do not want this work item being re-queued. For example, we do
    // not call Forget if a transient error occurs, instead the item is
    // put back on the workqueue and attempted again after a back-off
    // period.
    defer c.Queue.Done(obj)

    ctrlmetrics.ActiveWorkers.WithLabelValues(c.Name).Add(1)
    defer ctrlmetrics.ActiveWorkers.WithLabelValues(c.Name).Add(-1)

    // Call reconcileHandler to process the item
    c.reconcileHandler(ctx, obj)
    return true
}


func (c *Controller) reconcileHandler(ctx context.Context, obj interface{}) {
    // Update metrics after processing each item
    reconcileStartTS := time.Now()
    defer func() {
        c.updateMetrics(time.Since(reconcileStartTS))
    }()

    // Make sure that the object is a valid request.
    req, ok := obj.(reconcile.Request)
    if !ok {
        // As the item in the workqueue is actually invalid, we call
        // Forget here else we'd go into a loop of attempting to
        // process a work item that is invalid.
        c.Queue.Forget(obj)
        c.LogConstructor(nil).Error(nil, "Queue item was not a Request", "type", fmt.Sprintf("%T", obj), "value", obj)
        // Return true, don't take a break
        return
    }

    log := c.LogConstructor(&req)

    log = log.WithValues("reconcileID", uuid.NewUUID())
    ctx = logf.IntoContext(ctx, log)

    // RunInformersAndControllers the syncHandler, passing it the Namespace/Name string of the
    // resource to be synced.
    // Call the Reconciler to process the item; this is where the business logic we write actually runs.
    result, err := c.Reconcile(ctx, req)
    switch {
    case err != nil:
        // The business logic returned an error: re-add the request to the rate-limited queue
        c.Queue.AddRateLimited(req)
        // Record metrics
        ctrlmetrics.ReconcileErrors.WithLabelValues(c.Name).Inc()
        ctrlmetrics.ReconcileTotal.WithLabelValues(c.Name, labelError).Inc()
        log.Error(err, "Reconciler error")

    // The Result returned by Reconcile carries a RequeueAfter greater than 0
    case result.RequeueAfter > 0:
        // The result.RequeueAfter request will be lost, if it is returned
        // along with a non-nil error. But this is intended as
        // We need to drive to stable reconcile loops before queuing due
        // to result.RequestAfter

        // Forget the item
        c.Queue.Forget(obj)
        // Re-add the request once RequeueAfter has elapsed
        c.Queue.AddAfter(req, result.RequeueAfter)
        ctrlmetrics.ReconcileTotal.WithLabelValues(c.Name, labelRequeueAfter).Inc()
    case result.Requeue:
        // Re-add the request to the rate-limited queue
        c.Queue.AddRateLimited(req)
        ctrlmetrics.ReconcileTotal.WithLabelValues(c.Name, labelRequeue).Inc()
    default:
        // Finally, if no error occurs we Forget this item so it does not
        // get queued again until another change happens.
        c.Queue.Forget(obj)
        ctrlmetrics.ReconcileTotal.WithLabelValues(c.Name, labelSuccess).Inc()
    }
}

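The reconcileHandler function above is where each item is actually processed; it covers both event handling and error handling. The real handling is exposed to developers through c.Do.Reconcile(ctx, req), which the code above reaches via c.Reconcile, so developers only need to put their business logic in the Reconcile function.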

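The return value of c.Do.Reconcile(ctx, req) determines whether the item is put back on the queue:

  • If an error is returned, the item is re-added to the rate-limited queue
  • If result.RequeueAfter > 0, the item is first forgotten and then added back to the queue after result.RequeueAfter has elapsed
  • If result.Requeue is true, the item is re-added to the rate-limited queue directly
  • If it returns normally, the item is forgotten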