4. A Deep Dive into the Istio Source: How Pilot's Discovery Server Distributes xDS Asynchronously

Please credit the source when reposting. This article was published on luozhiyun's blog: https://www.luozhiyun.com

This article is based on the Istio source code at release 1.5.

Introduction

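The Discovery Service mainly provides control information to the data plane (proxy components such as Envoy running in the sidecar). The information it serves is called xDS, where the x is a placeholder: in Istio, xDS covers CDS (cluster discovery service), LDS (listener discovery service), RDS (route discovery service), and EDS (endpoint discovery service), while ADS (aggregated discovery service) wraps these services behind a single unified interface.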

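The Discovery Service mainly contains the following logic:

Start the gRPC server and accept connection requests from Envoy;
Receive xDS requests from Envoy, fetch configuration and service information from the Config Controller and Service Controller, and build response messages to send back to Envoy;
Listen for configuration changes from the Config Controller and service changes from the Service Controller, and push those changes to Envoy through the xDS interface.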

Discovery Service initialization

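As the flow chart above shows, calling NewServer to create the XdsServer involves a lot of initialization work, such as initializing the Pilot Server, the mesh, the Istio Config controller, and the Service Discovery controller. The code relevant to Discovery Service initialization is listed below: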

func NewServer(args *PilotArgs) (*Server, error) {
	// Create the Pilot Server
	s := &Server{
		basePort:       args.BasePort,
		clusterID:      getClusterID(args),
		environment:    e,
		// Initialize the XdsServer
		EnvoyXdsServer: envoyv2.NewDiscoveryServer(e, args.Plugins),
		forceStop:      args.ForceStop,
		mux:            http.NewServeMux(),
	}
	...
	// Initialize the xDS server side
	if err := s.initDiscoveryService(args); err != nil {
		return nil, fmt.Errorf("discovery service: %v", err)
	}
	...
}

As the code above shows, the XdsServer is initialized by calling NewDiscoveryServer, which returns a DiscoveryServer instance. How its individual fields are used is covered later.

type DiscoveryServer struct {
	...
	// Endpoint cache, indexed by service name and namespace, mainly used for EDS updates
	EndpointShardsByService map[string]map[string]*EndpointShards
	// Channel that receives the PushRequests sent by other components
	pushChannel chan *model.PushRequest
	updateMutex sync.RWMutex
	// pushQueue mainly buffers debounced requests before the actual xDS push
	pushQueue *PushQueue
}

After the Server is created, initDiscoveryService is called:

func (s *Server) initDiscoveryService(args *PilotArgs) error {
	...
	// Initialize the handlers for the Service Controller and Config Controller, used for informer callbacks
	if err := s.initEventHandlers(); err != nil {
		return err
	}
	...
	// Start is called once initialization completes, to start the XdsServer
	s.addStartFunc(func(stop <-chan struct{}) error {
		s.EnvoyXdsServer.Start(stop)
		return nil
	})
	// Initialize the gRPC server and register the XdsServer with it
	s.initGrpcServer(args.KeepaliveOptions)
	s.httpServer = &http.Server{
		Addr:    args.DiscoveryOptions.HTTPAddr,
		Handler: s.mux,
	}
	...
}

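This method mainly does the following:

Initializes the various callback handlers;
Adds the XdsServer start function to the Server's startFuncs queue, to be called once initialization completes;
Calls initGrpcServer to initialize the gRPC server and register the XdsServer with it.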

While initializing the grpcServer, DiscoveryServer.Register() is called, which registers the following service with the grpcServer:

func (s *DiscoveryServer) Register(rpcs *grpc.Server) {
	// Registration takes the gRPC server and the DiscoveryServer
	ads.RegisterAggregatedDiscoveryServiceServer(rpcs, s)
}

DiscoveryServer in fact implements the AggregatedDiscoveryServiceServer interface:

type AggregatedDiscoveryServiceServer interface {
	// Full (state-of-the-world) ADS stream interface
	StreamAggregatedResources(AggregatedDiscoveryService_StreamAggregatedResourcesServer) error
	// Incremental (delta) ADS stream interface
	DeltaAggregatedResources(AggregatedDiscoveryService_DeltaAggregatedResourcesServer) error
}

StreamAggregatedResources receives DiscoveryRequest messages and returns a stream of DiscoveryResponse messages containing the full xDS data. DeltaAggregatedResources currently has no concrete implementation.

The rough call flow is as follows:

Discovery Service startup

discoveryServer.Start is likewise called from the main method of pilot discovery. After calling bootstrap.NewServer, main goes on to call Start:

discoveryCmd = &cobra.Command{
	...
	RunE: func(c *cobra.Command, args []string) error {
		...
		stop := make(chan struct{})
		// Create the xDS server
		discoveryServer, err := bootstrap.NewServer(&serverArgs)
		if err != nil {
			return fmt.Errorf("failed to create discovery service: %v", err)
		}
 
		// Start the server
		if err := discoveryServer.Start(stop); err != nil {
			return fmt.Errorf("failed to start discovery service: %v", err)
		} 
		...
		return nil
	},
}

When Start is called, it fetches the Server's startFuncs collection and runs each registered function in turn:

func (s *Server) Start(stop <-chan struct{}) error {
	// Now start all of the components.
	for _, fn := range s.startFuncs {
		if err := fn(stop); err != nil {
			return err
		}
	}
	...
}
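
The deferred-start pattern used by addStartFunc and Start can be illustrated in isolation. The following is a minimal sketch and not the Istio implementation: the server type, the component names, and runDemo are hypothetical, and only the shape of the start function (a stop channel in, an error out) is taken from the excerpts above.

```go
package main

import (
	"errors"
	"fmt"
)

// startFunc has the same shape as the functions stored in startFuncs:
// it receives the stop channel and may return an error.
type startFunc func(stop <-chan struct{}) error

// server is a hypothetical stand-in for the bootstrap Server.
type server struct {
	startFuncs []startFunc
}

// addStartFunc queues fn so that Start can run it later.
func (s *server) addStartFunc(fn startFunc) {
	s.startFuncs = append(s.startFuncs, fn)
}

// Start runs the queued functions in registration order
// and stops at the first one that returns an error.
func (s *server) Start(stop <-chan struct{}) error {
	for _, fn := range s.startFuncs {
		if err := fn(stop); err != nil {
			return err
		}
	}
	return nil
}

// runDemo registers three hypothetical components, the second of which fails,
// and returns the names of the components that actually started.
func runDemo() ([]string, error) {
	var started []string
	s := &server{}
	s.addStartFunc(func(stop <-chan struct{}) error {
		started = append(started, "xds")
		return nil
	})
	s.addStartFunc(func(stop <-chan struct{}) error {
		return errors.New("grpc: listen failed")
	})
	s.addStartFunc(func(stop <-chan struct{}) error {
		started = append(started, "http")
		return nil
	})
	err := s.Start(make(chan struct{}))
	return started, err
}

func main() {
	started, err := runDemo()
	fmt.Println(started, err)
}
```

Because the functions run in registration order and the loop returns on the first error, a component registered after a failing one never starts.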

Iterating over these functions starts the run functions of the Service Controller and the Config Controller and calls the xdsServer's Start method, which launches three goroutines:

func (s *DiscoveryServer) Start(stopCh <-chan struct{}) {
	adsLog.Infof("Starting ADS server")
	go s.handleUpdates(stopCh)
	go s.periodicRefreshMetrics(stopCh)
	go s.sendPushes(stopCh)
}

The important ones are handleUpdates and sendPushes.

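handleUpdates mainly processes the push requests received on pushChannel and eventually calls startPush to place the data on the DiscoveryServer's pushQueue. sendPushes takes the data off pushQueue, wraps it in an XdsEvent, and sends it to the XdsConnection's pushChannel for asynchronous processing.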

handleUpdates

func (s *DiscoveryServer) handleUpdates(stopCh <-chan struct{}) {
	debounce(s.pushChannel, stopCh, s.Push)
}

func debounce(ch chan *model.PushRequest, stopCh <-chan struct{}, pushFn func(req *model.PushRequest)) {
	var timeChan <-chan time.Time
	var startDebounce time.Time
	var lastConfigUpdateTime time.Time

	pushCounter := 0
	debouncedEvents := 0
 
	var req *model.PushRequest

	free := true
	freeCh := make(chan struct{}, 1)

	push := func(req *model.PushRequest) {
		pushFn(req)
		freeCh <- struct{}{}
	}

	pushWorker := func() {
		eventDelay := time.Since(startDebounce)
		quietTime := time.Since(lastConfigUpdateTime) 
		// debounceMax is 10s and debounceAfter is 100ms
		// Push if the delay has reached the maximum delay, or the quiet time has reached the minimum quiet time
		if eventDelay >= debounceMax || quietTime >= debounceAfter {
			if req != nil {
				pushCounter++
				adsLog.Infof("Push debounce stable[%d] %d: %v since last change, %v since last push, full=%v",
					pushCounter, debouncedEvents,
					quietTime, eventDelay, req.Full)

				free = false
				go push(req)
				req = nil
				debouncedEvents = 0
			}
		} else {
			timeChan = time.After(debounceAfter - quietTime)
		}
	}

	for {
		select {
		case <-freeCh:
			free = true
			pushWorker()
		case r := <-ch:
			// If reason is not set, record it as an unknown reason
			if len(r.Reason) == 0 {
				r.Reason = []model.TriggerReason{model.UnknownTrigger}
			}
			if !enableEDSDebounce && !r.Full {
				// trigger push now, just for EDS
				go pushFn(r)
				continue
			}

			lastConfigUpdateTime = time.Now()
			// On the first event, arm timeChan to wait for the minimum quiet time (100 ms)
			if debouncedEvents == 0 {
				timeChan = time.After(debounceAfter)
				startDebounce = lastConfigUpdateTime
			}
			debouncedEvents++
			// Merge the requests
			req = req.Merge(r)
		case <-timeChan:
			if free {
				pushWorker()
			}
		case <-stopCh:
			return
		}
	}
}

handleUpdates simply calls debounce, passing in pushChannel and the DiscoveryServer's Push function.

The logic inside debounce is quite interesting, so let us walk through how it executes:

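On entry, neither pushWorker nor push is called immediately. Execution goes straight into a for loop containing a select statement, which blocks until data arrives on ch and case r := <-ch runs;
The first time the case r := <-ch block runs, debouncedEvents equals 0, so time.After(debounceAfter) is called. This call does not block: it immediately returns a channel that fires after debounceAfter (100 ms), and that channel is assigned to timeChan. The request is then merged into req;
Once that timer fires, a later iteration enters case <-timeChan and runs pushWorker, which checks whether the delay since the first event has reached debounceMax (10 s) or the quiet time since the last event has reached debounceAfter (100 ms). If so, it starts push in a new goroutine; push calls pushFn and, when that returns, sends an empty struct to freeCh. Otherwise timeChan is re-armed for the remaining quiet time;
A subsequent iteration then enters case <-freeCh:, which sets free back to true and runs pushWorker again to handle any requests that were merged in the meantime.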
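
The steps above can be reproduced in a small standalone program. This is a simplified sketch rather than Istio's code: events are plain ints instead of *model.PushRequest, merging is an append to a slice, and the quiet and maxDelay parameters stand in for debounceAfter and debounceMax. The runDemo helper and its 50 ms window are hypothetical values chosen so that the example finishes quickly.

```go
package main

import (
	"fmt"
	"time"
)

// debounce merges events arriving on ch and calls push once no new event has
// arrived for quiet, or once maxDelay has passed since the first pending event.
// Only one push runs at a time; events that arrive while a push is running
// are merged and delivered by the next push.
func debounce(ch <-chan int, stop <-chan struct{}, quiet, maxDelay time.Duration, push func(merged []int)) {
	var timer <-chan time.Time // nil until the first event, so its case never fires
	var start, last time.Time
	var pending []int
	free := true
	freeCh := make(chan struct{}, 1)

	tryPush := func() {
		if len(pending) == 0 {
			return
		}
		sinceStart := time.Since(start)
		sinceLast := time.Since(last)
		if sinceStart >= maxDelay || sinceLast >= quiet {
			batch := pending
			pending = nil
			free = false
			go func() {
				push(batch)
				freeCh <- struct{}{} // signal that the push has finished
			}()
			return
		}
		// Not quiet for long enough yet: wait for the remainder.
		timer = time.After(quiet - sinceLast)
	}

	for {
		select {
		case <-freeCh:
			free = true
			tryPush()
		case ev := <-ch:
			last = time.Now()
			if len(pending) == 0 {
				start = last
				timer = time.After(quiet) // returns immediately, fires later
			}
			pending = append(pending, ev)
		case <-timer:
			if free {
				tryPush()
			}
		case <-stop:
			return
		}
	}
}

// runDemo sends three events back to back and returns the first pushed batch.
func runDemo() []int {
	ch := make(chan int)
	stop := make(chan struct{})
	pushed := make(chan []int, 1)
	go debounce(ch, stop, 50*time.Millisecond, time.Second, func(merged []int) {
		pushed <- merged
	})
	for i := 1; i <= 3; i++ {
		ch <- i
	}
	batch := <-pushed
	close(stop)
	return batch
}

func main() {
	fmt.Println(runDemo())
}
```

In this sketch the three events arrive well inside the quiet window, so they are delivered as one merged batch rather than as three separate pushes, which is the effect the debounce logic is after.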

The push call then continues down the call chain until the data is placed on the DiscoveryServer's pushQueue:

sendPushes

func (s *DiscoveryServer) sendPushes(stopCh <-chan struct{}) {
	doSendPushes(stopCh, s.concurrentPushLimit, s.pushQueue)
}

sendPushes calls doSendPushes, passing in the PushQueue and concurrentPushLimit. The latter is controlled by the PILOT_PUSH_THROTTLE environment variable and defaults to 100.

func doSendPushes(stopCh <-chan struct{}, semaphore chan struct{}, queue *PushQueue) {
	for {
		select {
		case <-stopCh:
			return
		default: 
			// The semaphore's capacity (100 by default) throttles the number of concurrent pushes
			semaphore <- struct{}{}

			// Get the next proxy to push. This will block if there are no updates required.
			client, info := queue.Dequeue()
			recordPushTriggers(info.Reason...)
			// Signals that a push is done by reading from the semaphore, allowing another send on it.
			doneFunc := func() {
				queue.MarkDone(client)
				<-semaphore
			}

			proxiesQueueTime.Record(time.Since(info.Start).Seconds())

			go func() {
				edsUpdates := info.EdsUpdates
				if info.Full {
					// Setting this to nil will trigger a full push
					edsUpdates = nil
				}

				select {
				case client.pushChannel <- &XdsEvent{
					push:               info.Push,
					edsUpdatedServices: edsUpdates,
					done:               doneFunc,
					start:              info.Start,
					namespacesUpdated:  info.NamespacesUpdated,
					configTypesUpdated: info.ConfigTypesUpdated,
					noncePrefix:        info.Push.Version,
				}:
					return
				case <-client.stream.Context().Done(): // grpc stream was closed
					doneFunc()
					adsLog.Infof("Client closed connection %v", client.ConID)
				}
			}()
		}
	}
}

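doSendPushes runs an infinite loop whose main logic lives in the default branch. The semaphore parameter limits concurrency: once it is full, the send blocks until an in-flight push finishes. A goroutine is then started that builds an XdsEvent and sends it to pushChannel.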
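
The throttling technique here is a buffered channel used as a counting semaphore. The sketch below isolates that idea under stated assumptions: pushAll, runDemo, the proxy names, and the limit of 3 are all hypothetical, and the peak counter exists only to make the limit observable.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// pushAll runs one task per item but never lets more than limit tasks run at
// the same time. It returns the highest number of tasks seen running at once.
func pushAll(items []string, limit int, task func(string)) int32 {
	semaphore := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var running, peak int32

	for _, item := range items {
		semaphore <- struct{}{} // blocks while limit tasks are in flight
		wg.Add(1)
		go func(item string) {
			defer wg.Done()
			defer func() { <-semaphore }() // release the slot, as doneFunc does

			n := atomic.AddInt32(&running, 1)
			for {
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			task(item)
			atomic.AddInt32(&running, -1)
		}(item)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

// runDemo pushes to ten hypothetical proxies with at most three in flight.
// It returns the observed peak concurrency and the number of completed tasks.
func runDemo() (int32, int32) {
	var done int32
	items := make([]string, 10)
	for i := range items {
		items[i] = fmt.Sprintf("proxy-%d", i)
	}
	peak := pushAll(items, 3, func(_ string) {
		time.Sleep(5 * time.Millisecond)
		atomic.AddInt32(&done, 1)
	})
	return peak, atomic.LoadInt32(&done)
}

func main() {
	peak, done := runDemo()
	fmt.Println("peak:", peak, "done:", done)
}
```

Acquiring the slot before starting the goroutine, as doSendPushes does, means the loop itself stalls when the limit is reached instead of spawning goroutines that then wait.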

Overall, the flow is:

Dequeue an xdsConnection from pushQueue;
Build an XdsEvent and enqueue it on that xdsConnection's pushChannel.

Messages placed on pushChannel are handled in StreamAggregatedResources:

func (s *DiscoveryServer) StreamAggregatedResources(stream ads.AggregatedDiscoveryService_StreamAggregatedResourcesServer) error {
	...
	con := newXdsConnection(peerAddr, stream)
 
	var receiveError error
	reqChannel := make(chan *xdsapi.DiscoveryRequest, 1)
	// Receive DiscoveryRequests from Envoy on the XdsConnection
	go receiveThread(con, reqChannel, &receiveError)

	for { 
		select {
		// Handling of reqChannel
		case discReq, ok := <-reqChannel:
			...
		// Handling of pushChannel
		case pushEv := <-con.pushChannel: 

			err := s.pushConnection(con, pushEv)
			pushEv.done()
			if err != nil {
				return nil
			}
		}
	}
}

Broadly, this splits into two parts: handling data from reqChannel, which is covered later under Client Request, and handling data from pushChannel.

Once data is received from pushChannel, pushConnection is called to process it.

func (s *DiscoveryServer) pushConnection(con *XdsConnection, pushEv *XdsEvent) error { 
	// Handle the incremental EDS push case
	if pushEv.edsUpdatedServices != nil {
		if !ProxyNeedsPush(con.node, pushEv) {
			adsLog.Debugf("Skipping EDS push to %v, no updates required", con.ConID)
			return nil
		} 
		if len(con.Clusters) > 0 {
			if err := s.pushEds(pushEv.push, con, versionInfo(), pushEv.edsUpdatedServices); err != nil {
				return err
			}
		}
		return nil
	}
	...
	currentVersion := versionInfo()
	pushTypes := PushTypeFor(con.node, pushEv)
	// Decide what to push based on the push types
	if con.CDSWatch && pushTypes[CDS] {
		err := s.pushCds(con, pushEv.push, currentVersion)
		if err != nil {
			return err
		}
	}

	if len(con.Clusters) > 0 && pushTypes[EDS] {
		err := s.pushEds(pushEv.push, con, currentVersion, nil)
		if err != nil {
			return err
		}
	}
	if con.LDSWatch && pushTypes[LDS] {
		err := s.pushLds(con, pushEv.push, currentVersion)
		if err != nil {
			return err
		}
	}
	if len(con.Routes) > 0 && pushTypes[RDS] {
		err := s.pushRoute(con, pushEv.push, currentVersion)
		if err != nil {
			return err
		}
	}
	proxiesConvergeDelay.Record(time.Since(pushEv.start).Seconds())
	return nil
}

Here the kinds of configuration to push are decided based on pushEv. Taking EDS as an example, here is what pushEds does:

func (s *DiscoveryServer) pushEds(push *model.PushContext, con *XdsConnection, version string, edsUpdatedServices map[string]struct{}) error {
	pushStart := time.Now()
	loadAssignments := make([]*xdsapi.ClusterLoadAssignment, 0)
	endpoints := 0
	empty := 0

	for _, clusterName := range con.Clusters {
		// Generate the EDS endpoints for this cluster
		l := s.generateEndpoints(clusterName, con.node, push, edsUpdatedServices)
		if l == nil {
			continue
		}

		for _, e := range l.Endpoints {
			endpoints += len(e.LbEndpoints)
		}

		if len(l.Endpoints) == 0 {
			empty++
		}
		loadAssignments = append(loadAssignments, l)
	}
	// Build the DiscoveryResponse
	response := endpointDiscoveryResponse(loadAssignments, version, push.Version)
	// Send the response
	err := con.send(response)
	edsPushTime.Record(time.Since(pushStart).Seconds())
	...
	return nil
}

pushEds mainly builds a DiscoveryResponse and then calls send to deliver it.

Client Request

This part of the code is much the same as the above, except that the data is read from reqChannel.

// Receive DiscoveryRequests from Envoy on the XdsConnection
go receiveThread(con, reqChannel, &receiveError)

for { 
	select {
	case discReq, ok := <-reqChannel:
		if !ok {
			// Remote side closed connection.
			return receiveError
		}
		// This should be only set for the first request. Guard with ID check regardless.
		if discReq.Node != nil && discReq.Node.Id != "" {
			if cancel, err := s.initConnection(discReq.Node, con); err != nil {
				return err
			} else if cancel != nil {
				defer cancel()
			}
		}

		switch discReq.TypeUrl {
		case ClusterType:
			...
			err := s.pushCds(con, s.globalPushContext(), versionInfo())
			if err != nil {
				return err
			}

		case ListenerType:
			...
			err := s.pushLds(con, s.globalPushContext(), versionInfo())
			if err != nil {
				return err
			}

		case RouteType:
			...
			con.Routes = routes
			adsLog.Debugf("ADS:RDS: REQ %s %s routes:%d", peerAddr, con.ConID, len(con.Routes))
			err := s.pushRoute(con, s.globalPushContext(), versionInfo())
			if err != nil {
				return err
			}

		case EndpointType:
			...
			err := s.pushEds(s.globalPushContext(), con, versionInfo(), nil)
			if err != nil {
				return err
			}

		default:
			adsLog.Warnf("ADS: Unknown watched resources %s", discReq.String())
		}

	case pushEv := <-con.pushChannel:
		...
	}
}

This part starts a goroutine that loops receiving gRPC requests and places them on reqChannel; the for loop then consumes the data from that channel.
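
The receive-goroutine-plus-select structure can be shown without gRPC. The following sketch makes several assumptions: request, receiver, fakeStream, serve, and runDemo are hypothetical stand-ins, io.EOF is used here to model a normal stream close, and the push channel is left empty so that the result is deterministic.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// request is a hypothetical stand-in for a DiscoveryRequest.
type request struct{ TypeURL string }

// receiver is a hypothetical stand-in for the stream's Recv method.
type receiver interface {
	Recv() (*request, error)
}

// receiveLoop forwards each request to reqCh. On the first error it records
// any error other than io.EOF and closes reqCh so the consumer sees the shutdown.
func receiveLoop(r receiver, reqCh chan<- *request, errP *error) {
	defer close(reqCh)
	for {
		req, err := r.Recv()
		if err != nil {
			if !errors.Is(err, io.EOF) {
				*errP = err
			}
			return
		}
		reqCh <- req
	}
}

// serve handles client requests and server-initiated pushes in a single loop.
func serve(r receiver, pushCh <-chan string, handle func(string)) error {
	var recvErr error
	reqCh := make(chan *request, 1)
	go receiveLoop(r, reqCh, &recvErr)
	for {
		select {
		case req, ok := <-reqCh:
			if !ok {
				// The channel was closed after recvErr was written, so reading it is safe.
				return recvErr
			}
			handle("request:" + req.TypeURL)
		case ev := <-pushCh:
			handle("push:" + ev)
		}
	}
}

// fakeStream replays a fixed list of type URLs and then reports io.EOF.
type fakeStream struct {
	reqs []string
	next int
}

func (f *fakeStream) Recv() (*request, error) {
	if f.next >= len(f.reqs) {
		return nil, io.EOF
	}
	r := &request{TypeURL: f.reqs[f.next]}
	f.next++
	return r, nil
}

// runDemo serves two requests with no pushes pending.
func runDemo() ([]string, error) {
	var handled []string
	stream := &fakeStream{reqs: []string{"cds", "eds"}}
	err := serve(stream, make(chan string), func(s string) {
		handled = append(handled, s)
	})
	return handled, err
}

func main() {
	handled, err := runDemo()
	fmt.Println(handled, err)
}
```

Keeping the blocking Recv call in its own goroutine lets the main loop react to pushes while no request is pending, and closing the channel gives the loop a single place to detect that the client has gone away.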

Summary

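This more or less wraps up the source code analysis of Pilot. To recap the previous two articles: the first covered how the service controller watches for update events on resources such as Service, EndPoint, nodes, and pods; the second covered how the config controller watches for changes to Istio configuration such as Gateway, DestinationRule, and VirtualService. This article explained how the server side of the xDS protocol works: it receives messages from the service controller and the config controller, derives the various resource changes from them, and then notifies Envoy of the configuration changes over the gRPC connection.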