gRPC Service Registration, Discovery, and Load Balancing: Implementation Approach and Source Code Analysis

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Today let's look at how service discovery and load balancing work in gRPC. Unlike server-side load balancers such as "},{"type":"codeinline","content":[{"type":"text","text":"Nginx"}]},{"type":"text","text":", "},{"type":"codeinline","content":[{"type":"text","text":"Lvs"}]},{"type":"text","text":", or "},{"type":"codeinline","content":[{"type":"text","text":"F5"}]},{"type":"text","text":", gRPC does its load balancing on the client side. What does that mean? In a system with server-side load balancing, the client first calls the load balancer's domain name or IP, and the load balancer then forwards the request to a specific backend node according to its policy. With client-side load balancing, the client picks a node from the list of available backend nodes according to its own policy and connects to that backend server directly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"codeinline","content":[{"type":"text","text":"Etcd"}]},{"type":"text","text":"'s "},{"type":"codeinline","content":[{"type":"text","text":"naming"}]},{"type":"text","text":" package provides a naming resolver. Combined with "},{"type":"codeinline","content":[{"type":"text","text":"gRPC"}]},{"type":"text","text":"'s built-in "},{"type":"codeinline","content":[{"type":"text","text":"RoundRobin"}]},{"type":"text","text":" load balancer, it lets users set up service registration, discovery, and load balancing with little effort. If round-robin scheduling does not meet your needs, or you do not want to use "},{"type":"codeinline","content":[{"type":"text","text":"Etcd"}]},{"type":"text","text":" as the service registry and naming resolver, you can write code that implements the "},{"type":"codeinline","content":[{"type":"text","text":"gRPC"}]},{"type":"text","text":"-defined "},{"type":"codeinline","content":[{"type":"text","text":"Resolver"}]},{"type":"text","text":" and "},{"type":"codeinline","content":[{"type":"text","text":"Balancer"}]},{"type":"text","text":" interfaces to meet your system's custom requirements."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The source code quoted in this article comes from gRPC v1.2.x and Etcd 
v3.3"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you are not yet familiar with gRPC and Etcd, you may want to start with my earlier "},{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&albumid=1358237826197962753&_biz=MzUzNTY5MzU2MA==#wechat_redirect","title":""},"content":[{"type":"text","text":"Introduction to gRPC"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&albumid=1574539663539781634&_biz=MzUzNTY5MzU2MA==#wechat_redirect","title":""},"content":[{"type":"text","text":"Introduction to Etcd"}]},{"type":"text","text":" article series."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"gRPC Service Registration and Discovery"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, a brief explanation of how service registration and discovery work with "},{"type":"codeinline","content":[{"type":"text","text":"Etcd"}]},{"type":"text","text":". The flow can be summarized with the diagram below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8c/8cd31f8b1ed5d04561522e4ec4e59f27.jpeg","alt":null,"title":"Service discovery for gRPC using Etcd","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Service A in the diagram has two nodes. Once the service starts on a node, it uses a unique identifier made up of the service name and the node IP as the key (for example /service/a/114.128.45.117) and the node's IP and port as the value, and stores the pair in "},{"type":"codeinline","content":[{"t
ype":"text","text":"Etcd"}]},{"type":"text","text":". These keys all carry a lease, which the service itself has to renew periodically. If a service node goes down, for example the service on node2, and can no longer renew its lease, the corresponding key /service/a/114.128.45.117 expires, and clients can no longer obtain that node's information from Etcd."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Meanwhile, the client uses "},{"type":"codeinline","content":[{"type":"text","text":"Etcd"}]},{"type":"text","text":"'s "},{"type":"codeinline","content":[{"type":"text","text":"Watch"}]},{"type":"text","text":" feature to watch every key prefixed with "},{"type":"codeinline","content":[{"type":"text","text":"/service/a"}]},{"type":"text","text":" for changes. Whenever a node key is added or deleted, "},{"type":"codeinline","content":[{"type":"text","text":"Etcd"}]},{"type":"text","text":" sends the event to the client through a "},{"type":"codeinline","content":[{"type":"text","text":"WatchChan"}]},{"type":"text","text":". In code, a "},{"type":"codeinline","content":[{"type":"text","text":"WatchChan"}]},{"type":"text","text":" is simply a "},{"type":"codeinline","content":[{"type":"text","text":"Go"}]},{"type":"text","text":" "},{"type":"codeinline","content":[{"type":"text","text":"Channel"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Service Registration"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As for service registration with "},{"type":"codeinline","content":[{"type":"text","text":"Etcd"}]},{"type":"text","text":", the official package does not offer a single registration function to call. So after adding a service node, how do we store its information in "},{"type":"codeinline","content":[{"type":"text","text":"Etcd"}]},{"type":"text","text":" and notify the naming resolver? In naming/grpc.go of the Etcd source package you will find an "},{"type":"codeinline","content":[{"type":"text","text":"Update"}]},{"t
ype":"text","text":" method. This "},{"type":"codeinline","content":[{"type":"text","text":"Update"}]},{"type":"text","text":" can perform both add and delete operations:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"go"},"content":[{"type":"text","text":"func (gr *GRPCResolver) Update(ctx context.Context, target string, nm naming.Update, opts ...etcd.OpOption) (err error) {\n\tswitch nm.Op {\n\tcase naming.Add:\n\t\tvar v []byte\n\t\tif v, err = json.Marshal(nm); err != nil {\n\t\t\treturn status.Error(codes.InvalidArgument, err.Error())\n\t\t}\n\t\t_, err = gr.Client.KV.Put(ctx, target+\"/\"+nm.Addr, string(v), opts...)\n\tcase naming.Delete:\n\t\t_, err = gr.Client.Delete(ctx, target+\"/\"+nm.Addr, opts...)\n\tdefault:\n\t\treturn status.Error(codes.InvalidArgument, \"naming: bad naming op\")\n\t}\n\treturn err\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once a service has started, it can use the "},{"type":"codeinline","content":[{"type":"text","text":"Update"}]},{"type":"text","text":" method to "},{"type":"codeinline","content":[{"type":"text","text":"Put"}]},{"type":"text","text":" its address and port under a key prefixed with the custom target. For the example in the diagram above, the target variable would be the service name we defined, /service/a. In practice, teams usually wrap the "},{"type":"codeinline","content":[{"type":"text","text":"Update"}]},{"type":"text","text":" method to suit their own system, handling both service registration and the periodic lease renewal of the node key in Etcd. Every company does this differently, so I won't show concrete code here. Renewal is generally done through the "},{"type":"codeinline","content":[{"type":"text","text":"Etcd"}]},{"type":"text","text":" lease's "},{"type":"codeinline","content":[{"type":"text","text":"KeepAlive"}]},{"type":"text","text":" method (Lease.KeepAlive)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Service Discovery"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","a
ttrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After a new node is registered, or an existing node stops, how does the client find out? This is where the naming resolver, Resolver, comes in. You can think of a Resolver as a mapping from a string to a set of IPs, ports, and related information."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"gRPC defines the Resolver interface as follows:"}]},{"type":"codeblock","attrs":{"lang":"go"},"content":[{"type":"text","text":"type Resolver interface {\n\t// Resolve creates a Watcher for target.\n\tResolve(target string) (Watcher, error)\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The naming resolver's Resolve method returns a Watcher. The Watcher watches for changes to the backend server addresses behind the target handed to the resolver (similar to the key that corresponds to the service name in the example above) and notifies the Balancer, which then dynamically adds or removes entries in the address list it maintains."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Watcher interface is defined as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"go"},"content":[{"type":"text","text":"// Source: https://github.com/grpc/grpc-go/blob/v1.2.x/naming/naming.go\ntype Watcher interface {\n\tNext() ([]*Update, error)\n\t// Close closes the Watcher.\n\tClose()\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Etcd provides implementations of both interfaces:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"go"},"content":[{"type":"text","text":"// 
Source: https://github.com/etcd-io/etcd/blob/release-3.3/clientv3/naming/grpc.go\n\n// GRPCResolver implements grpc's naming.Resolver interface\ntype GRPCResolver struct {\n\t// Client is an initialized etcd client.\n\tClient *etcd.Client\n}\n\nfunc (gr *GRPCResolver) Resolve(target string) (naming.Watcher, error) {\n\tctx, cancel := context.WithCancel(context.Background())\n\tw := &gRPCWatcher{c: gr.Client, target: target + \"/\", ctx: ctx, cancel: cancel}\n\treturn w, nil\n}\n\n// gRPCWatcher implements grpc's naming.Watcher interface\ntype gRPCWatcher struct {\n\tc *etcd.Client\n\ttarget string\n\tctx context.Context\n\tcancel context.CancelFunc\n\twch etcd.WatchChan\n\terr error\n}\n\nfunc (gw *gRPCWatcher) Next() ([]*naming.Update, error) {\n\tif gw.wch == nil {\n\t\t// first Next() returns all addresses\n\t\treturn gw.firstNext()\n\t}\n\n\t// process new events on target/*\n\twr, ok := <-gw.wch\n\t// ...\n\n// ... (tail of roundRobin's Up method)\n    // When more than 1 address is available, the client does not block.\n    if cnt == 1 && rr.waitCh != nil {\n        close(rr.waitCh)\n        rr.waitCh = nil\n    }\n    // Return a function that marks this address as down\n    return func(err error) {\n        rr.down(addr, err)\n    }\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Closing a Connection"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Closing a connection is handled by the down method, which is simple: find the addr and mark it as not connected."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"go"},"content":[{"type":"text","text":"func (rr *roundRobin) down(addr Address, err error) {\n    rr.mu.Lock()\n    defer rr.mu.Unlock()\n    for _, a := range rr.addrs {\n        if addr == a.addr {\n            a.connected = false\n            break\n        }\n    
}\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Obtaining a Connection on the Client"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When the client calls a "},{"type":"codeinline","content":[{"type":"text","text":"gRPC"}]},{"type":"text","text":" "},{"type":"codeinline","content":[{"type":"text","text":"Method"}]},{"type":"text","text":", the underlying "},{"type":"codeinline","content":[{"type":"text","text":"Invoke"}]},{"type":"text","text":" method asks "},{"type":"codeinline","content":[{"type":"text","text":"RoundRobin"}]},{"type":"text","text":" for a connection from its addrs pool. For a fail-fast call, if addrs is empty, "},{"type":"codeinline","content":[{"type":"text","text":"Get()"}]},{"type":"text","text":" returns an error; if addrs is non-empty but no address is connected, the code below still hands back the next address in the list. With "},{"type":"codeinline","content":[{"type":"text","text":"failfast = false"}]},{"type":"text","text":", however, "},{"type":"codeinline","content":[{"type":"text","text":"Get()"}]},{"type":"text","text":" blocks on the "},{"type":"codeinline","content":[{"type":"text","text":"waitCh"}]},{"type":"text","text":" channel until the "},{"type":"codeinline","content":[{"type":"text","text":"Up"}]},{"type":"text","text":" method signals it, and then schedules an available address round-robin."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"go"},"content":[{"type":"text","text":"func (rr *roundRobin) Get(ctx context.Context, opts BalancerGetOptions) (addr Address, put func(), err error) {\n    var ch chan struct{}\n    rr.mu.Lock()\n    if rr.done {\n        rr.mu.Unlock()\n        err = ErrClientConnClosing\n        return\n    }\n\n    if len(rr.addrs) > 0 {\n        // The length of addrs may have changed; if next is out of range, reset it to 0 and start over.\n        if rr.next >= len(rr.addrs) {\n            rr.next = 0\n        }\n        next := rr.next\n        // Walk the whole addrs slice until a connected address is found\n        for {\n            a := rr.addrs[next]\n            // Advance next, wrapping around to 0 once it reaches len(addrs)\n            next = (next + 1) % len(rr.addrs)\n            if a.connected {\n                addr = 
a.addr\n                rr.next = next\n                rr.mu.Unlock()\n                return\n            }\n            if next == rr.next {\n                // Went all the way around without finding one; fall through to the logic below\n                break\n            }\n        }\n    }\n    if !opts.BlockingWait { // Non-blocking mode: return an error if there are no addresses\n        if len(rr.addrs) == 0 {\n            rr.mu.Unlock()\n            err = status.Errorf(codes.Unavailable, \"there is no address available\")\n            return\n        }\n        // Returns the next addr on rr.addrs for failfast RPCs.\n        addr = rr.addrs[rr.next].addr\n        rr.next++\n        rr.mu.Unlock()\n        return\n    }\n    // Wait on rr.waitCh for non-failfast RPCs.\n    // In blocking mode, block on waitCh until the Up method sends a notification\n    if rr.waitCh == nil {\n        ch = make(chan struct{})\n        rr.waitCh = ch\n    } else {\n        ch = rr.waitCh\n    }\n    rr.mu.Unlock()\n    for {\n        select {\n        case <-ctx.Done():\n            err = ctx.Err()\n            return\n        case <-ch:\n            rr.mu.Lock()\n            if rr.done {\n                rr.mu.Unlock()\n                err = ErrClientConnClosing\n                return\n            }\n\n            if len(rr.addrs) > 0 {\n                if rr.next >= len(rr.addrs) {\n                    rr.next = 0\n                }\n                next := rr.next\n                for {\n                    a := rr.addrs[next]\n                    next = (next + 1) % len(rr.addrs)\n                    if a.connected {\n                        addr = a.addr\n                        rr.next = next\n                        rr.mu.Unlock()\n                        return\n                    }\n                    if next == rr.next {\n                        // Went all the way around without finding one; the address that just came up may have gone down again, so wait again.\n                        break\n                    }\n                }\n            }\n            // The newly added addr got removed by Down() again.\n            if rr.waitCh == nil {\n                ch = make(chan struct{})\n                rr.waitCh = ch\n            } else {\n                ch = rr.waitCh\n            }\n            rr.mu.Unlock()\n        }\n    }\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That covers how "},{"type":"codeinline","content":[{"type":"text","text":"gRPC"}]},{"type":"text","text":" builds on "},{"type":"codeinline","content":[{"type":"text","text":"Etcd"}]},{"type":"text","text":" for service registration, discovery, and load balancing, along with the key pieces of source code. The actual implementation is far more detailed than what I have listed here; the aim of this article is to record some of the key paths I came across while learning and using gRPC load balancing and name resolution. Also note that this article uses gRPC 
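v1.2.x code. After version 1.3 the official package reorganized its directories and package names, so the source code and the way Balancer is used will differ somewhat from what is shown here. The principles remain largely the same, though, with each release building on this foundation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you have read this far and like my articles, please give this one a like. Every week I share what I have learned and seen, along with first-hand practical experience, through technical articles. Thank you for your support. Search for the WeChat official account 「网管叨bi叨」 to get my articles as soon as they are published."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}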