SkyDNS2 Source Code Analysis

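SkyDNS2 is the collective name for the SkyDNS 2.x releases. Its only official documentation is README.md, and little other material can be found online, so understanding it well means analysing the code ourselves. That is the purpose of this article: to walk through the SkyDNS code and explain its internal architecture and how it works.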

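Notes

SkyDNS Architecture

For background such as "What is SkyDNS?", please refer to the official site.

Below is the SkyDNS architecture as I understand it after reading the code: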

[Figure: SkyDNS architecture diagram]

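How SkyDNS Works

The SkyDNS server depends on a backend key-value store. It currently supports etcd or etcd3 as the backend (the blue part of the architecture diagram), which manages both configuration and data for SkyDNS.

The etcd cluster is configured through the ETCD_MACHINES environment variable. If the backend is etcd3, /v2/keys/skydns/config/etcd3 in etcd must also be set to true. SkyDNS contains an etcd client module that handles communication with the machines listed in ETCD_MACHINES.

The main etcd key paths used by SkyDNS are: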

/v2/keys/skydns/config
/v2/keys/skydns/local/skydns/east/production/rails
/v2/keys/skydns/local/skydns/dns/stub
/v2/keys/skydns/local/skydns/...
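
The key paths listed above suggest that a DNS name is stored under a key built from its labels in reverse order. The following is a minimal sketch of that mapping, assuming a fixed "/skydns" prefix; keyPath is a hypothetical helper written for illustration and is not copied from the SkyDNS source.

```go
package main

import (
	"fmt"
	"strings"
)

// keyPath converts a DNS name into an etcd key path by reversing its labels
// and prefixing them with "/skydns".
// Hypothetical helper for illustration; the prefix is an assumption.
func keyPath(name string) string {
	labels := strings.Split(strings.Trim(name, "."), ".")
	// Reverse the labels in place: "a.b.c" becomes ["c", "b", "a"].
	for i, j := 0, len(labels)-1; i < j; i, j = i+1, j-1 {
		labels[i], labels[j] = labels[j], labels[i]
	}
	return "/skydns/" + strings.Join(labels, "/")
}

func main() {
	fmt.Println(keyPath("rails.production.east.skydns.local."))
	// prints: /skydns/local/skydns/east/production/rails
}
```

With the /v2/keys prefix of the etcd v2 HTTP API added in front, the result matches the second path in the list.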

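Prometheus monitoring (the brown part of the architecture diagram) is configured through the environment variables below. To disable Prometheus monitoring, set the environment variable PROMETHEUS_PORT to 0.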

Port      = os.Getenv("PROMETHEUS_PORT")
Path      = envOrDefault("PROMETHEUS_PATH", "/metrics")
Namespace = envOrDefault("PROMETHEUS_NAMESPACE", "skydns")
Subsystem = envOrDefault("PROMETHEUS_SUBSYSTEM", "skydns")
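
The envOrDefault helper used in this snippet is not shown in the excerpt. Below is a minimal sketch of how such a helper and the "port 0 disables monitoring" rule could look. The function metricsEnabled is hypothetical, and treating an empty port the same as "0" is my assumption.

```go
package main

import (
	"fmt"
	"os"
)

// envOrDefault returns the value of the environment variable key,
// or def when the variable is unset or empty.
func envOrDefault(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// metricsEnabled reports whether a metrics listener should be started.
// Assumption: an empty port is treated like "0", i.e. monitoring is disabled.
func metricsEnabled(port string) bool {
	return port != "" && port != "0"
}

func main() {
	port := os.Getenv("PROMETHEUS_PORT")
	path := envOrDefault("PROMETHEUS_PATH", "/metrics")
	if !metricsEnabled(port) {
		fmt.Println("prometheus metrics disabled")
		return
	}
	fmt.Printf("would serve metrics on :%s%s\n", port, path)
}
```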

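If /v2/keys/skydns/config/nameservers has a value, domains that SkyDNS cannot resolve itself are forwarded to the nameservers given by those IP:Port pairs, which then resolve them (the green part of the architecture diagram).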
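
To make the forwarding idea concrete, here is a minimal sketch that tries each configured nameserver in turn and returns the first successful answer. The try-in-order failover policy is an assumption, not something confirmed by the excerpt, and exchangeFunc and fakeExchange are hypothetical stand-ins for a real DNS client call.

```go
package main

import (
	"errors"
	"fmt"
)

// exchangeFunc sends a query to one nameserver and returns its answer.
// Hypothetical stand-in for a real DNS client call.
type exchangeFunc func(server, name string) (string, error)

// forward tries each nameserver in order and returns the first successful answer.
func forward(nameservers []string, name string, exchange exchangeFunc) (string, error) {
	if len(nameservers) == 0 {
		return "", errors.New("no nameservers configured")
	}
	var lastErr error
	for _, ns := range nameservers {
		answer, err := exchange(ns, name)
		if err == nil {
			return answer, nil
		}
		lastErr = err // remember the failure and try the next server
	}
	return "", fmt.Errorf("all nameservers failed: %w", lastErr)
}

// fakeExchange simulates an upstream where only 8.8.4.4:53 responds.
func fakeExchange(server, name string) (string, error) {
	if server == "8.8.4.4:53" {
		return name + " answered by " + server, nil
	}
	return "", errors.New("timeout talking to " + server)
}

func main() {
	answer, err := forward([]string{"8.8.8.8:53", "8.8.4.4:53"}, "example.org.", fakeExchange)
	fmt.Println(answer, err)
}
```

Passing the exchange function as a parameter keeps the sketch runnable without network access; a real server would issue the query through its DNS client library instead.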

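Once the parameters have been configured according to the official documentation at https://github.com/skynetservices/skydns/blob/master/README.md, SkyDNS can be started.

The SkyDNS server starts up as follows:

  • Create the etcd client object;
  • Validate the dns_addr and nameservers parameters;
  • Load the startup parameters into etcd, overwriting the existing data in /v2/keys/skydns/config;
  • Set default values for the SkyDNS server parameters and create the SkyDNS server object;
  • Load the …/dns/stub//xx data from etcd as the server's stub zones, and start a watcher on …/dns/stub/; whenever the data changes, it is reloaded into the server's stub zones;
  • Register the SkyDNS metrics with Prometheus;
  • Finally, start TCP and UDP listeners on the interface and port configured in /v2/keys/skydns/config/dns_addr and block; from this point on the server provides DNS service.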
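
The validity check in the second step can be sketched as follows. This is an illustration under my own assumptions rather than the SkyDNS implementation: both function names are hypothetical, and I assume that dns_addr only needs a well-formed host:port while each nameserver must additionally use an IP literal as its host.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// validateHostPort checks that addr has the form host:port
// with a numeric port in the range 0..65535.
func validateHostPort(addr string) error {
	_, port, err := net.SplitHostPort(addr)
	if err != nil {
		return err
	}
	n, err := strconv.Atoi(port)
	if err != nil || n < 0 || n > 65535 {
		return fmt.Errorf("invalid port %q in %q", port, addr)
	}
	return nil
}

// validateNameserver additionally requires the host part to be an IP literal.
func validateNameserver(addr string) error {
	if err := validateHostPort(addr); err != nil {
		return err
	}
	host, _, _ := net.SplitHostPort(addr) // cannot fail: already validated above
	if net.ParseIP(host) == nil {
		return fmt.Errorf("nameserver host %q is not an IP address", host)
	}
	return nil
}

func main() {
	for _, addr := range []string{"127.0.0.1:53", "[::1]:53", "dns.example.org:53", "8.8.8.8"} {
		fmt.Printf("%-20s -> %v\n", addr, validateNameserver(addr))
	}
}
```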

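The ServeDNS method in github.com/skynetservices/skydns/server/server.go makes the server a DNS request handler for miekg/dns (it satisfies the library's dns.Handler interface). Requests from DNS clients are therefore processed by this custom ServeDNS rather than by the library's default ServeMux.ServeDNS logic alone.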

github.com/skynetservices/skydns/server/server.go

// ServeDNS is the handler for DNS requests, responsible for parsing DNS request, possibly forwarding
// it to a real dns server and returning a response.
func (s *server) ServeDNS(w dns.ResponseWriter, req *dns.Msg) {
    ...
    // Check cache first.
    m1 := s.rcache.Hit(q, dnssec, tcp, m.Id)
    if m1 != nil {
        ...
        // Still round-robin even with hits from the cache.
        // Only shuffle A and AAAA records with each other.
        if q.Qtype == dns.TypeA || q.Qtype == dns.TypeAAAA {
            s.RoundRobin(m1.Answer)
        }

        ...
        return
    }

    for zone, ns := range *s.config.stub {
        if strings.HasSuffix(name, "." + zone) || name == zone {
            metrics.ReportRequestCount(req, metrics.Stub)

            resp := s.ServeDNSStubForward(w, req, ns)
            if resp != nil {
                s.rcache.InsertMessage(cache.Key(q, dnssec, tcp), resp)
            }

            metrics.ReportDuration(resp, start, metrics.Stub)
            metrics.ReportErrorCount(resp, metrics.Stub)
            return
        }
    }
    ...

    if name == s.config.Domain {
        if q.Qtype == dns.TypeSOA {
            m.Answer = []dns.RR{s.NewSOA()}
            return
        }
        if q.Qtype == dns.TypeDNSKEY {
            if s.config.PubKey != nil {
                m.Answer = []dns.RR{s.config.PubKey}
                return
            }
        }
    }
    if q.Qclass == dns.ClassCHAOS {
        if q.Qtype == dns.TypeTXT {
            switch name {
            case "authors.bind.":
                fallthrough
            case s.config.Domain:
                hdr := dns.RR_Header{Name: q.Name, Rrtype: dns.TypeTXT, Class: dns.ClassCHAOS, Ttl: 0}
                authors := []string{"Erik St. Martin", "Brian Ketelsen", "Miek Gieben", "Michael Crosby"}
                for _, a := range authors {
                    m.Answer = append(m.Answer, &dns.TXT{Hdr: hdr, Txt: []string{a}})
                }
                for j := 0; j < len(authors)*(int(dns.Id())%4+1); j++ {
                    q := int(dns.Id()) % len(authors)
                    p := int(dns.Id()) % len(authors)
                    if q == p {
                        p = (p + 1) % len(authors)
                    }
                    m.Answer[q], m.Answer[p] = m.Answer[p], m.Answer[q]
                }
                return
            case "version.bind.":
                fallthrough
            case "version.server.":
                hdr := dns.RR_Header{Name: q.Name, Rrtype: dns.TypeTXT, Class: dns.ClassCHAOS, Ttl: 0}
                m.Answer = []dns.RR{&dns.TXT{Hdr: hdr, Txt: []string{Version}}}
                return
            case "hostname.bind.":
                fallthrough
            case "id.server.":
                // TODO(miek): machine name to return
                hdr := dns.RR_Header{Name: q.Name, Rrtype: dns.TypeTXT, Class: dns.ClassCHAOS, Ttl: 0}
                m.Answer = []dns.RR{&dns.TXT{Hdr: hdr, Txt: []string{"localhost"}}}
                return
            }
        }
        // still here, fail
        m.SetReply(req)
        m.SetRcode(req, dns.RcodeServerFailure)
        return
    }

    switch q.Qtype {
    case dns.TypeNS:
        if name != s.config.Domain {
            break
        }
        // Lookup s.config.DnsDomain
        records, extra, err := s.NSRecords(q, s.config.dnsDomain)
        if isEtcdNameError(err, s) {
            m = s.NameError(req)
            return
        }
        m.Answer = append(m.Answer, records...)
        m.Extra = append(m.Extra, extra...)
    case dns.TypeA, dns.TypeAAAA:
        records, err := s.AddressRecords(q, name, nil, bufsize, dnssec, false)
        if isEtcdNameError(err, s) {
            m = s.NameError(req)
            return
        }
        m.Answer = append(m.Answer, records...)
    case dns.TypeTXT:
        records, err := s.TXTRecords(q, name)
        if isEtcdNameError(err, s) {
            m = s.NameError(req)
            return
        }
        m.Answer = append(m.Answer, records...)
    case dns.TypeCNAME:
        records, err := s.CNAMERecords(q, name)
        if isEtcdNameError(err, s) {
            m = s.NameError(req)
            return
        }
        m.Answer = append(m.Answer, records...)
    case dns.TypeMX:
        records, extra, err := s.MXRecords(q, name, bufsize, dnssec)
        if isEtcdNameError(err, s) {
            m = s.NameError(req)
            return
        }
        m.Answer = append(m.Answer, records...)
        m.Extra = append(m.Extra, extra...)
    default:
        fallthrough // also catch other types, so that they return NODATA
    case dns.TypeSRV:
        records, extra, err := s.SRVRecords(q, name, bufsize, dnssec)
        if err != nil {
            if isEtcdNameError(err, s) {
                m = s.NameError(req)
                return
            }
            logf("got error from backend: %s", err)
            if q.Qtype == dns.TypeSRV { // Otherwise NODATA
                m = s.ServerFailure(req)
                return
            }
        }
        // if we are here again, check the types, because an answer may only
        // be given for SRV. All other types should return NODATA, the
        // NXDOMAIN part is handled in the above code. TODO(miek): yes this
        // can be done in a more elegant manor.
        if q.Qtype == dns.TypeSRV {
            m.Answer = append(m.Answer, records...)
            m.Extra = append(m.Extra, extra...)
        }
    }

    if len(m.Answer) == 0 { // NODATA response
        m.Ns = []dns.RR{s.NewSOA()}
        m.Ns[0].Header().Ttl = s.config.MinTtl
    }
}

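The logic above is fairly complex and the details take some time to digest. In brief:

  • Path 1 in the architecture diagram: if the corresponding Msg is found in the cache maintained by SkyDNS, it is read from the cache and returned to the DNS client;
  • Path 2 in the architecture diagram: if there is no matching record in the cache and the request needs DNS forwarding (for example, the name matches a stub zone), the request is forwarded to the corresponding DNS servers for processing;
  • Path 3 in the architecture diagram: if there is no matching record in the cache and the question type is A/AAAA, SRV, and so on, the etcd client fetches the corresponding rule from the etcd cluster and a Msg is constructed and returned.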
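
The three paths can be condensed into a small lookup order: cache first, then stub zones, then the backend. The sketch below uses plain maps in place of the real cache, stub configuration, and etcd backend. Every type and method name in it is hypothetical, and storing backend answers in the cache is my assumption, since the excerpt elides that part.

```go
package main

import (
	"fmt"
	"strings"
)

// resolver is a simplified stand-in for the SkyDNS server state.
// All field and method names here are hypothetical.
type resolver struct {
	cache   map[string]string   // question name -> cached answer
	stubs   map[string][]string // stub zone -> nameservers
	backend map[string]string   // question name -> record held in the backend
}

// inZone reports whether name equals zone or is a subdomain of it,
// mirroring the suffix check in the ServeDNS excerpt.
func inZone(name, zone string) bool {
	return name == zone || strings.HasSuffix(name, "."+zone)
}

// resolve returns the answer together with the path that produced it.
func (r *resolver) resolve(name string) (answer, path string) {
	// Path 1: answer from the cache.
	if a, ok := r.cache[name]; ok {
		return a, "cache"
	}
	// Path 2: forward names that fall inside a stub zone.
	for zone, ns := range r.stubs {
		if inZone(name, zone) {
			return "forwarded to " + strings.Join(ns, ","), "stub"
		}
	}
	// Path 3: look the name up in the backend and remember the result.
	if a, ok := r.backend[name]; ok {
		r.cache[name] = a // assumption: backend answers are cached
		return a, "backend"
	}
	return "", "nxdomain"
}

func main() {
	r := &resolver{
		cache:   map[string]string{},
		stubs:   map[string][]string{"example.com.": {"172.16.0.1:53"}},
		backend: map[string]string{"rails.skydns.local.": "10.0.0.2"},
	}
	for _, name := range []string{"www.example.com.", "rails.skydns.local.", "rails.skydns.local."} {
		answer, path := r.resolve(name)
		fmt.Printf("%s -> %q via %s\n", name, answer, path)
	}
}
```

Checking "."+zone rather than the bare zone prevents a name such as notexample.com. from being treated as part of the example.com. stub zone.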

Summary

This article walked through the SkyDNS code to explain its internal architecture and how it works.

Mark

SkyDNS2 Changes since version 1:

  • Does away with Raft and uses etcd (which uses raft).
  • Makes it possible to query arbitrary domain names.
  • Is a thin layer above etcd, that translates etcd keys and values to the DNS.
  • Does DNSSEC with NSEC3 instead of NSEC.