[Troubleshooting notes] A Dubbo generic-invocation pitfall: ZooKeeper ephemeral nodes explode

Symptom

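    Anyone who develops with Dubbo will be familiar with its generic invocation. We use it in a scheduled-task management system to call Dubbo interfaces without depending on their API jars. On one occasion a colleague configured a task in the test environment to run every 10s, but the interface it pointed at had no provider. Before long there were more than ten thousand consumer nodes in ZooKeeper.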

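     The official Dubbo sample for generic invocation builds a ReferenceConfig, sets generic to true, calls get() to obtain a GenericService, and then calls $invoke on it.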

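When ReferenceConfig is asked for the "service reference", it returns the one that has already been instantiated; if there is none, it calls init to instantiate it. By default, instantiation checks whether a provider exists and throws an exception if not, so instantiation fails (by this point the consumer node has already been created in ZooKeeper). The next attempt to get the "service reference" fails in the same way and creates another consumer node: the node path carries a timestamp, so every attempt produces a new node. Repeated over and over, this creates ZooKeeper nodes without bound. Here is the ReferenceConfig source for getting the "service reference" (the author has added comments at the key points):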

public class ReferenceConfig<T> extends AbstractReferenceConfig {
    /**
     * The interface proxy reference
     */
    private transient volatile T ref;

    ...
    
    public synchronized T get() {
        checkAndUpdateSubConfigs();

        if (destroyed) {
            throw new IllegalStateException("The invoker of ReferenceConfig(" + url + ") has already destroyed!");
        }

        // Author's note: has the "service reference" been instantiated? If not, call init to instantiate it
        if (ref == null) {
            init();
        }
        return ref;
    }

    // Author's note: initializes the "service reference"
    private void init() {
        if (initialized) {
            return;
        }
        checkStubAndLocal(interfaceClass);
        checkMock(interfaceClass);
        Map<String, String> map = new HashMap<String, String>();

        map.put(SIDE_KEY, CONSUMER_SIDE);

        // Author's note: the "timestamp" parameter is appended here
        appendRuntimeParameters(map);
        if (!isGeneric()) {
            String revision = Version.getVersion(interfaceClass, version);
            if (revision != null && revision.length() > 0) {
                map.put(REVISION_KEY, revision);
            }

            String[] methods = Wrapper.getWrapper(interfaceClass).getMethodNames();
            if (methods.length == 0) {
                logger.warn("No method found in service interface " + interfaceClass.getName());
                map.put(METHODS_KEY, ANY_VALUE);
            } else {
                map.put(METHODS_KEY, StringUtils.join(new HashSet<String>(Arrays.asList(methods)), COMMA_SEPARATOR));
            }
        }
        map.put(INTERFACE_KEY, interfaceName);
        appendParameters(map, metrics);
        appendParameters(map, application);
        appendParameters(map, module);
        // remove 'default.' prefix for configs from ConsumerConfig
        // appendParameters(map, consumer, Constants.DEFAULT_KEY);
        appendParameters(map, consumer);
        appendParameters(map, this);
        Map<String, Object> attributes = null;
        if (CollectionUtils.isNotEmpty(methods)) {
            attributes = new HashMap<String, Object>();
            for (MethodConfig methodConfig : methods) {
                appendParameters(map, methodConfig, methodConfig.getName());
                String retryKey = methodConfig.getName() + ".retry";
                if (map.containsKey(retryKey)) {
                    String retryValue = map.remove(retryKey);
                    if ("false".equals(retryValue)) {
                        map.put(methodConfig.getName() + ".retries", "0");
                    }
                }
                attributes.put(methodConfig.getName(), convertMethodConfig2AyncInfo(methodConfig));
            }
        }

        String hostToRegistry = ConfigUtils.getSystemProperty(DUBBO_IP_TO_REGISTRY);
        if (StringUtils.isEmpty(hostToRegistry)) {
            hostToRegistry = NetUtils.getLocalHost();
        } else if (isInvalidLocalHost(hostToRegistry)) {
            throw new IllegalArgumentException("Specified invalid registry ip from property:" + DUBBO_IP_TO_REGISTRY + ", value:" + hostToRegistry);
        }
        map.put(REGISTER_IP_KEY, hostToRegistry);
        
        // Author's note: if createProxy throws, ref is never assigned and stays null
        ref = createProxy(map);

        String serviceKey = URL.buildKey(interfaceName, group, version);
        ApplicationModel.initConsumerModel(serviceKey, buildConsumerModel(serviceKey, attributes));
        initialized = true;
    }

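    // Author's note: creates the "service reference" instance:
    // (1) first creates a consumer node in ZooKeeper
    // (2) checks whether the service is available (if checking is configured) and throws if it is not
    // (3) creates a proxy object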
    private T createProxy(Map<String, String> map) {
        if (shouldJvmRefer(map)) {
            URL url = new URL(LOCAL_PROTOCOL, LOCALHOST_VALUE, 0, interfaceClass.getName()).addParameters(map);
            invoker = REF_PROTOCOL.refer(interfaceClass, url);
            if (logger.isInfoEnabled()) {
                logger.info("Using injvm service " + interfaceClass.getName());
            }
        } else {
            urls.clear(); // reference retry init will add url to urls, lead to OOM
            if (url != null && url.length() > 0) { // user specified URL, could be peer-to-peer address, or register center's address.
                String[] us = SEMICOLON_SPLIT_PATTERN.split(url);
                if (us != null && us.length > 0) {
                    for (String u : us) {
                        URL url = URL.valueOf(u);
                        if (StringUtils.isEmpty(url.getPath())) {
                            url = url.setPath(interfaceName);
                        }
                        if (REGISTRY_PROTOCOL.equals(url.getProtocol())) {
                            urls.add(url.addParameterAndEncoded(REFER_KEY, StringUtils.toQueryString(map)));
                        } else {
                            urls.add(ClusterUtils.mergeUrl(url, map));
                        }
                    }
                }
            } else { // assemble URL from register center's configuration
                // if protocols not injvm checkRegistry
                if (!LOCAL_PROTOCOL.equalsIgnoreCase(getProtocol())){
                    checkRegistry();
                    List<URL> us = loadRegistries(false);
                    if (CollectionUtils.isNotEmpty(us)) {
                        for (URL u : us) {
                            URL monitorUrl = loadMonitor(u);
                            if (monitorUrl != null) {
                                map.put(MONITOR_KEY, URL.encode(monitorUrl.toFullString()));
                            }
                            urls.add(u.addParameterAndEncoded(REFER_KEY, StringUtils.toQueryString(map)));
                        }
                    }
                    if (urls.isEmpty()) {
                        throw new IllegalStateException("No such any registry to reference " + interfaceName + " on the consumer " + NetUtils.getLocalHost() + " use dubbo version " + Version.getVersion() + ", please config <dubbo:registry address=\"...\" /> to your spring config.");
                    }
                }
            }

            // Author's note: REF_PROTOCOL is an SPI extension; with a ZooKeeper registry the call ends up in RegistryProtocol.refer.
            // RegistryProtocol creates the consumer node, and the node's path includes the current timestamp
            if (urls.size() == 1) {
                invoker = REF_PROTOCOL.refer(interfaceClass, urls.get(0));
            } else {
                List<Invoker<?>> invokers = new ArrayList<Invoker<?>>();
                URL registryURL = null;
                for (URL url : urls) {
                    invokers.add(REF_PROTOCOL.refer(interfaceClass, url));
                    if (REGISTRY_PROTOCOL.equals(url.getProtocol())) {
                        registryURL = url; // use last registry url
                    }
                }
                if (registryURL != null) { // registry url is available
                    // use RegistryAwareCluster only when register's CLUSTER is available
                    URL u = registryURL.addParameter(CLUSTER_KEY, RegistryAwareCluster.NAME);
                    // The invoker wrap relation would be: RegistryAwareClusterInvoker(StaticDirectory) -> FailoverClusterInvoker(RegistryDirectory, will execute route) -> Invoker
                    invoker = CLUSTER.join(new StaticDirectory(u, invokers));
                } else { // not a registry url, must be direct invoke.
                    invoker = CLUSTER.join(new StaticDirectory(invokers));
                }
            }
        }
        
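        // Author's note: if the provider check is enabled, verify that the service is available. If it is not, throw immediately;
        // ref is never assigned, but the consumer node has already been created in ZooKeeper above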
        if (shouldCheck() && !invoker.isAvailable()) {
            throw new IllegalStateException("Failed to check the status of the service " + interfaceName + ". No provider available for the service " + (group == null ? "" : group + "/") + interfaceName + (version == null ? "" : ":" + version) + " from the url " + invoker.getUrl() + " to the consumer " + NetUtils.getLocalHost() + " use dubbo version " + Version.getVersion());
        }
        if (logger.isInfoEnabled()) {
            logger.info("Refer dubbo service " + interfaceClass.getName() + " from url " + invoker.getUrl());
        }
        /**
         * @since 2.7.0
         * ServiceData Store
         */
        MetadataReportService metadataReportService = null;
        if ((metadataReportService = getMetadataReportService()) != null) {
            URL consumerURL = new URL(CONSUMER_PROTOCOL, map.remove(REGISTER_IP_KEY), 0, map.get(INTERFACE_KEY), map);
            metadataReportService.publishConsumer(consumerURL);
        }
        // create service proxy
        return (T) PROXY_FACTORY.getProxy(invoker);
    }
...
}
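
The control flow annotated above can be reproduced in a few lines without Dubbo or ZooKeeper. The following is only a toy model under stated assumptions: ToyReferenceConfig, consumerNodes and simulate are hypothetical names invented for this sketch, the "registry" is an in-memory set, and the timestamp is a fake counter. It keeps just the three steps from the author's notes: register the node, check for a provider, then cache the proxy.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the leak; none of these classes are Dubbo's real API. */
final class ConsumerNodeLeakDemo {

    /** Stand-in for the consumers directory in ZooKeeper: a set of node paths. */
    static final Set<String> consumerNodes = new TreeSet<>();

    /** Simplified reference holder that mirrors the get()/createProxy() control flow. */
    static final class ToyReferenceConfig {
        private final String interfaceName;
        private final boolean check;
        private Object ref;   // stays null whenever createProxy throws
        private long clock;   // fake, strictly increasing timestamp source

        ToyReferenceConfig(String interfaceName, boolean check, long startMillis) {
            this.interfaceName = interfaceName;
            this.check = check;
            this.clock = startMillis;
        }

        synchronized Object get() {
            if (ref == null) {
                ref = createProxy();
            }
            return ref;
        }

        private Object createProxy() {
            // (1) register the consumer node first; the timestamp makes every path unique
            consumerNodes.add("consumer://" + interfaceName + "?timestamp=" + (clock++));
            // (2) then check for a provider; this toy registry never has one
            if (check) {
                throw new IllegalStateException("No provider available for " + interfaceName);
            }
            // (3) only now is the proxy created, and the caller caches it in ref
            return new Object();
        }
    }

    /** Calls get() the given number of times, ignoring failures, and returns the node count. */
    static int simulate(boolean check, int attempts) {
        consumerNodes.clear();
        ToyReferenceConfig config =
                new ToyReferenceConfig("com.test.dubbogeneric.TestService", check, 1592036013311L);
        for (int i = 0; i < attempts; i++) {
            try {
                config.get();
            } catch (IllegalStateException expected) {
                // like the scheduled task: the failure is ignored and the call is retried later
            }
        }
        return consumerNodes.size();
    }

    public static void main(String[] args) {
        System.out.println("check=true : " + simulate(true, 6) + " nodes");
        System.out.println("check=false: " + simulate(false, 6) + " nodes");
    }
}
```

In this model, six failed attempts with the check enabled leave six nodes behind, while with the check disabled the first reference is cached and only one node exists.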

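Below are the ZooKeeper nodes after 6 generic invocations of the service "com.test.dubbogeneric.TestService" (which has no provider). Six consumer nodes were created, and their paths are identical except for the timestamp.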

URL-decoded, they are easier to read:

[
    consumer://172.28.253.130/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036013311,
    consumer://172.28.253.130/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036035633,
    consumer://172.28.253.130/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036036773,
    consumer://172.28.253.130/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036037948,
    consumer://172.28.253.130/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036039105,
    consumer://172.28.253.130/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036040226]
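
One way to confirm that such a listing is a single consumer registered many times is to drop the timestamp parameter and count what remains. This is a minimal sketch using plain string handling; ConsumerUrlDiff and its methods are hypothetical helpers written for this note, and the URLs in main are abbreviated versions of the ones listed above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

/** Hypothetical helper: groups consumer URLs that differ only in their timestamp. */
final class ConsumerUrlDiff {

    /** Returns the URL with its "timestamp" query parameter removed. */
    static String withoutTimestamp(String url) {
        int q = url.indexOf('?');
        if (q < 0) {
            return url; // no query string, nothing to strip
        }
        String query = Arrays.stream(url.substring(q + 1).split("&"))
                .filter(param -> !param.startsWith("timestamp="))
                .collect(Collectors.joining("&"));
        return url.substring(0, q) + "?" + query;
    }

    /** Number of distinct consumers once timestamps are ignored. */
    static int distinctIgnoringTimestamp(List<String> urls) {
        Set<String> distinct = new TreeSet<>();
        for (String url : urls) {
            distinct.add(withoutTimestamp(url));
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        // Abbreviated forms of the decoded node names shown above
        String prefix = "consumer://172.28.253.130/org.apache.dubbo.rpc.service.GenericService"
                + "?application=test&category=consumers&check=false&generic=true"
                + "&interface=com.test.dubbogeneric.TestService&pid=7905&side=consumer&timestamp=";
        List<String> urls = Arrays.asList(
                prefix + "1592036013311",
                prefix + "1592036035633",
                prefix + "1592036036773");
        System.out.println(urls.size() + " nodes, "
                + distinctIgnoringTimestamp(urls) + " distinct consumer");
    }
}
```

A count of 1 for many nodes with the same pid indicates one process re-registering itself rather than many real consumers.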

How to fix it

There are three possible fixes:

  • 1. [Recommended] Upgrade Dubbo to 2.7.7 or later. When 2.7.7 finds that the service is unavailable, it calls destroy, which removes the consumer node created earlier:
            if (shouldCheck() && !invoker.isAvailable()) {
                // Author's note: the destroy call added in 2.7.7
                invoker.destroy();
                throw new IllegalStateException("Failed to check the status of the service "
                        + interfaceName
                        + ". No provider available for the service "
                        + (group == null ? "" : group + "/")
                        + interfaceName +
                        (version == null ? "" : ":" + version)
                        + " from the url "
                        + invoker.getUrl()
                        + " to the consumer "
                        + NetUtils.getLocalHost() + " use dubbo version " +    Version.getVersion());
            }

     

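  • 2. [Recommended] Set check=false when creating the ReferenceConfig for the generic invocation, i.e.
    reference.setCheck(false);

    With check set to false, init no longer verifies that the service is available, so no exception is thrown and ref does not stay null. The 2nd, 3rd, ..., nth requests for the "service reference" all return the ref from the first call, so no further ZooKeeper nodes are created.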

  • 3. Catch the exception, obtain the invoker through Java reflection, and call the invoker's destroy to remove the ZooKeeper node.
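
The third option can be sketched as follows. This is an illustration under assumptions, not a tested patch against Dubbo: it assumes the failed invoker is kept in a private field named invoker on the reference object (as the 2.7.3 source quoted earlier suggests) and that it exposes a no-argument destroy() method. StubReferenceConfig, StubInvoker and registryNodes are hypothetical stand-ins so that the example runs without Dubbo on the classpath.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.HashSet;
import java.util.Set;

/** Sketch of option 3: destroy the invoker left behind by a failed get(). */
final class LeakedInvokerCleanup {

    /**
     * Reads the private field "invoker" (an assumed field name) from the given object and
     * calls its destroy() method. Returns true if an invoker was found and destroyed.
     */
    static boolean destroyLeakedInvoker(Object referenceConfig) {
        try {
            Field field = referenceConfig.getClass().getDeclaredField("invoker");
            field.setAccessible(true);
            Object invoker = field.get(referenceConfig);
            if (invoker == null) {
                return false;
            }
            Method destroy = invoker.getClass().getMethod("destroy");
            destroy.setAccessible(true);
            destroy.invoke(invoker);
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not destroy leaked invoker", e);
        }
    }

    // ---- Hypothetical stand-ins so the sketch runs without Dubbo ----

    /** In-memory substitute for the consumer nodes in ZooKeeper. */
    static final Set<String> registryNodes = new HashSet<>();

    /** Registers a node on creation and removes it on destroy(). */
    static final class StubInvoker {
        private final String node;
        StubInvoker(String node) {
            this.node = node;
            registryNodes.add(node);
        }
        public void destroy() {
            registryNodes.remove(node);
        }
    }

    /** Mimics a reference whose provider check always fails after the node is registered. */
    static final class StubReferenceConfig {
        private Object invoker; // assigned before the availability check, as in createProxy
        private Object ref;     // never assigned because the check always fails here
        private long clock = 1L;

        Object get() {
            if (ref == null) {
                invoker = new StubInvoker("consumer?timestamp=" + (clock++));
                throw new IllegalStateException("No provider available");
            }
            return ref;
        }
    }

    /** Calls get() repeatedly, cleaning up after each failure; returns the nodes left over. */
    static int callWithCleanup(int attempts) {
        registryNodes.clear();
        StubReferenceConfig config = new StubReferenceConfig();
        for (int i = 0; i < attempts; i++) {
            try {
                config.get();
            } catch (IllegalStateException e) {
                destroyLeakedInvoker(config);
            }
        }
        return registryNodes.size();
    }

    public static void main(String[] args) {
        System.out.println("nodes left after 6 failed calls: " + callWithCleanup(6));
    }
}
```

Because it depends on a private field name, this approach is fragile across Dubbo versions, which is one reason to prefer the first two options.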