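A walkthrough of troubleshooting a production memory problem

Tools used: MAT, IDEA

I. Discovering the problem

One microservice in production was at nearly 90% memory usage. The garbage collector could not reclaim enough, so GC ran frequently and CPU usage climbed from 20% to 40% along with it.

As a stopgap we gave the machine more memory. That eased things: memory usage fell back to 50% and CPU came down too, but memory kept growing slowly.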

II. Locating the problem

Step 1: suspect a memory leak. Over two consecutive days we took two heap dumps of the microservice.
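The post does not say how the dumps were taken. As one possible approach, and only as a sketch, a HotSpot JVM can write an .hprof file from inside the process through HotSpotDiagnosticMXBean. The class name HeapDumper, the method names and the file prefix are invented for illustration; the resulting file can be opened in MAT.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of the service in this post.
final class HeapDumper {

    /**
     * Writes a heap dump to target. The file must not exist yet and, on
     * recent JDKs, its name has to end in ".hprof".
     */
    static Path dump(Path target, boolean liveObjectsOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.toAbsolutePath().toString(), liveObjectsOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    /** Dumps live objects into a fresh temporary directory and returns the file. */
    static Path dumpToTempDir(String prefix) {
        try {
            Path dir = Files.createTempDirectory(prefix);
            return dump(dir.resolve(prefix + "-" + System.currentTimeMillis() + ".hprof"), true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = dumpToTempDir("service-heap");
        System.out.println("Heap dump written to " + file);
    }
}
```

Passing true for liveObjectsOnly limits the dump to reachable objects, which keeps the file smaller and matches what a leak hunt is usually interested in. Two dumps taken a day apart can then be compared in MAT.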

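One ConcurrentHashMap turned out to retain an unusually large amount of memory, about half of the whole heap. Inspecting the objects inside it showed that its keys were instances of the Statistics class under dubbo.monitor.

[Figure: heap dump]
Reading the dubbo source shows that dubbo.monitor collects statistics on every interface call, recording how many times each client called each method of each server.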

public class DubboMonitor implements Monitor {

    private static final Logger logger = LoggerFactory.getLogger(DubboMonitor.class);

    /**
     * The length of the array which is a container of the statistics
     */
    private static final int LENGTH = 10;

    /**
     * The timer for sending statistics
     */
    private final ScheduledExecutorService scheduledExecutorService = Executors.newScheduledThreadPool(3, new NamedThreadFactory("DubboMonitorSendTimer", true));

    /**
     * The future that can cancel the <b>scheduledExecutorService</b>
     */
    private final ScheduledFuture<?> sendFuture;

    private final Invoker<MonitorService> monitorInvoker;

    private final MonitorService monitorService;

    /**
     * The time interval for timer <b>scheduledExecutorService</b> to send data
     */
    private final long monitorInterval;

    private final ConcurrentMap<Statistics, AtomicReference<long[]>> statisticsMap = new ConcurrentHashMap<Statistics, AtomicReference<long[]>>();

    public DubboMonitor(Invoker<MonitorService> monitorInvoker, MonitorService monitorService) {
        this.monitorInvoker = monitorInvoker;
        this.monitorService = monitorService;
        this.monitorInterval = monitorInvoker.getUrl().getPositiveParameter("interval", 60000);
        // collect timer for collecting statistics data
        sendFuture = scheduledExecutorService.scheduleWithFixedDelay(() -> {
            try {
                // collect data
                send();
            } catch (Throwable t) {
                logger.error("Unexpected error occur at send statistic, cause: " + t.getMessage(), t);
            }
        }, monitorInterval, monitorInterval, TimeUnit.MILLISECONDS);
    }

    public void send() {
        logger.debug("Send statistics to monitor " + getUrl());
        String timestamp = String.valueOf(System.currentTimeMillis());
        for (Map.Entry<Statistics, AtomicReference<long[]>> entry : statisticsMap.entrySet()) {
            // get statistics data
            Statistics statistics = entry.getKey();
            AtomicReference<long[]> reference = entry.getValue();
            long[] numbers = reference.get();
            long success = numbers[0];
            long failure = numbers[1];
            long input = numbers[2];
            long output = numbers[3];
            long elapsed = numbers[4];
            long concurrent = numbers[5];
            long maxInput = numbers[6];
            long maxOutput = numbers[7];
            long maxElapsed = numbers[8];
            long maxConcurrent = numbers[9];
            String protocol = getUrl().getParameter(DEFAULT_PROTOCOL);

            // send statistics data
            URL url = statistics.getUrl()
                    .addParameters(MonitorService.TIMESTAMP, timestamp,
                            MonitorService.SUCCESS, String.valueOf(success),
                            MonitorService.FAILURE, String.valueOf(failure),
                            MonitorService.INPUT, String.valueOf(input),
                            MonitorService.OUTPUT, String.valueOf(output),
                            MonitorService.ELAPSED, String.valueOf(elapsed),
                            MonitorService.CONCURRENT, String.valueOf(concurrent),
                            MonitorService.MAX_INPUT, String.valueOf(maxInput),
                            MonitorService.MAX_OUTPUT, String.valueOf(maxOutput),
                            MonitorService.MAX_ELAPSED, String.valueOf(maxElapsed),
                            MonitorService.MAX_CONCURRENT, String.valueOf(maxConcurrent),
                            DEFAULT_PROTOCOL, protocol
                    );
            monitorService.collect(url);

            // reset
            long[] current;
            long[] update = new long[LENGTH];
            do {
                current = reference.get();
                if (current == null) {
                    update[0] = 0;
                    update[1] = 0;
                    update[2] = 0;
                    update[3] = 0;
                    update[4] = 0;
                    update[5] = 0;
                } else {
                    update[0] = current[0] - success;
                    update[1] = current[1] - failure;
                    update[2] = current[2] - input;
                    update[3] = current[3] - output;
                    update[4] = current[4] - elapsed;
                    update[5] = current[5] - concurrent;
                }
            } while (!reference.compareAndSet(current, update));
        }
    }

    @Override
    public void collect(URL url) {
        // data to collect from url
        int success = url.getParameter(MonitorService.SUCCESS, 0);
        int failure = url.getParameter(MonitorService.FAILURE, 0);
        int input = url.getParameter(MonitorService.INPUT, 0);
        int output = url.getParameter(MonitorService.OUTPUT, 0);
        int elapsed = url.getParameter(MonitorService.ELAPSED, 0);
        int concurrent = url.getParameter(MonitorService.CONCURRENT, 0);
        // init atomic reference
        Statistics statistics = new Statistics(url);
        AtomicReference<long[]> reference = statisticsMap.get(statistics);
        if (reference == null) {
            statisticsMap.putIfAbsent(statistics, new AtomicReference<long[]>());
            reference = statisticsMap.get(statistics);
        }
        // use CompareAndSet to sum
        long[] current;
        long[] update = new long[LENGTH];
        do {
            current = reference.get();
            if (current == null) {
                update[0] = success;
                update[1] = failure;
                update[2] = input;
                update[3] = output;
                update[4] = elapsed;
                update[5] = concurrent;
                update[6] = input;
                update[7] = output;
                update[8] = elapsed;
                update[9] = concurrent;
            } else {
                update[0] = current[0] + success;
                update[1] = current[1] + failure;
                update[2] = current[2] + input;
                update[3] = current[3] + output;
                update[4] = current[4] + elapsed;
                update[5] = (current[5] + concurrent) / 2;
                update[6] = current[6] > input ? current[6] : input;
                update[7] = current[7] > output ? current[7] : output;
                update[8] = current[8] > elapsed ? current[8] : elapsed;
                update[9] = current[9] > concurrent ? current[9] : concurrent;
            }
        } while (!reference.compareAndSet(current, update));
    }

    @Override
    public List<URL> lookup(URL query) {
        return monitorService.lookup(query);
    }

    @Override
    public URL getUrl() {
        return monitorInvoker.getUrl();
    }

    @Override
    public boolean isAvailable() {
        return monitorInvoker.isAvailable();
    }

    @Override
    public void destroy() {
        try {
            ExecutorUtil.cancelScheduledFuture(sendFuture);
        } catch (Throwable t) {
            logger.error("Unexpected error occur at cancel sender timer, cause: " + t.getMessage(), t);
        }
        monitorInvoker.destroy();
    }

}

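This looked fine: no memory leak, and presumably just normal dubbo behaviour. Still, spending this much heap on statistics seemed like questionable design on dubbo's part... Do the other microservices' heaps look like this too? And why is their memory not growing slowly?

Step 2: dump two other, healthy microservices for comparison.

The result was a shock: neither of the other two microservices had a ConcurrentHashMap retaining an unusually large amount of memory.

Sorted by retained memory, only one dubbo-related object, a DubboMonitor, used somewhat more than the rest, and even that was nowhere near the same order of magnitude as the ConcurrentHashMap above.

Opening the DubboMonitor showed that its contents resembled what the ConcurrentHashMap stored: dubbo monitoring records.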


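That raised two questions:

  1. Why is the structure of the dubbo monitoring data different between the abnormal and the healthy microservices?
  2. Both record dubbo monitoring data, all of these microservices are consumer-facing (2C), and the stored structures are similar, so why is the memory footprint so different?

On question 1, comparing the code showed that the abnormal microservice uses Apache dubbo 2.7.3 (the Alibaba/Apache line), while the healthy microservices use Alibaba dubbo 2.8.4 (the Dangdang fork).

dubbo's versioning is rather confusing, so here is a brief overview.
dubbo was originally built by Alibaba and later donated to Apache. The latest version at the time of writing is 2.7.6.
There is also a fork maintained by Dangdang, named dubbox. Because it was developed on top of Alibaba dubbo, its packages are still named alibaba.dubbo. Its latest version at the time of writing is 2.8.4.

On question 2, a closer look at the stored data revealed the cause: in the abnormal microservice, the client IPs in the dubbo monitoring data are all public IPs (presumably end-user IPs), whereas in the healthy microservices they are all private IPs (confirmed with ops to be gateway IPs). The number of distinct user IPs is orders of magnitude larger than the number of gateway IPs, so the abnormal microservice's dubbo monitor used a large amount of memory.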
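To make the size difference concrete, here is a small simulation. It is not dubbo code: the class MonitorCardinalityDemo, the service name and the IP ranges are invented. It mimics a map keyed by (client, service, method) whose entries are updated but never removed, which is how statisticsMap behaves in the DubboMonitor source above (send() resets the counters and keeps the keys).

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative stand-in for a per-(client, service, method) statistics map.
final class MonitorCardinalityDemo {

    private final ConcurrentMap<List<String>, AtomicLong> stats = new ConcurrentHashMap<>();

    /** Counts one call; a new key is created the first time a combination is seen. */
    void record(String clientIp, String service, String method) {
        List<String> key = Arrays.asList(clientIp, service, method);
        stats.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    int distinctKeys() {
        return stats.size();
    }

    /** Replays the given number of calls spread over distinctClients made-up IPs. */
    static int simulate(int calls, int distinctClients, String ipPrefix) {
        MonitorCardinalityDemo monitor = new MonitorCardinalityDemo();
        for (int i = 0; i < calls; i++) {
            int client = i % distinctClients;
            String ip = ipPrefix + (client / 256) + "." + (client % 256);
            monitor.record(ip, "com.example.OrderService", "query");
        }
        return monitor.distinctKeys();
    }

    public static void main(String[] args) {
        // Same traffic volume, very different key counts.
        System.out.println("gateway IPs as client: " + simulate(100_000, 3, "10.0."));
        System.out.println("user IPs as client:    " + simulate(100_000, 50_000, "198.18."));
    }
}
```

With identical call volume, the first run ends with 3 keys and the second with 50000. The map size follows the number of distinct clients, not the number of calls, which is why swapping gateway IPs for user IPs changes the footprint so much.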

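Step 3: read the source. Why does one record user IPs and the other gateway IPs?

Where does the client value come from? Presumably either remote_addr or header.x-forwarded-for.

If it comes from remote_addr it should be the gateway's IP; if it comes from header.x-forwarded-for it should be the user's real IP.

The dubbo source shows that MonitorFilter does work differently in the two versions:

MonitorFilter in the Apache version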

public class MonitorFilter extends ListenableFilter {

...

	@Override
    public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
        if (invoker.getUrl().hasParameter(MONITOR_KEY)) {
            invocation.setAttachment(MONITOR_FILTER_START_TIME, String.valueOf(System.currentTimeMillis()));
            getConcurrent(invoker, invocation).incrementAndGet(); // count up
        }
        return invoker.invoke(invocation); // proceed invocation chain
    }
	
...

	class MonitorListener implements Listener {

        @Override
        public void onResponse(Result result, Invoker<?> invoker, Invocation invocation) {
            if (invoker.getUrl().hasParameter(MONITOR_KEY)) {
                collect(invoker, invocation, result, RpcContext.getContext().getRemoteHost(), Long.valueOf(invocation.getAttachment(MONITOR_FILTER_START_TIME)), false);
                getConcurrent(invoker, invocation).decrementAndGet(); // count down
            }
        }

        @Override
        public void onError(Throwable t, Invoker<?> invoker, Invocation invocation) {
            if (invoker.getUrl().hasParameter(MONITOR_KEY)) {
                collect(invoker, invocation, null, RpcContext.getContext().getRemoteHost(), Long.valueOf(invocation.getAttachment(MONITOR_FILTER_START_TIME)), true);
                getConcurrent(invoker, invocation).decrementAndGet(); // count down
            }
        }

...
	
}

MonitorFilter in the Dangdang version

public class MonitorFilter implements Filter {

...

	public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
        if (invoker.getUrl().hasParameter(Constants.MONITOR_KEY)) {
            RpcContext context = RpcContext.getContext(); // the provider must obtain the context before invoke()
            long start = System.currentTimeMillis(); // record the start timestamp
            getConcurrent(invoker, invocation).incrementAndGet(); // count up concurrent calls
            try {
                Result result = invoker.invoke(invocation); // proceed down the invocation chain
                collect(invoker, invocation, result, context, start, false);
                return result;
            } catch (RpcException e) {
                collect(invoker, invocation, null, context, start, true);
                throw e;
            } finally {
                getConcurrent(invoker, invocation).decrementAndGet(); // count down concurrent calls
            }
        } else {
            return invoker.invoke(invocation);
        }
    }

...

}

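The main difference is when the context is obtained. The Apache version reads the context only when collect runs, after the whole invoke chain has finished; the Dangdang version keeps a reference to the context before invoke runs. Why keep it ahead of time? Could the context change while invoke is executing? The two versions also differ in what collect does internally and in how it handles the context, but that turned out not to be the cause here, so it is not covered further.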

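Step 4: reading code got nowhere, so on to the debugger (the truth is close)

Debug locally and trace where the IP comes from.

The thread stacks differ: the abnormal microservice has one extra frame, RemoteIpValve.
[Figure: stack comparison]

The source shows it is a Tomcat class that obtains the user's IP from header.x-forwarded-for.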


public class RemoteIpValve extends ValveBase {

...

    public void invoke(Request request, Response response) throws IOException, ServletException {
        final String originalRemoteAddr = request.getRemoteAddr();
        final String originalRemoteHost = request.getRemoteHost();
        final String originalScheme = request.getScheme();
        final boolean originalSecure = request.isSecure();
        final int originalServerPort = request.getServerPort();
        final String originalProxiesHeader = request.getHeader(proxiesHeader);
        final String originalRemoteIpHeader = request.getHeader(remoteIpHeader);
        boolean isInternal = internalProxies != null &&
                internalProxies.matcher(originalRemoteAddr).matches();

        if (isInternal || (trustedProxies != null &&
                trustedProxies.matcher(originalRemoteAddr).matches())) {
            String remoteIp = null;
            // In java 6, proxiesHeaderValue should be declared as a java.util.Deque
            LinkedList<String> proxiesHeaderValue = new LinkedList<>();
            StringBuilder concatRemoteIpHeaderValue = new StringBuilder();

            for (Enumeration<String> e = request.getHeaders(remoteIpHeader); e.hasMoreElements();) {
                if (concatRemoteIpHeaderValue.length() > 0) {
                    concatRemoteIpHeaderValue.append(", ");
                }

                concatRemoteIpHeaderValue.append(e.nextElement());
            }

            String[] remoteIpHeaderValue = commaDelimitedListToStringArray(concatRemoteIpHeaderValue.toString());
            int idx;
            if (!isInternal) {
                proxiesHeaderValue.addFirst(originalRemoteAddr);
            }
            // loop on remoteIpHeaderValue to find the first trusted remote ip and to build the proxies chain
            for (idx = remoteIpHeaderValue.length - 1; idx >= 0; idx--) {
                String currentRemoteIp = remoteIpHeaderValue[idx];
                remoteIp = currentRemoteIp;
                if (internalProxies !=null && internalProxies.matcher(currentRemoteIp).matches()) {
                    // do nothing, internalProxies IPs are not appended to the
                } else if (trustedProxies != null &&
                        trustedProxies.matcher(currentRemoteIp).matches()) {
                    proxiesHeaderValue.addFirst(currentRemoteIp);
                } else {
                    idx--; // decrement idx because break statement doesn't do it
                    break;
                }
            }
            // continue to loop on remoteIpHeaderValue to build the new value of the remoteIpHeader
            LinkedList<String> newRemoteIpHeaderValue = new LinkedList<>();
            for (; idx >= 0; idx--) {
                String currentRemoteIp = remoteIpHeaderValue[idx];
                newRemoteIpHeaderValue.addFirst(currentRemoteIp);
            }
            if (remoteIp != null) {

                request.setRemoteAddr(remoteIp);
                request.setRemoteHost(remoteIp);

                if (proxiesHeaderValue.size() == 0) {
                    request.getCoyoteRequest().getMimeHeaders().removeHeader(proxiesHeader);
                } else {
                    String commaDelimitedListOfProxies = listToCommaDelimitedString(proxiesHeaderValue);
                    request.getCoyoteRequest().getMimeHeaders().setValue(proxiesHeader).setString(commaDelimitedListOfProxies);
                }
                if (newRemoteIpHeaderValue.size() == 0) {
                    request.getCoyoteRequest().getMimeHeaders().removeHeader(remoteIpHeader);
                } else {
                    String commaDelimitedRemoteIpHeaderValue = listToCommaDelimitedString(newRemoteIpHeaderValue);
                    request.getCoyoteRequest().getMimeHeaders().setValue(remoteIpHeader).setString(commaDelimitedRemoteIpHeaderValue);
                }
            }

            if (protocolHeader != null) {
                String protocolHeaderValue = request.getHeader(protocolHeader);
                if (protocolHeaderValue == null) {
                    // Don't modify the secure, scheme and serverPort attributes
                    // of the request
                } else if (isForwardedProtoHeaderValueSecure(protocolHeaderValue)) {
                    request.setSecure(true);
                    request.getCoyoteRequest().scheme().setString("https");
                    setPorts(request, httpsServerPort);
                } else {
                    request.setSecure(false);
                    request.getCoyoteRequest().scheme().setString("http");
                    setPorts(request, httpServerPort);
                }
            }

            if (log.isDebugEnabled()) {
                log.debug("Incoming request " + request.getRequestURI() + " with originalRemoteAddr '" + originalRemoteAddr
                          + "', originalRemoteHost='" + originalRemoteHost + "', originalSecure='" + originalSecure + "', originalScheme='"
                          + originalScheme + "' will be seen as newRemoteAddr='" + request.getRemoteAddr() + "', newRemoteHost='"
                          + request.getRemoteHost() + "', newScheme='" + request.getScheme() + "', newSecure='" + request.isSecure() + "'");
            }
        } else {
            if (log.isDebugEnabled()) {
                log.debug("Skip RemoteIpValve for request " + request.getRequestURI() + " with originalRemoteAddr '"
                        + request.getRemoteAddr() + "'");
            }
        }
        if (requestAttributesEnabled) {
            request.setAttribute(AccessLog.REMOTE_ADDR_ATTRIBUTE,
                    request.getRemoteAddr());
            request.setAttribute(Globals.REMOTE_ADDR_ATTRIBUTE,
                    request.getRemoteAddr());
            request.setAttribute(AccessLog.REMOTE_HOST_ATTRIBUTE,
                    request.getRemoteHost());
            request.setAttribute(AccessLog.PROTOCOL_ATTRIBUTE,
                    request.getProtocol());
            request.setAttribute(AccessLog.SERVER_PORT_ATTRIBUTE,
                    Integer.valueOf(request.getServerPort()));
        }
        try {
            getNext().invoke(request, response);
        } finally {
            request.setRemoteAddr(originalRemoteAddr);
            request.setRemoteHost(originalRemoteHost);
            request.setSecure(originalSecure);
            request.getCoyoteRequest().scheme().setString(originalScheme);
            request.setServerPort(originalServerPort);

            MimeHeaders headers = request.getCoyoteRequest().getMimeHeaders();
            if (originalProxiesHeader == null || originalProxiesHeader.length() == 0) {
                headers.removeHeader(proxiesHeader);
            } else {
                headers.setValue(proxiesHeader).setString(originalProxiesHeader);
            }

            if (originalRemoteIpHeader == null || originalRemoteIpHeader.length() == 0) {
                headers.removeHeader(remoteIpHeader);
            } else {
                headers.setValue(remoteIpHeader).setString(originalRemoteIpHeader);
            }
        }
    }

...

}
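
Stripped of header rewriting and protocol handling, the address selection in invoke() above comes down to walking the X-Forwarded-For list from right to left. The sketch below is a simplification, not Tomcat code: ForwardedForSketch, resolveClientIp and the shortened INTERNAL pattern are assumptions, and the trustedProxies branch of the real valve is left out.

```java
import java.util.regex.Pattern;

// Simplified illustration of how a forwarded client address is picked.
final class ForwardedForSketch {

    // Assumed, shortened stand-in for the valve's internalProxies pattern.
    static final Pattern INTERNAL = Pattern.compile(
            "10\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}"
            + "|192\\.168\\.\\d{1,3}\\.\\d{1,3}"
            + "|127\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}");

    /**
     * Returns the address the application would see. The header is only
     * honoured when the direct peer (remoteAddr) is an internal proxy.
     */
    static String resolveClientIp(String remoteAddr, String xForwardedFor) {
        boolean headerMissing = xForwardedFor == null || xForwardedFor.trim().isEmpty();
        if (headerMissing || !INTERNAL.matcher(remoteAddr).matches()) {
            return remoteAddr;
        }
        String[] hops = xForwardedFor.split(",");
        String candidate = remoteAddr;
        // Walk right to left; the first hop that is not an internal proxy wins.
        for (int i = hops.length - 1; i >= 0; i--) {
            candidate = hops[i].trim();
            if (!INTERNAL.matcher(candidate).matches()) {
                break;
            }
        }
        return candidate;
    }

    public static void main(String[] args) {
        // Gateway 10.0.0.5 forwards a request that originated at 203.0.113.7.
        System.out.println(resolveClientIp("10.0.0.5", "203.0.113.7, 10.0.0.9"));
        // No header: the gateway address is all the application can see.
        System.out.println(resolveClientIp("10.0.0.5", null));
    }
}
```

The first call prints 203.0.113.7 and the second prints 10.0.0.5. Once the valve is in the chain, every downstream consumer of the remote address, including the monitoring filter, sees the user's address instead of the gateway's.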

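Why does the abnormal microservice go through RemoteIpValve while the healthy ones do not?

It turned out that the abnormal and healthy microservices run different springboot versions, and those versions handle the user IP differently (╯‵□′)╯︵┻━┻

The abnormal microservice uses springboot 2.2.2, while the healthy ones use springboot 1.5.22.

Does springboot 2.2.2 do anything special? Yes, see below:

The ServerProperties class does set the default so that header.x-forwarded-for is not used: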

public class ServerProperties {

...

	/**
	 * Strategy for handling X-Forwarded-* headers.
	 */
	private ForwardHeadersStrategy forwardHeadersStrategy = ForwardHeadersStrategy.NONE;
	
...

}

However, TomcatWebServerFactoryCustomizer contains a check for whether the application is running on a cloud platform, and if so header.x-forwarded-for is used:

public class TomcatWebServerFactoryCustomizer implements WebServerFactoryCustomizer<ConfigurableTomcatWebServerFactory>, Ordered {

...

	private boolean getOrDeduceUseForwardHeaders() {
		if (this.serverProperties.getForwardHeadersStrategy().equals(ServerProperties.ForwardHeadersStrategy.NONE)) {
			CloudPlatform platform = CloudPlatform.getActive(this.environment);
			return platform != null && platform.isUsingForwardHeaders();
		}
		return this.serverProperties.getForwardHeadersStrategy().equals(ServerProperties.ForwardHeadersStrategy.NATIVE);
	}

	private void customizeRemoteIpValve(ConfigurableTomcatWebServerFactory factory) {
		Tomcat tomcatProperties = this.serverProperties.getTomcat();
		String protocolHeader = tomcatProperties.getProtocolHeader();
		String remoteIpHeader = tomcatProperties.getRemoteIpHeader();
		// For back compatibility the valve is also enabled if protocol-header is set
		if (StringUtils.hasText(protocolHeader) || StringUtils.hasText(remoteIpHeader)
				|| getOrDeduceUseForwardHeaders()) {
			RemoteIpValve valve = new RemoteIpValve();
			valve.setProtocolHeader(StringUtils.hasLength(protocolHeader) ? protocolHeader : "X-Forwarded-Proto");
			if (StringUtils.hasLength(remoteIpHeader)) {
				valve.setRemoteIpHeader(remoteIpHeader);
			}
			// The internal proxies default to a white list of "safe" internal IP
			// addresses
			valve.setInternalProxies(tomcatProperties.getInternalProxies());
			valve.setHostHeader(tomcatProperties.getHostHeader());
			valve.setPortHeader(tomcatProperties.getPortHeader());
			valve.setProtocolHeaderHttpsValue(tomcatProperties.getProtocolHeaderHttpsValue());
			// ... so it's safe to add this valve by default.
			factory.addEngineValves(valve);
		}
	}
...

}
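
The condition spread over the two methods above can be condensed into one boolean function. This is a restatement of the 2.2.2 logic shown here and not springboot code; ForwardHeaderDecision and installsRemoteIpValve are invented names, and onCloudPlatform stands for "CloudPlatform.getActive(...) returned a platform".

```java
// Condensed restatement of the decision for illustration only.
final class ForwardHeaderDecision {

    enum Strategy { NATIVE, FRAMEWORK, NONE }

    /** Returns true when the RemoteIpValve would be added to Tomcat. */
    static boolean installsRemoteIpValve(Strategy strategy, boolean onCloudPlatform,
                                         String protocolHeader, String remoteIpHeader) {
        // Explicitly configured headers always enable the valve.
        if (hasText(protocolHeader) || hasText(remoteIpHeader)) {
            return true;
        }
        // NONE is the default and falls back to cloud-platform detection.
        if (strategy == Strategy.NONE) {
            return onCloudPlatform;
        }
        return strategy == Strategy.NATIVE;
    }

    private static boolean hasText(String value) {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args) {
        for (Strategy strategy : Strategy.values()) {
            System.out.println(strategy
                    + " off-cloud=" + installsRemoteIpValve(strategy, false, null, null)
                    + " on-cloud=" + installsRemoteIpValve(strategy, true, null, null));
        }
    }
}
```

The only combination that changes with the deployment environment is the default one: NONE yields false off-cloud and true on a detected cloud platform, while FRAMEWORK stays false and NATIVE stays true in both cases.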

springboot enumerates several well-known cloud platforms:


/**
 * Simple detection for well known cloud platforms. For more advanced cloud provider
 * integration consider the Spring Cloud project.
 *
 * @author Phillip Webb
 * @since 1.3.0
 * @see "https://cloud.spring.io"
 */
public enum CloudPlatform {

	/**
	 * Cloud Foundry platform.
	 */
	CLOUD_FOUNDRY {

		@Override
		public boolean isActive(Environment environment) {
			return environment.containsProperty("VCAP_APPLICATION") || environment.containsProperty("VCAP_SERVICES");
		}

	},

	/**
	 * Heroku platform.
	 */
	HEROKU {

		@Override
		public boolean isActive(Environment environment) {
			return environment.containsProperty("DYNO");
		}

	},

	/**
	 * SAP Cloud platform.
	 */
	SAP {

		@Override
		public boolean isActive(Environment environment) {
			return environment.containsProperty("HC_LANDSCAPE");
		}

	},

	/**
	 * Kubernetes platform.
	 */
	KUBERNETES {

		private static final String SERVICE_HOST_SUFFIX = "_SERVICE_HOST";

		private static final String SERVICE_PORT_SUFFIX = "_SERVICE_PORT";

		@Override
		public boolean isActive(Environment environment) {
			if (environment instanceof ConfigurableEnvironment) {
				return isActive((ConfigurableEnvironment) environment);
			}
			return false;
		}

		private boolean isActive(ConfigurableEnvironment environment) {
			PropertySource<?> environmentPropertySource = environment.getPropertySources()
					.get(StandardEnvironment.SYSTEM_ENVIRONMENT_PROPERTY_SOURCE_NAME);
			if (environmentPropertySource instanceof EnumerablePropertySource) {
				return isActive((EnumerablePropertySource<?>) environmentPropertySource);
			}
			return false;
		}

		private boolean isActive(EnumerablePropertySource<?> environmentPropertySource) {
			for (String propertyName : environmentPropertySource.getPropertyNames()) {
				if (propertyName.endsWith(SERVICE_HOST_SUFFIX)) {
					String serviceName = propertyName.substring(0,
							propertyName.length() - SERVICE_HOST_SUFFIX.length());
					if (environmentPropertySource.getProperty(serviceName + SERVICE_PORT_SUFFIX) != null) {
						return true;
					}
				}
			}
			return false;
		}

	};

	/**
	 * Determines if the platform is active (i.e. the application is running in it).
	 * @param environment the environment
	 * @return if the platform is active.
	 */
	public abstract boolean isActive(Environment environment);

	/**
	 * Returns if the platform is behind a load balancer and uses
	 * {@literal X-Forwarded-For} headers.
	 * @return if {@literal X-Forwarded-For} headers are used
	 */
	public boolean isUsingForwardHeaders() {
		return true;
	}

	/**
	 * Returns the active {@link CloudPlatform} or {@code null} if one cannot be deduced.
	 * @param environment the environment
	 * @return the {@link CloudPlatform} or {@code null}
	 */
	public static CloudPlatform getActive(Environment environment) {
		if (environment != null) {
			for (CloudPlatform cloudPlatform : values()) {
				if (cloudPlatform.isActive(environment)) {
					return cloudPlatform;
				}
			}
		}
		return null;
	}

}

If the system environment contains a pair of variables with the same prefix, one ending in _SERVICE_HOST and one ending in _SERVICE_PORT, the platform is taken to be KUBERNETES. isUsingForwardHeaders returns true, so every platform in CloudPlatform uses header.x-forwarded-for. The reasoning is that springboot2 assumes a cloud platform sits behind a load balancer, so X-Forwarded-For headers are used (see the Javadoc on isUsingForwardHeaders).

We happen to run on KUBERNETES, and our system environment happens to contain variables ending in both _SERVICE_HOST and _SERVICE_PORT:
DUBBO_MONITOR_27_SERVICE_PRODUCTION_SERVICE_HOST
DUBBO_MONITOR_27_SERVICE_PRODUCTION_SERVICE_PORT
DUBBO_MONITOR_SERVICE_PRODUCTION_MICRO_SERVICE_SERVICE_HOST
DUBBO_MONITOR_SERVICE_PRODUCTION_MICRO_SERVICE_SERVICE_PORT
What a coincidence!
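
The same check can be tried outside springboot. The sketch below reproduces the suffix matching from the KUBERNETES branch above over a plain map; KubernetesEnvCheck and looksLikeKubernetes are invented names, and the address and port values in the example are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Standalone restatement of the *_SERVICE_HOST / *_SERVICE_PORT pairing check.
final class KubernetesEnvCheck {

    private static final String HOST_SUFFIX = "_SERVICE_HOST";
    private static final String PORT_SUFFIX = "_SERVICE_PORT";

    /** True when some X_SERVICE_HOST variable has a matching X_SERVICE_PORT. */
    static boolean looksLikeKubernetes(Map<String, String> env) {
        for (String name : env.keySet()) {
            if (name.endsWith(HOST_SUFFIX)) {
                String prefix = name.substring(0, name.length() - HOST_SUFFIX.length());
                if (env.get(prefix + PORT_SUFFIX) != null) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("DUBBO_MONITOR_27_SERVICE_PRODUCTION_SERVICE_HOST", "10.96.0.12"); // made-up value
        env.put("DUBBO_MONITOR_27_SERVICE_PRODUCTION_SERVICE_PORT", "20880");      // made-up value
        System.out.println("sample environment: " + looksLikeKubernetes(env));
        System.out.println("this process:       " + looksLikeKubernetes(System.getenv()));
    }
}
```

The sample environment is detected as Kubernetes. Running the same method against System.getenv() inside a pod is a quick way to see whether a given deployment would trip the detection.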


The springboot 1.5.22 code differs slightly from 2.2.2, but the logic is similar. The real difference is the set of platforms that CloudPlatform covers:

public enum CloudPlatform {

	/**
	 * Cloud Foundry platform.
	 */
	CLOUD_FOUNDRY {

		@Override
		public boolean isActive(Environment environment) {
			return environment.containsProperty("VCAP_APPLICATION") || environment.containsProperty("VCAP_SERVICES");
		}

	},

	/**
	 * Heroku platform.
	 */
	HEROKU {

		@Override
		public boolean isActive(Environment environment) {
			return environment.containsProperty("DYNO");
		}

	};

	/**
	 * Determines if the platform is active (i.e. the application is running in it).
	 * @param environment the environment
	 * @return if the platform is active.
	 */
	public abstract boolean isActive(Environment environment);

	/**
	 * Returns if the platform is behind a load balancer and uses
	 * {@literal X-Forwarded-For} headers.
	 * @return if {@literal X-Forwarded-For} headers are used
	 */
	public boolean isUsingForwardHeaders() {
		return true;
	}

	/**
	 * Returns the active {@link CloudPlatform} or {@code null} if one cannot be deduced.
	 * @param environment the environment
	 * @return the {@link CloudPlatform} or {@code null}
	 */
	public static CloudPlatform getActive(Environment environment) {
		if (environment != null) {
			for (CloudPlatform cloudPlatform : values()) {
				if (cloudPlatform.isActive(environment)) {
					return cloudPlatform;
				}
			}
		}
		return null;
	}

}

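III. The problem, found

1. How springboot is configured to obtain the user IP

  • remote_addr is used by default

  • When running on a cloud platform, header.x-forwarded-for is used by default. To turn this off, use the setting below (in the 2.2.2 code shown above, setting NONE does not help, because NONE is exactly the value that triggers the cloud-platform check):

    server.forward-headers-strategy=FRAMEWORK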
    
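    // Possible values of server.forward-headers-strategy
    public enum ForwardHeadersStrategy {
    	NATIVE,  // use the web server's native support (Tomcat's RemoteIpValve), so header.x-forwarded-for is used
    	FRAMEWORK, // leave forwarded headers to Spring's own handling; RemoteIpValve is not installed, so remote_addr was kept here
    	NONE // default (remote_addr in a normal deployment, header.x-forwarded-for on a cloud platform)
    }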
    
  • The header from which the user IP is read can be set through configuration:

    server.use-forward-headers=true
    server.tomcat.protocol-header=X-Forwarded-Proto  # this line is optional
    server.tomcat.remote-ip-header=x-forwarded-for
    

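2. The hidden change that came with the springboot upgrade

Cloud platforms recognised by springboot 2.2.2: CLOUD_FOUNDRY, HEROKU, SAP, KUBERNETES
Cloud platforms recognised by springboot 1.5.22: CLOUD_FOUNDRY, HEROKU

IV. Conclusion

The abnormal microservice uses springboot 2.2.2 and runs on a cloud platform that springboot recognises, so by default it took the user's real IP from header.x-forwarded-for. As a result, dubbo monitoring recorded real user IPs as the client instead of a handful of gateway IPs, and the statistics it kept for them consumed a large amount of memory.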
