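Tech Sharing: Log Trace Tracking

1. Background

In day-to-day operations it is hard to pick out all the log lines that belong to a single API call from the large, interleaved log output of a service.

To troubleshoot faster, we want to look up every log line on one request's path across multiple systems and applications by a shared traceId, and locate the problem from those logs. The mechanism should not intrude on business code, and it should stay easy to search for one request's logs even under high request rates.

This session shows how to implement log trace tracking with MDC. In microservice environments we often use SkyWalking, Spring Cloud Sleuth, or similar tools to trace the whole request path, but they carry a high operational cost and a complex architecture. Here we use MDC together with ordinary logging to build a lightweight, request-level tracking feature that you can use as a reference.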

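Result in practice

Once the traceId is in place, we can also return it to the user with each response, starting from the very first request. When a user reports a problem, the traceId they give us is enough to trace every related request along the whole path, which closes the information loop.

[Figure: request result]

[Figure: LOGBOOK result]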

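2. Key ideas

2.1 MDC

Tracing works at the level of a single request: every call to the same endpoint should get its own traceId. A request is normally handled by one thread from start to finish, so ThreadLocal is the natural place to keep the id. log4j already offers this capability in the form of MDC, so we use MDC directly.

About MDC

MDC (Mapped Diagnostic Context) is a map that holds contextual data for the running thread. When logging with log4j, each thread has its own MDC, which is global within that thread: any code running on the thread can read the values stored in it.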

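API overview (signatures as in log4j's MDC; the code in this article uses org.slf4j.MDC, whose values are Strings):

• clear() => removes all entries from the current thread's MDC

• get(String key) => returns the value stored under key in the current thread's MDC

• getContext() => returns the current thread's MDC map

• put(String key, Object o) => stores a key/value pair in the current thread's MDC

• remove(String key) => removes the entry stored under key from the current thread's MDC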
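To make the per-thread behaviour of these calls concrete, here is a minimal sketch. `MiniMdc` is a hypothetical stand-in written only for illustration: it is not the log4j or SLF4J class, and it assumes nothing more than a ThreadLocal holding a map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for MDC: a per-thread key/value map backed by ThreadLocal. */
final class MiniMdc {
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    private MiniMdc() {}

    static void put(String key, String value) { CONTEXT.get().put(key, value); }

    static String get(String key) { return CONTEXT.get().get(key); }

    static void remove(String key) { CONTEXT.get().remove(key); }

    static void clear() { CONTEXT.remove(); }

    /** Reads the key from a freshly started thread, which has its own (empty) map. */
    static String readFromOtherThread(String key) {
        String[] seen = new String[1];
        Thread reader = new Thread(() -> seen[0] = get(key));
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

A value stored with `MiniMdc.put("traceId", ...)` is visible to `MiniMdc.get` on the same thread, while `readFromOtherThread` returns null for the same key. That isolation is what makes the map safe for per-request data, and it is also why section 4.4 needs an explicit hand-over to worker threads.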

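3. Goals

1. We need an id that is unique across all services, the traceId. How do we guarantee uniqueness?

2. How is the traceId passed within a service?

3. How is the traceId passed between services?

4. How is the traceId passed across threads?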

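4. Implementation

4.1 A traceId that is unique across all services

A plain UUID is enough. More elaborate options include Redis-based ids or the snowflake algorithm. This article takes the simplest route and generates the traceId from a UUID.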
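As a small illustration of the UUID approach, the sketch below generates an id and reuses an incoming one when it exists. The class name `TraceIds` and both method names are hypothetical; stripping the dashes is optional and only shortens the id.

```java
import java.util.UUID;

/** Hypothetical helper for creating and resolving trace ids. */
final class TraceIds {
    private TraceIds() {}

    /** Generates a 32-character lowercase hex id from a random UUID with the dashes removed. */
    static String newTraceId() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    /** Reuses the incoming id when it is non-blank, otherwise generates a new one. */
    static String resolve(String incoming) {
        return (incoming == null || incoming.trim().isEmpty()) ? newTraceId() : incoming;
    }
}
```

`resolve` captures the rule that a service should continue an existing trace rather than start a new one whenever the caller already supplied an id.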

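4.2 How is the traceId passed within a service?

1) Add %X{traceId} to the log pattern in the logging XML configuration.

2) Add an interceptor that intercepts every request, reads the traceId from the request header, and puts it into MDC. If the header is missing, generate a new id with UUID.

    import java.util.UUID;
    import javax.servlet.http.HttpServletRequest;   // jakarta.servlet.* on Spring Boot 3+
    import javax.servlet.http.HttpServletResponse;
    import lombok.extern.slf4j.Slf4j;
    import org.slf4j.MDC;
    import org.springframework.stereotype.Component;
    import org.springframework.web.servlet.HandlerInterceptor;
    import org.springframework.web.servlet.ModelAndView;

    @Slf4j
    @Component
    public class LogInterceptor implements HandlerInterceptor {

        private static final String TRACE_ID = "traceId";

        @Override
        public boolean preHandle(HttpServletRequest request, HttpServletResponse response,
                                 Object handler) throws Exception {
            // Continue the caller's trace when the header is present, otherwise start a new one.
            String traceId = request.getHeader(TRACE_ID);
            if (traceId == null || traceId.isEmpty()) {
                traceId = UUID.randomUUID().toString();
            }
            MDC.put(TRACE_ID, traceId);
            return true;
        }

        @Override
        public void postHandle(HttpServletRequest request, HttpServletResponse response,
                               Object handler, ModelAndView modelAndView) throws Exception {
        }

        @Override
        public void afterCompletion(HttpServletRequest request, HttpServletResponse response,
                                    Object handler, Exception ex) throws Exception {
            // Container threads are pooled: remove the id so it cannot leak into the next request.
            MDC.remove(TRACE_ID);
        }
    }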

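3) Register the interceptor.

    import javax.annotation.Resource;   // jakarta.annotation.Resource on Spring Boot 3+
    import org.springframework.context.annotation.Configuration;
    import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
    import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

    @Configuration
    public class WebConfig implements WebMvcConfigurer {

        @Resource
        private LogInterceptor logInterceptor;

        @Override
        public void addInterceptors(InterceptorRegistry registry) {
            // Apply the trace interceptor to every request path.
            registry.addInterceptor(logInterceptor)
                    .addPathPatterns("/**");
        }
    }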
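Step 1) only names the %X{traceId} placeholder, so here is a sketch of where it could sit. It assumes Logback as the logging backend (the text only says "xml"); the appender name CONSOLE and the rest of the pattern are illustrative choices rather than the project's actual configuration.

```xml
<configuration>
  <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <!-- %X{traceId} prints the MDC value stored under the key "traceId" -->
      <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] [%X{traceId}] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <root level="INFO">
    <appender-ref ref="CONSOLE"/>
  </root>
</configuration>
```

With a pattern like this, every log line written on a request thread carries the id in brackets, and searching the logs for that id returns the whole request.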

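4.3 How is the traceId passed between services?

Wrap outgoing HTTP calls in a utility class that adds the traceId to the request headers, so that it travels to the next service.

    import java.net.URI;
    import java.net.URISyntaxException;
    import lombok.extern.slf4j.Slf4j;
    import org.slf4j.MDC;
    import org.springframework.http.HttpHeaders;
    import org.springframework.http.HttpMethod;
    import org.springframework.http.HttpStatus;
    import org.springframework.http.RequestEntity;
    import org.springframework.http.ResponseEntity;
    import org.springframework.util.MultiValueMap;
    import org.springframework.web.client.RestTemplate;

    @Slf4j
    public class HttpUtils {

        // RestTemplate is thread-safe once constructed, so one shared instance is enough.
        private static final RestTemplate REST_TEMPLATE = new RestTemplate();

        public static String get(String url) throws URISyntaxException {
            MultiValueMap<String, String> headers = new HttpHeaders();
            // Forward the current thread's traceId, if there is one, to the downstream service.
            String traceId = MDC.get("traceId");
            if (traceId != null) {
                headers.add("traceId", traceId);
            }
            URI uri = new URI(url);
            RequestEntity<?> requestEntity = new RequestEntity<>(headers, HttpMethod.GET, uri);
            ResponseEntity<String> exchange = REST_TEMPLATE.exchange(requestEntity, String.class);
            if (exchange.getStatusCode().equals(HttpStatus.OK)) {
                log.info("send http request success");
            }
            return exchange.getBody();
        }
    }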
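The same header propagation can be sketched with only the JDK's java.net.http API (Java 11+), which may help when RestTemplate is not available. `TracedRequests` and its method are hypothetical names, and the traceId is passed in as a parameter here instead of being read from MDC so that the sketch has no logging dependency.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that builds outgoing requests carrying the trace id. */
final class TracedRequests {
    static final String TRACE_HEADER = "traceId";

    private TracedRequests() {}

    /** Builds a GET request and adds the traceId header when an id is available. */
    static HttpRequest get(String url, String traceId) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url)).GET();
        if (traceId != null && !traceId.isEmpty()) {
            builder.header(TRACE_HEADER, traceId);
        }
        return builder.build();
    }
}
```

The request is only built, not sent. In a real service it would be passed to an HttpClient, and the receiving service's interceptor from section 4.2 would pick the header up again.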

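4.4 How is the traceId passed across threads?

Spring projects use many thread pools, for example for @Async calls, ZooKeeper, and Kafka. Most of these integration points accept a custom thread pool implementation, which gives us a place to hook in.

Take @Async as an example.

How it works:

MDC is built on ThreadLocal, so its data is shared only within one thread. Business methods often hand work to other threads, and the MDC content does not follow that work to the other thread. We therefore need an extra step to carry over the values held in the ThreadLocal.

MDC provides a method called getCopyOfContextMap, which returns a copy of the map bound to the current thread. The remaining step is to bind that map to the worker thread with setContextMap before the task runs.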
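The capture-and-restore idea can be shown without Spring or a logging library. In this sketch `ContextPropagation.CONTEXT` is a hypothetical stand-in for the thread-bound map inside MDC, and the comments mark which real MDC call each line corresponds to.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical demo of carrying a thread-bound context over to a pool thread. */
final class ContextPropagation {
    // Stand-in for the ThreadLocal map that MDC manages internally.
    static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    private ContextPropagation() {}

    /** Captures the submitting thread's context now and restores it around the task later. */
    static Runnable wrap(Runnable task) {
        Map<String, String> snapshot = new HashMap<>(CONTEXT.get()); // like MDC.getCopyOfContextMap()
        return () -> {
            CONTEXT.set(new HashMap<>(snapshot));                    // like MDC.setContextMap(...)
            try {
                task.run();
            } finally {
                CONTEXT.remove();                                    // like MDC.clear(): pool threads are reused
            }
        };
    }

    /** Looks up "traceId" on a pool thread, with or without wrapping, and returns what it saw. */
    static String traceIdSeenByWorker(boolean wrapped) {
        String[] seen = new String[1];
        Runnable read = () -> seen[0] = CONTEXT.get().get("traceId");
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(wrapped ? wrap(read) : read).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return seen[0];
    }
}
```

After the calling thread stores a traceId in `CONTEXT`, `traceIdSeenByWorker(false)` returns null while `traceIdSeenByWorker(true)` returns the caller's id. The snapshot has to be taken on the submitting thread, at the moment `wrap` is called and before the task is handed to the pool.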

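Customize Spring's async thread pool so that every submitted task is wrapped.

    import java.util.Arrays;
    import java.util.Map;
    import java.util.concurrent.Executor;
    import lombok.extern.slf4j.Slf4j;
    import org.slf4j.MDC;
    import org.springframework.aop.interceptor.AsyncUncaughtExceptionHandler;
    import org.springframework.core.task.TaskDecorator;
    import org.springframework.scheduling.annotation.AsyncConfigurer;
    import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
    import org.springframework.stereotype.Component;

    // Note: @Async only takes effect when @EnableAsync is present on a configuration class.
    @Slf4j
    @Component
    public class TraceAsyncConfigurer implements AsyncConfigurer {

        @Override
        public Executor getAsyncExecutor() {
            ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
            executor.setCorePoolSize(8);
            executor.setMaxPoolSize(16);
            executor.setQueueCapacity(100);
            executor.setThreadNamePrefix("async-pool-");
            // The decorator copies the caller's MDC into each task.
            executor.setTaskDecorator(new MdcTaskDecorator());
            executor.setWaitForTasksToCompleteOnShutdown(true);
            executor.initialize();
            return executor;
        }

        @Override
        public AsyncUncaughtExceptionHandler getAsyncUncaughtExceptionHandler() {
            // Pass the throwable as the last argument so that the stack trace is logged too.
            return (throwable, method, params) ->
                    log.error("async execute error, method={}, params={}",
                            method.getName(), Arrays.toString(params), throwable);
        }

        public static class MdcTaskDecorator implements TaskDecorator {
            @Override
            public Runnable decorate(Runnable runnable) {
                // Runs on the submitting thread: capture its MDC here.
                Map<String, String> contextMap = MDC.getCopyOfContextMap();
                return () -> {
                    // Runs on the pool thread: restore the captured MDC.
                    if (contextMap != null) {
                        MDC.setContextMap(contextMap);
                    }
                    try {
                        runnable.run();
                    } finally {
                        MDC.clear();
                    }
                };
            }
        }
    }

For pools built directly on ThreadPoolExecutor, the same wrapping can be done in a subclass:

    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.Callable;
    import java.util.concurrent.Future;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;
    import org.slf4j.MDC;

    public class MDCLogThreadPoolExecutor extends ThreadPoolExecutor {

        public MDCLogThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime,
                                        TimeUnit unit, BlockingQueue<Runnable> workQueue) {
            super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
        }

        @Override
        public void execute(Runnable command) {
            super.execute(executeRunable(command, MDC.getCopyOfContextMap()));
        }

        // submit() delegates to execute() internally, so submitted tasks are wrapped twice
        // with the same context; this is redundant but harmless.
        @Override
        public Future<?> submit(Runnable task) {
            return super.submit(executeRunable(task, MDC.getCopyOfContextMap()));
        }

        @Override
        public <T> Future<T> submit(Callable<T> callable) {
            return super.submit(submitCallable(callable, MDC.getCopyOfContextMap()));
        }

        public static Runnable executeRunable(Runnable runnable, Map<String, String> mdcContext) {
            return () -> {
                // Start from the submitter's context, or from an empty one if it had none.
                if (mdcContext == null) {
                    MDC.clear();
                } else {
                    MDC.setContextMap(mdcContext);
                }
                try {
                    runnable.run();
                } finally {
                    MDC.clear();
                }
            };
        }

        private static <T> Callable<T> submitCallable(Callable<T> callable, Map<String, String> context) {
            return () -> {
                if (context == null) {
                    MDC.clear();
                } else {
                    MDC.setContextMap(context);
                }
                try {
                    return callable.call();
                } finally {
                    MDC.clear();
                }
            };
        }
    }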

Next, override the methods of ThreadPoolTaskExecutor:

    package com.example.demo.common.threadpool;

    import com.example.demo.common.constant.Constants;
    import lombok.extern.slf4j.Slf4j;
    import org.slf4j.MDC;
    import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

    import java.util.Map;
    import java.util.UUID;
    import java.util.concurrent.Callable;
    import java.util.concurrent.Future;

    /**
     * MDC-aware thread pool that passes the logging context on to worker threads.
     *
     * @author wangbo
     * @date 2021/5/13
     */
    @Slf4j
    public class MdcTaskExecutor extends ThreadPoolTaskExecutor {

        @Override
        public <T> Future<T> submit(Callable<T> task) {
            log.info("mdc thread pool task executor submit");
            Map<String, String> context = MDC.getCopyOfContextMap();
            return super.submit(() -> {
                T result;
                if (context != null) {
                    // Pass the parent thread's MDC content to the worker thread.
                    MDC.setContextMap(context);
                } else {
                    // The parent had no MDC: give the worker thread a fresh id.
                    MDC.put(Constants.LOG_MDC_ID, UUID.randomUUID().toString().replace("-", ""));
                }
                try {
                    // Run the task.
                    result = task.call();
                } finally {
                    try {
                        MDC.clear();
                    } catch (Exception e) {
                        log.warn("MDC clear exception", e);
                    }
                }
                return result;
            });
        }

        @Override
        public void execute(Runnable task) {
            log.info("mdc thread pool task executor execute");
            Map<String, String> context = MDC.getCopyOfContextMap();
            super.execute(() -> {
                if (context != null) {
                    // Pass the parent thread's MDC content to the worker thread.
                    MDC.setContextMap(context);
                } else {
                    // The parent had no MDC: give the worker thread a fresh id.
                    MDC.put(Constants.LOG_MDC_ID, UUID.randomUUID().toString().replace("-", ""));
                }
                try {
                    // Run the task.
                    task.run();
                } finally {
                    try {
                        MDC.clear();
                    } catch (Exception e) {
                        log.warn("MDC clear exception", e);
                    }
                }
            });
        }
    }

Then configure the thread pools with the custom subclass MdcTaskExecutor:

    import java.util.concurrent.Executor;
    import java.util.concurrent.ThreadPoolExecutor;
    import lombok.extern.slf4j.Slf4j;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    /**
     * Thread pool configuration.
     *
     * @author wangbo
     * @date 2021/5/13
     */
    @Slf4j
    @Configuration
    public class ThreadPoolConfig {

        /**
         * Pool for asynchronous tasks.
         * Runs ordinary async requests, which carry the MDC id of the request path.
         */
        @Bean
        public Executor commonThreadPool() {
            log.info("start init common thread pool");
            MdcTaskExecutor executor = new MdcTaskExecutor();
            // Core pool size
            executor.setCorePoolSize(10);
            // Maximum pool size
            executor.setMaxPoolSize(20);
            // Queue capacity
            executor.setQueueCapacity(3000);
            // Keep-alive time of idle threads, in seconds
            executor.setKeepAliveSeconds(120);
            // Name prefix of the pool's threads
            executor.setThreadNamePrefix("common-thread-pool-");
            // When the pool and queue are full, discard the oldest queued task
            executor.setRejectedExecutionHandler(new ThreadPoolExecutor.DiscardOldestPolicy());
            // Initialize
            executor.initialize();
            return executor;
        }

        /**
         * Pool for scheduled tasks.
         * Runs self-started tasks. The parent thread has no MDC id to pass on, so a new one is set.
         * Apart from the name, it is configured like the pool above.
         */
        @Bean
        public Executor scheduleThreadPool() {
            log.info("start init schedule thread pool");
            MdcTaskExecutor executor = new MdcTaskExecutor();
            executor.setCorePoolSize(10);
            executor.setMaxPoolSize(20);
            executor.setQueueCapacity(3000);
            executor.setKeepAliveSeconds(120);
            executor.setThreadNamePrefix("schedule-thread-pool-");
            executor.setRejectedExecutionHandler(new ThreadPoolExecutor.DiscardOldestPolicy());
            executor.initialize();
            return executor;
        }
    }

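5. Extensions

5.1 Log tracing across JSF interfaces

The project also uses a large number of JSF interfaces. The same idea applies to passing the traceId between services over JSF.

Consumer side:

    // TODO: this cannot be used this way inside a filter
    RpcContext.getContext().setAttachment("user", "zhanggeng");
    RpcContext.getContext().setAttachment(".passwd", "11112222"); // keys starting with "." correspond to hide=true above
    xxxService.yyy(); // then make the remote call

    // Important: the attachments are removed after a call, so set them again before the next one
    RpcContext.getContext().setAttachment("user", "zhanggeng");
    RpcContext.getContext().setAttachment(".passwd", "11112222"); // keys starting with "." correspond to hide=true above
    xxxService.zzz(); // then make the remote call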

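Provider side:

1. In a filter, read the value directly from the invocation. This includes parameters marked as hidden, which cannot be obtained through RpcContext.

    String consumerToken = (String) invocation.getAttachment(".passwd");

2. In the provider's business code, read the value directly from RpcContext.

    String user = RpcContext.getContext().getAttachment("user");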

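Tip: implicit parameters in a call chain

Note: in a call chain such as A -> B -> C where both A and B pass implicit parameters, data pollution can occur because B handles the inbound call and makes the outbound call on the same thread. For example, A sends parameter P1 to B; B receives P1 and wants to send parameter P2 to C; C then receives both P1 and P2. To avoid this, after B has read P1 and before it sets P2 and calls C, B must clear its context data itself (RpcContext.getContext().clearAttachments();).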
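The pollution scenario can be modelled in a few lines. `AttachmentDemo` is a hypothetical model of thread-bound attachments written for this illustration; it does not use or reproduce the JSF RpcContext API, and the keys p1 and p2 simply mirror P1 and P2 from the note above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of thread-bound RPC attachments; this is not the JSF RpcContext API. */
final class AttachmentDemo {
    private static final ThreadLocal<Map<String, String>> ATTACHMENTS =
            ThreadLocal.withInitial(HashMap::new);

    private AttachmentDemo() {}

    /** Simulates service B: it received p1 from A, sets p2, and calls C. Returns what C would receive. */
    static Map<String, String> sentFromBToC(boolean clearFirst) {
        ATTACHMENTS.get().put("p1", "from-A");   // left over from the inbound call A -> B
        if (clearFirst) {
            ATTACHMENTS.get().clear();           // plays the role of clearAttachments()
        }
        ATTACHMENTS.get().put("p2", "from-B");
        Map<String, String> outbound = new HashMap<>(ATTACHMENTS.get());
        ATTACHMENTS.remove();                    // reset so that repeated runs do not interfere
        return outbound;
    }
}
```

Without clearing, the map sent to C contains both p1 and p2; with clearing it contains only p2. If the traceId itself travels as an attachment, B would presumably need to set it again after clearing so that the trace continues to C.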

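5.2 Returning the traceId in API responses

With MDC in place, we can go one step further and return the traceId together with failed responses, from the user's very first call onward. When a user reports a problem, the traceId they give us is enough to trace every related request along the whole path, which closes the information loop.

[Figure: result]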
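One possible shape for such a response is sketched below. `ApiResult` and its field names are hypothetical and not taken from the project. In a real handler the traceId argument would come from MDC.get("traceId"); it is passed in explicitly here to keep the sketch free of dependencies.

```java
/** Hypothetical response envelope; the field names are illustrative only. */
final class ApiResult<T> {
    final int code;
    final String message;
    final T data;
    final String traceId;

    private ApiResult(int code, String message, T data, String traceId) {
        this.code = code;
        this.message = message;
        this.data = data;
        this.traceId = traceId;
    }

    /** Successful response; the traceId is included so that any call can be looked up later. */
    static <T> ApiResult<T> ok(T data, String traceId) {
        return new ApiResult<>(0, "OK", data, traceId);
    }

    /** Failed response; the traceId lets support staff find the matching logs. */
    static <T> ApiResult<T> error(int code, String message, String traceId) {
        return new ApiResult<>(code, message, null, traceId);
    }
}
```

An alternative with the same effect is to write the id into a response header in the interceptor, which avoids changing the response body format.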

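6. Notes

Once the principle of log tracing is clear, many more scenarios can be covered: MQ and JD's other middleware can be traced with the same approach. Understanding the underlying mechanism also explains how monitoring middleware such as JD's PFinder works. For lack of time, this session does not go further. If you look into the SkyWalking distributed tracing system on your own, you will find that the same core idea is at work.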