springcloud - a single service cannot handle new requests

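Preface:
We recently hit the following problem in production:
In a springcloud distributed environment, service B could not handle newly arriving requests, and the zuul service kept running its fallback logic.
Call chain: zuul -> service A -> service B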

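Problem analysis:
1. First check whether the tomcat connections are exhausted, using netstat -nat | grep -i "<service B port>" | grep ESTABLISHED | wc -l. The connection count was over 300, while tomcat's default maximum thread count is 200, which suggests that all tomcat threads may already be in use.
2. Using the jvisualvm tool, the thread count of service B was indeed above 200, which confirmed the first point. A thread dump showed that the root cause was threads stuck executing one piece of code.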
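
The thread-dump step can also be approximated from inside the JVM. The following is a minimal sketch, not the procedure used in the incident: it assumes the stuck threads share a name prefix, and the `worker-` prefix and the class name are made up for illustration (tomcat NIO worker threads are usually named like `http-nio-<port>-exec-N`).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

final class StuckThreadFinder {

    /** Returns "name [state] at topFrame" for every live thread whose name starts with the prefix. */
    static List<String> describeThreads(String namePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> result = new ArrayList<>();
        // false, false: skip lock details, the stack traces are enough here
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || !info.getThreadName().startsWith(namePrefix)) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            String top = stack.length > 0 ? stack[0].toString() : "<no stack>";
            result.add(info.getThreadName() + " [" + info.getThreadState() + "] at " + top);
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate three request threads stuck in a long blocking call.
        for (int i = 0; i < 3; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(60_000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "worker-" + i);
            t.setDaemon(true);
            t.start();
        }
        Thread.sleep(200); // give the threads time to reach sleep()
        describeThreads("worker-").forEach(System.out::println);
    }
}
```

If many threads report the same top frame, that frame is a good candidate for the code the threads are stuck in.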

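The problem has been fixed, but it still did not feel fully resolved:
1. What caused the zuul service to run its fallback logic?
2. Why did the tomcat connections of the internal service B stay fully occupied and unreleased even though zuul had already fallen back?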

Regarding question 1
1. Debugging showed that a hystrix timeout exception triggered the fallback logic. The configured timeout is 30 seconds:

hystrix:
  command:
    A:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 30000

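Regarding question 2
1. Check the ribbon read timeout configuration: ribbon.ReadTimeout: 60000
2. Once a connection between services is established, a call that takes longer than 60 seconds fails with ribbon's socket read timeout exception. Combined with the analysis of question 1, the hystrix timeout for zuul's requests to service A (30 seconds) is shorter than the ribbon read timeout (60 seconds). In this situation, once a request takes longer than 30 seconds it goes to the fallback logic, but the call from service A to service B has not yet reached the ribbon read timeout, so the connection stays open.
Conclusion: this configuration is not quite reasonable. It is recommended that the timeout for calls between services be shorter than the hystrix timeout.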
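
The effect of the two mismatched timeouts can be reproduced without springcloud. The sketch below is only an analogy, not hystrix or ribbon code: the outer `Future.get` timeout stands in for the hystrix timeout, the sleeping task stands in for the call from service A to service B, and the durations are scaled down to milliseconds. All names are made up for illustration.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicBoolean;

final class TimeoutMismatchDemo {

    /**
     * Runs a "downstream call" lasting innerMillis behind an outer timeout of outerMillis.
     * Returns true if the caller has already fallen back while the downstream call is still running.
     */
    static boolean fallbackWhileDownstreamBusy(long outerMillis, long innerMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicBoolean downstreamDone = new AtomicBoolean(false);
        try {
            Future<String> call = pool.submit(() -> {
                Thread.sleep(innerMillis); // service B is still working on the request
                downstreamDone.set(true);
                return "real response";
            });
            try {
                call.get(outerMillis, TimeUnit.MILLISECONDS);
                return false; // answered in time, no fallback
            } catch (TimeoutException e) {
                // Fallback path: the caller gives up, but nothing has cancelled the downstream work.
                return !downstreamDone.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            pool.shutdownNow(); // clean up the demo thread
        }
    }

    public static void main(String[] args) {
        System.out.println(fallbackWhileDownstreamBusy(300, 600)); // prints true
        System.out.println(fallbackWhileDownstreamBusy(600, 100)); // prints false
    }
}
```

The first call mirrors the production setup: the outer timeout fires first, the caller falls back, and the downstream work keeps its thread and connection busy until its own, longer limit is reached.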

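The analysis touched on the concepts of tomcat connection count and thread count, which made me wonder:
When are the tomcat connections really exhausted? If no new connection can be established, is an error reported, and which one?

With these questions in mind, I wrote some code to verify:
1. Custom tomcat container configuration: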

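// Imports assume Spring Boot 1.4/1.5 (the embedded container API was renamed in 2.x)
import org.apache.catalina.connector.Connector;
import org.apache.coyote.http11.Http11NioProtocol;
import org.springframework.boot.context.embedded.EmbeddedServletContainer;
import org.springframework.boot.context.embedded.tomcat.TomcatEmbeddedServletContainerFactory;
import org.springframework.boot.web.servlet.ServletContextInitializer;
import org.springframework.stereotype.Component;

@Component
public class MyEmbeddedServletContainerFactory extends TomcatEmbeddedServletContainerFactory {
    @Override
    public EmbeddedServletContainer getEmbeddedServletContainer(ServletContextInitializer... initializers){
        // Set the port
        this.setPort(8081);
        return super.getEmbeddedServletContainer(initializers);
    }

    @Override
    protected void customizeConnector(Connector connector){
        super.customizeConnector(connector);
        Http11NioProtocol protocol = (Http11NioProtocol)connector.getProtocolHandler();
        // Set the maximum number of connections, default 10000
        protocol.setMaxConnections(50);
        // Set the accept queue (fully established connection queue) size, default 100
        protocol.setAcceptCount(200);
        // Set the maximum number of threads, default 200
        protocol.setMaxThreads(99);
        // Set the connection timeout, default 20000 ms; tomcat also uses this value as its read timeout
        protocol.setConnectionTimeout(100);
    }
}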

2. Write the controller:

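import java.util.concurrent.atomic.AtomicInteger;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ConnectTest {
    private final AtomicInteger count = new AtomicInteger();
    @GetMapping("/test")
    public void test(){
        // Print the number of requests that have reached the controller so far
        System.out.println(count.incrementAndGet());
        try {
            // Sleep for a long time so the worker thread stays blocked
            Thread.sleep(1000* 10000L);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}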

3. Write the test code:

import org.springframework.web.client.RestTemplate;

public class ControllerTest {
    /** Number of client threads, one request each */
    private static final int THREAD_SIZE = 100;
    public static void main(String[] args){
        String url = "http://127.0.0.1:8081/test";
        for (int i = 0; i < THREAD_SIZE; i++){
            Thread t = new Thread(()->{
                RestTemplate template = new RestTemplate();
                template.getForEntity(url , Object.class);
            });
            t.start();
        }
    }
}

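Test runs:
1. Set THREAD_SIZE=149 (tomcat max connections + tomcat max worker threads)
Expected: all connections are established
Result: every request established a connection, as expected
2. Set THREAD_SIZE=150 (tomcat max connections + tomcat max worker threads + 1)
Expected: the last thread fails to connect
Result: every request established a connection, which did not match the expectation. A Baidu search turned up posts saying that acceptCount also affects the upper limit on connections
3. Set THREAD_SIZE=250 (tomcat max connections + tomcat accept queue size)
Expected: all connections are established
Result: every request established a connection, as expected
4. Set THREAD_SIZE=251 (tomcat max connections + tomcat accept queue size + 1)
Expected: the last thread fails to connect
Result: the request from the last thread failed with a connect refused error, as expected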

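After repeated testing with different configurations and test code, I reached these conclusions:
1. Maximum number of connections the tomcat container can accept: max connections + (the larger of the accept queue size and the max thread count)
2. Once the connections are full, the server rejects new client requests and the client reports a connect refused error.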
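
As a worked example of conclusion 1, the sketch below plugs in the values from the test configuration. It only encodes the empirical rule observed in these tests, not a documented tomcat formula, and the exact limit may differ by operating system. The class and method names are made up for illustration.

```java
final class TomcatCapacityEstimate {

    /** Empirical rule from the tests above: maxConnections + max(acceptCount, maxThreads). */
    static int estimatedMaxConnections(int maxConnections, int acceptCount, int maxThreads) {
        return maxConnections + Math.max(acceptCount, maxThreads);
    }

    public static void main(String[] args) {
        // Values from the test configuration: maxConnections=50, acceptCount=200, maxThreads=99
        System.out.println(estimatedMaxConnections(50, 200, 99)); // prints 250
    }
}
```

This matches test runs 3 and 4: 250 clients all connected, and the 251st was refused.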

Further study:

socket timeouts
1. socket connect timeout

Excerpt from the source of the java.net.Socket.connect method:
 * Connects this socket to the server with a specified timeout value.
 * A timeout of zero is interpreted as an infinite timeout. The connection
 * will then block until established or an error occurs.

public void connect(SocketAddress endpoint, int timeout) throws IOException

2. socket read timeout

Excerpt from the source of the java.net.Socket.setSoTimeout method:
 *  Enable/disable {@link SocketOptions#SO_TIMEOUT SO_TIMEOUT}
 *  with the specified timeout, in milliseconds. With this option set
 *  to a non-zero timeout, a read() call on the InputStream associated with
 *  this Socket will block for only this amount of time.  If the timeout
 *  expires, a <B>java.net.SocketTimeoutException</B> is raised, though the
 *  Socket is still valid. The option <B>must</B> be enabled
 *  prior to entering the blocking operation to have effect. The
 *  timeout must be {@code > 0}.
 *  A timeout of zero is interpreted as an infinite timeout.

public synchronized void setSoTimeout(int timeout) throws SocketException
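
Both timeouts can be tried out with a small local experiment. This is a minimal sketch, assuming a loopback server that accepts connections at the TCP level but never writes anything; the class name and timeout values are made up for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class SocketTimeoutDemo {

    /**
     * Connects to a local server that never sends data and tries to read from it.
     * Returns true if read() timed out and the socket was still open afterwards.
     */
    static boolean readTimesOut(int connectTimeoutMillis, int readTimeoutMillis) {
        // Port 0 lets the OS pick a free port; the server never calls accept() or writes.
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             Socket client = new Socket()) {
            // Connect timeout: how long connect() may block while the TCP handshake completes.
            client.connect(new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                    connectTimeoutMillis);
            // Read timeout: must be set before the blocking read() starts.
            client.setSoTimeout(readTimeoutMillis);
            try {
                client.getInputStream().read();
                return false; // unexpected: data or end of stream arrived
            } catch (SocketTimeoutException e) {
                return !client.isClosed(); // the socket remains valid after the timeout
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readTimesOut(1000, 200)); // prints true
    }
}
```

The connect succeeds even though the server never calls accept(), because the operating system completes the handshake and parks the connection in the accept queue.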

3. socket write timeout

A socket has no built-in write timeout, but you can implement one yourself; for example, tomcat 8 uses Nio2SocketWrapper to wrap its own write timeout.

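Open questions (to think about when I come back to this):

1. What is the difference between tcp long-lived and short-lived connections?
2. How should statelessness be understood?
3. What is the accept queue (fully established connection queue)?
4. What size does the accept queue take?
The accept queue size is not necessarily the backlog value; it is the smaller of backlog and somaxconn (an OS-level system parameter)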
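
The rule in question 4 can be written out as a short calculation. This is a sketch under assumptions: the `/proc/sys/net/core/somaxconn` path only exists on Linux, and the fallback value of 128 and all names are chosen purely for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

final class AcceptQueueSize {

    /** Effective accept queue size: the smaller of the requested backlog and somaxconn. */
    static int effectiveBacklog(int backlog, int somaxconn) {
        return Math.min(backlog, somaxconn);
    }

    /** Reads somaxconn on Linux; returns the fallback if the file is missing or unreadable. */
    static int readSomaxconn(int fallback) {
        try {
            byte[] raw = Files.readAllBytes(Paths.get("/proc/sys/net/core/somaxconn"));
            return Integer.parseInt(new String(raw).trim());
        } catch (IOException | RuntimeException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        int somaxconn = readSomaxconn(128); // 128 is an assumed fallback, not a measured value
        // 200 is the acceptCount used in the test configuration
        System.out.println(effectiveBacklog(200, somaxconn));
    }
}
```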
