Redis: Notes on troubleshooting "read error on connection"

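Starting from the error

Version information

A long-running (memory-resident) PHP process connects to Redis and then calls brpop periodically with a blocking timeout of 10s. After a few days (the interval is irregular), the process hangs. The symptoms are:

  1. netstat shows that the client connection from the php process to redis still exists (ESTABLISHED)
  2. tcpdump on the client machine prints no packets at all (no communication?)
  3. strace on the php process prints no system calls (where is it blocked?)
  4. On redis-server, client list no longer contains this client (was it removed?)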

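Why doesn't the phpredis client connection drop?

To understand phpredis connections, the following points need to be clear:

  1. The timeout parameter of connect() is 0
  2. ini_set('default_socket_timeout', -1)
  3. setOption(\Redis::OPT_READ_TIMEOUT, -1)
  4. pconnect

The timeout parameter of connect

Parameters: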

  • host: string. can be a host, or the path to a unix domain socket. Starting from version 5.0.0 it is possible to specify schema
  • port: int, optional
  • timeout: float, value in seconds (optional, default is 0 meaning unlimited)
  • reserved: should be NULL if retry_interval is specified
  • retry_interval: int, value in milliseconds (optional)
  • read_timeout: float, value in seconds (optional, default is 0 meaning unlimited)

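The timeout here is the connection timeout. Calling connect makes the client perform the three-way handshake with the server to establish a TCP connection. Because the network may be unreliable, a timeout can be specified: if the client and server fail to establish the connection within that limit, connect returns false.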
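
To make the connection timeout concrete, here is a minimal C sketch of the usual technique behind such a parameter: a non-blocking connect() followed by poll(). The helper name connect_with_timeout and its millisecond parameter are assumptions for illustration; this is not the actual phpredis or PHP implementation.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to an IPv4 address, giving up after timeout_ms.
 * A negative timeout_ms waits indefinitely.
 * Returns a connected, blocking fd, or -1 on failure or timeout. */
int connect_with_timeout(const char *ip, int port, int timeout_ms)
{
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    /* Non-blocking mode lets connect() return at once while the handshake runs. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        if (errno != EINPROGRESS) { close(fd); return -1; }

        /* The socket becomes writable once the handshake succeeds or fails. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int ready;
        do {
            ready = poll(&pfd, 1, timeout_ms);
        } while (ready < 0 && errno == EINTR);
        if (ready <= 0) { close(fd); return -1; }  /* 0 means the timeout expired */

        /* SO_ERROR tells success (0) apart from e.g. ECONNREFUSED. */
        int so_error = 0;
        socklen_t len = sizeof(so_error);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0 || so_error != 0) {
            close(fd);
            return -1;
        }
    }

    fcntl(fd, F_SETFL, flags);  /* restore blocking mode for the caller */
    return fd;
}
```

A call such as connect_with_timeout("192.168.48.4", 6379, 3000) would then play the role of connect('192.168.48.4', 6379, 3) in the PHP code: the handshake either completes within 3 seconds or the call fails.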

File: redis.c, line 935

PHP_METHOD(Redis, connect)
{
    if (redis_connect(INTERNAL_FUNCTION_PARAM_PASSTHRU, 0) == FAILURE) {
        RETURN_FALSE;
    } else {
        RETURN_TRUE;
    }
}

Here, the prototype of redis_connect is

PHP_REDIS_API int redis_connect(INTERNAL_FUNCTION_PARAMETERS, int persistent);

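A persistent value of 0 means no persistent connection is established; the case of 1 is covered below. In other words, connect creates a non-persistent connection that is closed when close is called. The source below also shows that if the object already holds a connection when a new one is being established, the old one is closed first.

File: redis.c, line 1011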

redis = PHPREDIS_GET_OBJECT(redis_object, object);
/* if there is a redis sock already we have to remove it */
if (redis->sock) {
    redis_sock_disconnect(redis->sock, 0);
    redis_free_socket(redis->sock);
}

default_socket_timeout

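This setting can be found in php.ini. Its documentation is brief: the default timeout (in seconds) for socket-based streams.

Redis speaks over TCP, so this setting affects it too. Take the read error on connection error: phpredis reports it when an operation such as get or brpop does not receive a result within default_socket_timeout. The php.ini default is 60s, and it can be changed at runtime with the built-in function ini_set.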
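
The idea of a read timeout can be shown at the socket level. The following C sketch waits for data with poll() and reports a timeout separately from EOF and errors. The names read_with_timeout and READ_TIMED_OUT are made up for this example, and the code illustrates the general technique rather than how PHP streams implement it.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_TIMED_OUT (-2)  /* hypothetical sentinel, distinct from read()'s -1 */

/* Wait up to timeout_ms for data, then read it.
 * timeout_ms < 0 blocks indefinitely, like a read timeout of -1.
 * Returns bytes read, 0 on EOF, -1 on error, READ_TIMED_OUT on timeout. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    /* Simplification: if a signal interrupts poll(), the full timeout restarts. */
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0) return -1;
    if (ready == 0) return READ_TIMED_OUT;  /* nothing arrived in time */
    return read(fd, buf, len);
}
```

The READ_TIMED_OUT branch is the situation a client library reports as a read error: the request was sent, but no reply arrived within the configured limit. For a blocking command such as brpop, this is why the read timeout has to be longer than the blocking time passed to the command.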

OPT_READ_TIMEOUT

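This is phpredis's own version of default_socket_timeout, and it achieves the same effect. So if both default_socket_timeout and OPT_READ_TIMEOUT are set, which one takes priority?

Testing shows that when both are set, OPT_READ_TIMEOUT wins, which is reasonable.

File: redis_commands.c, line 3980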

case REDIS_OPT_READ_TIMEOUT:
    redis_sock->read_timeout = zval_get_double(val);
    if (redis_sock->stream) {
        read_tv.tv_sec  = (time_t)redis_sock->read_timeout;
        read_tv.tv_usec = (int)((redis_sock->read_timeout -
                                    read_tv.tv_sec) * 1000000);
        php_stream_set_option(redis_sock->stream,
                                PHP_STREAM_OPTION_READ_TIMEOUT, 0,
                                &read_tv);
    }
    RETURN_TRUE;
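
The tv_sec / tv_usec arithmetic in the snippet above can be isolated into a small function to see what a fractional timeout turns into. This is a standalone sketch; the function name seconds_to_timeval is an assumption and does not exist in phpredis.

```c
#include <sys/time.h>

/* Split fractional seconds into whole seconds and microseconds,
 * using the same arithmetic as the REDIS_OPT_READ_TIMEOUT case. */
struct timeval seconds_to_timeval(double seconds)
{
    struct timeval tv;
    tv.tv_sec  = (time_t)seconds;                                    /* truncates toward zero */
    tv.tv_usec = (long)((seconds - (double)tv.tv_sec) * 1000000);    /* remaining fraction */
    return tv;
}
```

For example, seconds_to_timeval(1.5) yields tv_sec = 1 and tv_usec = 500000, and seconds_to_timeval(-1) yields tv_sec = -1 and tv_usec = 0.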

How does pconnect work?

File: redis.c, line 947

PHP_METHOD(Redis, pconnect)
{
    if (redis_connect(INTERNAL_FUNCTION_PARAM_PASSTHRU, 1) == FAILURE) {
        RETURN_FALSE;
    } else {
        RETURN_TRUE;
    }
}

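When establishing a connection, phpredis first takes a connection from the connection pool (the last one in the list) and removes that entry from the pool. If the connection is still alive (PHP_STREAM_OPTION_CHECK_LIVENESS), it is returned directly. If it is no longer valid, a new connection is established.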
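
A zero-timeout liveness check of the kind described above can be sketched in C as a poll() followed by a peeking recv(). This is an approximation of the idea under my own assumptions, not PHP's actual code, and the name socket_looks_alive is hypothetical.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/* Returns 1 if the socket still looks usable, 0 if the peer has closed it. */
int socket_looks_alive(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLPRI };
    char byte;

    /* Timeout 0: only ask whether something is already pending. */
    if (poll(&pfd, 1, 0) <= 0) {
        return 1;                 /* nothing pending: assume alive */
    }

    /* Peek so that real reply data is not consumed. */
    ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n > 0) return 1;          /* unread data is waiting */
    if (n == 0) return 0;         /* orderly shutdown (FIN) from the peer */
    return errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR;
}
```

Note the limitation: a check like this only notices a peer that has sent FIN or RST. If the network silently drops, nothing is pending on the socket, so the connection still looks alive. That matches the symptom at the start of this article, where the client side stays ESTABLISHED although the server has already removed the client.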

File: library.c, line 1828

if (redis_sock->persistent) {
    if (INI_INT("redis.pconnect.pooling_enabled")) {
        p = redis_sock_get_connection_pool(redis_sock);
        if (zend_llist_count(&p->list) > 0) {
            redis_sock->stream = *(php_stream **)zend_llist_get_last(&p->list);
            zend_llist_remove_tail(&p->list);
            /* Check socket liveness using 0 second timeout */
            if (php_stream_set_option(redis_sock->stream, PHP_STREAM_OPTION_CHECK_LIVENESS, 0, NULL) == PHP_STREAM_OPTION_RETURN_OK) {
                redis_sock->status = REDIS_SOCK_STATUS_CONNECTED;
                return SUCCESS;
            }
            php_stream_pclose(redis_sock->stream);
            p->nb_active--;
        }

        int limit = INI_INT("redis.pconnect.connection_limit");
        if (limit > 0 && p->nb_active >= limit) {
            redis_sock_set_err(redis_sock, "Connection limit reached", sizeof("Connection limit reached") - 1);
            return FAILURE;
        }

        gettimeofday(&tv, NULL);
        persistent_id = strpprintf(0, "phpredis_%ld%ld", tv.tv_sec, tv.tv_usec);
    } else {
        if (redis_sock->persistent_id) {
            persistent_id = strpprintf(0, "phpredis:%s:%s", host, ZSTR_VAL(redis_sock->persistent_id));
        } else {
            persistent_id = strpprintf(0, "phpredis:%s:%f", host, redis_sock->timeout);
        }
    }
    
    tv.tv_sec  = (time_t)redis_sock->timeout;
    tv.tv_usec = (int)((redis_sock->timeout - tv.tv_sec) * 1000000);
    if (tv.tv_sec != 0 || tv.tv_usec != 0) {
        tv_ptr = &tv;
    }

    redis_sock->stream = php_stream_xport_create(host, host_len,
        0, STREAM_XPORT_CLIENT | STREAM_XPORT_CONNECT,
        persistent_id ? ZSTR_VAL(persistent_id) : NULL,
        tv_ptr, NULL, &estr, &err);

    if (persistent_id) {
        zend_string_release(persistent_id);
    }

    if (!redis_sock->stream) {
        if (estr) {
            redis_sock_set_err(redis_sock, ZSTR_VAL(estr), ZSTR_LEN(estr));
            zend_string_release(estr);
        }
        return FAILURE;
    }

    if (p) p->nb_active++;

    /* Attempt to set TCP_NODELAY/TCP_KEEPALIVE if we're not using a unix socket. */
    if (!usocket) {
        php_netstream_data_t *sock = (php_netstream_data_t*)redis_sock->stream->abstract;
        err = setsockopt(sock->socket, IPPROTO_TCP, TCP_NODELAY, (char*) &tcp_flag, sizeof(tcp_flag));
        PHPREDIS_NOTUSED(err);
        err = setsockopt(sock->socket, SOL_SOCKET, SO_KEEPALIVE, (char*) &redis_sock->tcp_keepalive, sizeof(redis_sock->tcp_keepalive));
        PHPREDIS_NOTUSED(err);
    }

    php_stream_auto_cleanup(redis_sock->stream);

    read_tv.tv_sec  = (time_t)redis_sock->read_timeout;
    read_tv.tv_usec = (int)((redis_sock->read_timeout - read_tv.tv_sec) * 1000000);

    if (read_tv.tv_sec != 0 || read_tv.tv_usec != 0) {
        php_stream_set_option(redis_sock->stream,PHP_STREAM_OPTION_READ_TIMEOUT,
            0, &read_tv);
    }
    php_stream_set_option(redis_sock->stream,
        PHP_STREAM_OPTION_WRITE_BUFFER, PHP_STREAM_BUFFER_NONE, NULL);

    redis_sock->status = REDIS_SOCK_STATUS_CONNECTED;

    return SUCCESS;
}

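Here is the key part. Note this section of the code above; I'll leave it as a teaser for now and analyze it in detail when we get to tcp_keepalive.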

/* Attempt to set TCP_NODELAY/TCP_KEEPALIVE if we're not using a unix socket. */
if (!usocket) {
    php_netstream_data_t *sock = (php_netstream_data_t*)redis_sock->stream->abstract;
    err = setsockopt(sock->socket, IPPROTO_TCP, TCP_NODELAY, (char*) &tcp_flag, sizeof(tcp_flag));
    PHPREDIS_NOTUSED(err);
    err = setsockopt(sock->socket, SOL_SOCKET, SO_KEEPALIVE, (char*) &redis_sock->tcp_keepalive, sizeof(redis_sock->tcp_keepalive));
    PHPREDIS_NOTUSED(err);
}

Why does redis-server remove the client?

First, a quick review of how TCP keepalive works.

Simulating TCP keepalive

Starting the communication

Start a TCP server

nc -lp 9999

Start a client and connect to the server

./nckl-linux -K -O 15 -I 5 -P 5 127.0.0.1 9999

netcat-keepalive options

  • -K Turn on TCP Keepalive
  • -O secs TCP keepalive timeout
  • -I secs TCP keepalive interval
  • -P count TCP keepalive probe count

If these are not set, the system defaults apply. On Linux, for example:

sysctl -a | grep keepalive
  • net.ipv4.tcp_keepalive_time = 7200
  • net.ipv4.tcp_keepalive_probes = 9
  • net.ipv4.tcp_keepalive_intvl = 75

Use tcpdump to look at the packets

18:15:24.852471 IP localhost.45698 > localhost.9999: Flags [S], seq 253066745, win 43690, options [mss 65495,sackOK,TS val 23438901 ecr 0,nop,wscale 7], length 0
18:15:24.852510 IP localhost.9999 > localhost.45698: Flags [S.], seq 2889588682, ack 253066746, win 43690, options [mss 65495,sackOK,TS val 23438901 ecr 23438901,nop,wscale 7], length 0
18:15:24.852542 IP localhost.45698 > localhost.9999: Flags [.], ack 1, win 342, options [nop,nop,TS val 23438901 ecr 23438901], length 0

18:15:32.933719 IP localhost.45698 > localhost.9999: Flags [P.], seq 1:3, ack 1, win 342, options [nop,nop,TS val 23439709 ecr 23438901], length 2
18:15:32.933814 IP localhost.9999 > localhost.45698: Flags [.], ack 3, win 342, options [nop,nop,TS val 23439709 ecr 23439709], length 0

18:15:47.962915 IP localhost.45698 > localhost.9999: Flags [.], ack 1, win 342, options [nop,nop,TS val 23441216 ecr 23439709], length 0
18:15:47.962992 IP localhost.9999 > localhost.45698: Flags [.], ack 3, win 342, options [nop,nop,TS val 23441216 ecr 23439709], length 0
18:16:03.321743 IP localhost.45698 > localhost.9999: Flags [.], ack 1, win 342, options [nop,nop,TS val 23442752 ecr 23441216], length 0
18:16:03.321802 IP localhost.9999 > localhost.45698: Flags [.], ack 3, win 342, options [nop,nop,TS val 23442752 ecr 23439709], length 0

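Look at it in three parts:

  • Part 1: the three-way handshake establishes the connection
  • Part 2: the client sends data and the server acknowledges it (here I sent the digit 1 from the client)
  • Part 3: a keepalive probe is sent every 15 seconds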
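
At the socket level, the -K, -O, -I and -P options correspond to four socket options on Linux. The sketch below shows how a program can set them for a single socket. The function name enable_keepalive is an assumption, and I have not checked that netcat-keepalive does exactly this internally.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable keepalive on one TCP socket and override the three system defaults.
 * idle_secs, interval_secs and probe_count correspond to
 * tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes.
 * Linux-specific option names. Returns 0 on success, -1 on failure. */
int enable_keepalive(int fd, int idle_secs, int interval_secs, int probe_count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs)) < 0) return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_secs, sizeof(interval_secs)) < 0) return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probe_count, sizeof(probe_count)) < 0) return -1;
    return 0;
}
```

With the values from the command above, the call would be enable_keepalive(fd, 15, 5, 5). Because every probe in the capture is acknowledged, the idle timer restarts each time, which is why the probes appear 15 seconds apart.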

Reproducing the problem with docker

Set up a local network with docker-compose

Disconnect the server container from the network

docker network disconnect docker_network docker_redis

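The phpredis client

Two cases show up here: "the PSH packet has already been sent" and "the PSH packet is being sent".

  1. The PSH packet has already been sent: after a while, the client sends a FIN packet several times in a row (the socket sits in FIN_WAIT1) and finally tears down its side of the connection
  2. The PSH packet is being sent: the client keeps retransmitting; after several retries without an acknowledgement from the server, it sends a FIN packet and tears down its side of the connection

In either case, when the client actively closes the connection, an exception is raised: read error on connection, and it can be caught. With brpop there is a catch, however. The disconnect does raise this exception, but the next brpop call no longer touches the network: because the connection is already closed, the redis client returns false immediately.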

What if the network recovers?

Simulating with docker

docker network connect docker_network docker_redis

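The timing of the recovery also falls into two cases, matching the two disconnect cases:

  1. The PSH packet has already been sent and the network goes down; the client waits 1 minute and then starts sending FIN packets. At that point the network recovers!
  2. The PSH packet is being sent and the network goes down; the client keeps retransmitting, and the network recovers before the retries run out!

Case 1: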

16:50:53.555004 IP 2388ad577c4b.38658 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 155, ack 21, win 229, options [nop,nop,TS val 19885306 ecr 19879304], length 0
16:50:53.774621 IP 2388ad577c4b.38658 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 155, ack 21, win 229, options [nop,nop,TS val 19885328 ecr 19879304], length 0
16:50:53.995675 IP 2388ad577c4b.38658 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 155, ack 21, win 229, options [nop,nop,TS val 19885350 ecr 19879304], length 0
16:50:54.425041 IP 2388ad577c4b.38658 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 155, ack 21, win 229, options [nop,nop,TS val 19885393 ecr 19879304], length 0
16:50:55.296710 IP 2388ad577c4b.38658 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 155, ack 21, win 229, options [nop,nop,TS val 19885480 ecr 19879304], length 0
16:50:57.055424 IP 2388ad577c4b.38658 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 155, ack 21, win 229, options [nop,nop,TS val 19885656 ecr 19879304], length 0
16:51:00.495806 IP 2388ad577c4b.38658 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 155, ack 21, win 229, options [nop,nop,TS val 19886000 ecr 19879304], length 0
16:51:00.496113 IP web_docker_redis.web_docker_web_network.6379 > 2388ad577c4b.38658: Flags [P.], seq 21:26, ack 156, win 227, options [nop,nop,TS val 19886000 ecr 19886000], length 5: RESP null
16:51:00.496207 IP 2388ad577c4b.38658 > web_docker_redis.web_docker_web_network.6379: Flags [R], seq 721889775, win 0, length 0

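Because the client has already sent the FIN packet, the connection is torn down even though the network has recovered. The end result is an exception on the client.

Case 2: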

16:59:45.126281 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 123:155, ack 21, win 229, options [nop,nop,TS val 19938525 ecr 19938424], length 32: RESP "BRPOP" "test" "3"
16:59:45.126422 IP web_docker_redis.web_docker_web_network.6379 > 2388ad577c4b.38666: Flags [.], ack 155, win 227, options [nop,nop,TS val 19938525 ecr 19938525], length 0
16:59:48.191229 IP web_docker_redis.web_docker_web_network.6379 > 2388ad577c4b.38666: Flags [P.], seq 21:26, ack 155, win 227, options [nop,nop,TS val 19938831 ecr 19938525], length 5: RESP null
16:59:48.191365 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [.], ack 26, win 229, options [nop,nop,TS val 19938831 ecr 19938831], length 0
16:59:49.196785 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 155:187, ack 26, win 229, options [nop,nop,TS val 19938932 ecr 19938831], length 32: RESP "BRPOP" "test" "3"
16:59:49.196919 IP web_docker_redis.web_docker_web_network.6379 > 2388ad577c4b.38666: Flags [.], ack 187, win 227, options [nop,nop,TS val 19938932 ecr 19938932], length 0
16:59:52.276131 IP web_docker_redis.web_docker_web_network.6379 > 2388ad577c4b.38666: Flags [P.], seq 26:31, ack 187, win 227, options [nop,nop,TS val 19939240 ecr 19938932], length 5: RESP null
16:59:52.276197 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [.], ack 31, win 229, options [nop,nop,TS val 19939240 ecr 19939240], length 0
16:59:53.156963 IP 2388ad577c4b.38662 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 219, ack 31, win 229, options [nop,nop,TS val 19939328 ecr 19930202], length 0
16:59:53.279121 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19939340 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
16:59:53.496082 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19939362 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
16:59:53.715753 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19939384 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
16:59:54.147245 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19939427 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
16:59:54.997751 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19939512 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
16:59:56.756647 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19939688 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
17:00:00.197701 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19940032 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
17:00:07.238143 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19940736 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
17:00:21.282035 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19942144 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
17:00:48.768290 IP 2388ad577c4b.38666 > web_docker_redis.web_docker_web_network.6379: Flags [P.], seq 187:219, ack 31, win 229, options [nop,nop,TS val 19944896 ecr 19939240], length 32: RESP "BRPOP" "test" "3"
17:00:48.768815 IP web_docker_redis.web_docker_web_network.6379 > 2388ad577c4b.38666: Flags [.], ack 219, win 227, options [nop,nop,TS val 19944896 ecr 19944896], length 0
17:00:51.830821 IP web_docker_redis.web_docker_web_network.6379 > 2388ad577c4b.38666: Flags [P.], seq 31:36, ack 219, win 227, options [nop,nop,TS val 19945202 ecr 19944896], length 5: RESP null

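The network recovered while the client was retransmitting the PSH packet. The connection is still there, the server continues to return results, and the client stops blocking and carries on.

Solution: busy reconnect

  1. Use default_socket_timeout in php.ini or phpredis's OPT_READ_TIMEOUT to set a custom value, for example 60s
  2. Set the timeout of connect to a custom value, for example 10s
  3. When the client drops the connection and raises read error on connection, catch the exception and enter a blocking loop that keeps reconnecting to redis, returning only once the connection succeeds

Code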

class PopData {
    /** @var Redis */
    private $redis = null;

    public function start()
    {
        $this->newRedis();

        while (true) {
            $data = $this->popData();
            var_dump(['data' => $data, 'time' => date('Y-m-d H:i:s', time())]);
            sleep(1);
        }
    }

    /**
     * Connect to Redis
     */
    private function newRedis()
    {
        $this->redis = new \Redis();
        $this->redis->connect('192.168.48.4', 6379, 3);
        $this->redis->auth(123456);
        $this->redis->setOption(\Redis::OPT_READ_TIMEOUT, 60);
    }

    /**
     * brpop
     * @return array
     */
    private function popData()
    {
        try {
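            // Once the FIN has been sent, brPop returns straight from the client without a network request.
            // The socket is already closed at this point, so it will not reconnect even after the network recovers.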
            $data = $this->redis->brPop(['test'], 3);

            return $data;

        } catch (\Exception $e) {
            // Printed only once
            var_dump( $e->getMessage() );

            // Enter the reconnect logic
            $this->reconnect();

            // Reconnected; return an empty result
            return [];
        }
    }

    /**
     * Reconnect to redis
     */
    private function reconnect()
    {
        $isLostConnect = true;
        while($isLostConnect) {
            try {
                $this->newRedis();

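                // Reconnected successfully (phpredis 4.x returns '+PONG'; newer versions return true)
                $pong = $this->redis->ping();
                if ($pong === '+PONG' || $pong === true) {
                    $isLostConnect = false;
                }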
            } catch (\Exception $e) {
                var_dump($e->getMessage());

                sleep(3);
            }
        }
    }
}

System and network state

Use tcpdump to see what the client sends during the periodic reconnects

17:33:18.326086 IP 2388ad577c4b.38700 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 91, ack 11, win 229, options [nop,nop,TS val 20140076 ecr 20134070], length 0
17:33:18.326556 IP 2388ad577c4b.38702 > web_docker_redis.web_docker_web_network.6379: Flags [S], seq 3300121440, win 29200, options [mss 1460,sackOK,TS val 20140076 ecr 0,nop,wscale 7], length 0
17:33:18.544393 IP 2388ad577c4b.38700 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 91, ack 11, win 229, options [nop,nop,TS val 20140098 ecr 20134070], length 0
17:33:18.767654 IP 2388ad577c4b.38700 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 91, ack 11, win 229, options [nop,nop,TS val 20140120 ecr 20134070], length 0
17:33:19.194564 IP 2388ad577c4b.38700 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 91, ack 11, win 229, options [nop,nop,TS val 20140163 ecr 20134070], length 0
17:33:19.404336 IP 2388ad577c4b.38702 > web_docker_redis.web_docker_web_network.6379: Flags [S], seq 3300121440, win 29200, options [mss 1460,sackOK,TS val 20140184 ecr 0,nop,wscale 7], length 0
17:33:20.044337 IP 2388ad577c4b.38700 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 91, ack 11, win 229, options [nop,nop,TS val 20140248 ecr 20134070], length 0
17:33:21.807982 IP 2388ad577c4b.38700 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 91, ack 11, win 229, options [nop,nop,TS val 20140424 ecr 20134070], length 0
17:33:24.329065 IP 2388ad577c4b.38704 > web_docker_redis.web_docker_web_network.6379: Flags [S], seq 717381143, win 29200, options [mss 1460,sackOK,TS val 20140676 ecr 0,nop,wscale 7], length 0
17:33:25.255734 IP 2388ad577c4b.38700 > web_docker_redis.web_docker_web_network.6379: Flags [F.], seq 91, ack 11, win 229, options [nop,nop,TS val 20140769 ecr 20134070], length 0
17:33:25.403884 IP 2388ad577c4b.38704 > web_docker_redis.web_docker_web_network.6379: Flags [S], seq 717381143, win 29200, options [mss 1460,sackOK,TS val 20140784 ecr 0,nop,wscale 7], length 0
17:34:59.783849 IP 2388ad577c4b.38738 > web_docker_redis.web_docker_web_network.6379: Flags [S], seq 1730851263, win 29200, options [mss 1460,sackOK,TS val 20150126 ecr 0,nop,wscale 7], length 0
17:34:59.784023 IP web_docker_redis.web_docker_web_network.6379 > 2388ad577c4b.38738: Flags [S.], seq 1414026707, ack 1730851264, win 28960, options [mss 1460,sackOK,TS val 20150232 ecr 20150126,nop,wscale 7], length 0

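As you can see, two "threads" (strictly speaking, two sockets: the old connection and the new one) are frantically "probing": one is trying to close, the other is trying to connect.

Use netstat to see the client's connection states during the periodic reconnects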

tcp        0      1 192.168.48.5:38700      192.168.48.4:6379       FIN_WAIT1   -
tcp        0      1 192.168.48.5:38728      192.168.48.4:6379       SYN_SENT    682/php

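Because the "connecting thread" goes through new Redis each time, the source port keeps changing.

What exactly is OPT_TCP_KEEPALIVE, and how is it used?

The official documentation has no description of this option at all. The source shows that when phpredis establishes a connection, tcp_keepalive defaults to 0

File: library.c, line 1783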

redis_sock->tcp_keepalive = 0;

The value of tcp_keepalive can be set with setOption

File: redis_commands.c, line 3991

case REDIS_OPT_TCP_KEEPALIVE:

    /* Don't set TCP_KEEPALIVE if we're using a unix socket. */
    if (ZSTR_VAL(redis_sock->host)[0] == '/' && redis_sock->port < 1) {
        RETURN_FALSE;
    }
    tcp_keepalive = zval_get_long(val) > 0 ? 1 : 0;
    if (redis_sock->tcp_keepalive == tcp_keepalive) {
        RETURN_TRUE;
    }
    if (redis_sock->stream) {
        /* set TCP_KEEPALIVE */
        sock = (php_netstream_data_t*)redis_sock->stream->abstract;
        if (setsockopt(sock->socket, SOL_SOCKET, SO_KEEPALIVE, (char*)&tcp_keepalive,
                    sizeof(tcp_keepalive)) == -1) {
            RETURN_FALSE;
        }
        redis_sock->tcp_keepalive = tcp_keepalive;
    }
    RETURN_TRUE;

The snippet below came up earlier in the pconnect discussion; now let's look at it closely

/* Attempt to set TCP_NODELAY/TCP_KEEPALIVE if we're not using a unix socket. */
if (!usocket) {
    php_netstream_data_t *sock = (php_netstream_data_t*)redis_sock->stream->abstract;
    err = setsockopt(sock->socket, IPPROTO_TCP, TCP_NODELAY, (char*) &tcp_flag, sizeof(tcp_flag));
    PHPREDIS_NOTUSED(err);
    err = setsockopt(sock->socket, SOL_SOCKET, SO_KEEPALIVE, (char*) &redis_sock->tcp_keepalive, sizeof(redis_sock->tcp_keepalive));
    PHPREDIS_NOTUSED(err);
}

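At connect time, the host is inspected to decide whether to apply TCP_KEEPALIVE. As seen earlier with connect, host can be one of the following:

host: string. can be

  • a host (IP address or domain name)
  • or the path to a unix domain socket (a local socket)
  • Starting from version 5.0.0 it is possible to specify schema

Splitting the sentence up like this makes it clearer. The code above shows that for a unix domain socket, TCP_KEEPALIVE is not applied. At connect time, however, there is no parameter for passing this option, which means the place where it is actually set is somewhere else…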

Simulating with docker

Code

test.php


$redis = new \Redis();
$redis->connect('192.168.80.2', 6379);
$redis->auth(123456);
$redis->setOption(\Redis::OPT_TCP_KEEPALIVE, 10);
var_dump($redis->ping());

while(true) {
    
}

Connect to the server by host, set the OPT_TCP_KEEPALIVE option to 10s, check connectivity with ping, then block. lsof confirms that TCP is being used.

php     899 root    3u  IPv4 741397      0t0     TCP 2388ad577c4b:38782->web_docker_redis.web_docker_web_network:6379 (ESTABLISHED)

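After disconnecting the server container from the network, no keepalive packets are sent under these settings. Could it be related to how docker is implemented, with the connection silently converted to a unix domain socket? At this point it is unclear whether the problem lies in phpredis or in docker's networking. Next, let's check whether phpredis actually executes the relevant logic.

Non-debug mode

To see whether this code is executed, I modified the phpredis source to print a log line there and recompiled.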

curl -O http://pecl.php.net/get/redis-4.0.2.tgz
tar zxvf redis-4.0.2.tgz
cd redis-4.0.2
vim library.c

Find the redis_sock_connect function and add the logging code to the following snippet

/* Attempt to set TCP_NODELAY/TCP_KEEPALIVE if we're not using a unix socket. */
if (!usocket) {
    printf("open keepalive");
    php_netstream_data_t *sock = (php_netstream_data_t*)redis_sock->stream->abstract;
    err = setsockopt(sock->socket, IPPROTO_TCP, TCP_NODELAY, (char*) &tcp_flag, sizeof(tcp_flag));
    PHPREDIS_NOTUSED(err);
    err = setsockopt(sock->socket, SOL_SOCKET, SO_KEEPALIVE, (char*) &redis_sock->tcp_keepalive, sizeof(redis_sock->tcp_keepalive));
    PHPREDIS_NOTUSED(err);
} else {
    printf("not open keepalive");
}

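Doing so prints open keepalive. Getting the full call stack and printing variables this way is not very convenient, though, so next we use gdb and set breakpoints.

Debug mode

To debug a PHP extension with gdb breakpoints, PHP must be compiled in debug mode, and then phpredis must be recompiled against it

Compiling php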

wget -c https://github.com/php/php-src/archive/php-7.1.30.tar.gz
tar zxvf php-7.1.30.tar.gz
cd php-src-php-7.1.30
./buildconf --force
./configure \
--prefix=/usr/local/php7.1.30 \
--exec-prefix=/usr/local/php7.1.30 \
--bindir=/usr/local/php7.1.30/bin \
--sbindir=/usr/local/php7.1.30/sbin \
--includedir=/usr/local/php7.1.30/include \
--libdir=/usr/local/php7.1.30/lib/php \
--mandir=/usr/local/php7.1.30/php/man \
--with-config-file-path=/usr/local/php7.1.30/etc \
--enable-pcntl \
--with-curl \
--enable-debug \
--enable-cli
make && make install
cp php.ini-development /usr/local/php7.1.30/etc/php.ini

Compiling phpredis

curl -O http://pecl.php.net/get/redis-4.0.2.tgz
tar zxvf redis-4.0.2.tgz
cd redis-4.0.2
/usr/local/php7.1.30/bin/phpize
./configure --with-php-config=/usr/local/php7.1.30/bin/php-config
make && make install
vim /usr/local/php7.1.30/etc/php.ini
# append extension=redis.so to the end of the file

After compiling, the extension is installed to /usr/local/php7.1.30/lib/php/extensions/debug-non-zts-20160303

Start debugging with gdb

gdb /usr/local/php7.1.30/bin/php

Reading symbols from /usr/local/php7.1.30/bin/php...done.
(gdb) b redis_sock_connect
Function "redis_sock_connect" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (redis_sock_connect) pending.
(gdb) run test.php
Starting program: /usr/local/php7.1.30/bin/php test.php
95337
Breakpoint 1, redis_sock_connect (redis_sock=0x7ffff687e0e0) at /data/tools/redis-4.0.2/library.c:1416
1416	{
(gdb) n
1417	    struct timeval tv, read_tv, *tv_ptr = NULL;
(gdb) n
1418	    char host[1024], *persistent_id = NULL;
(gdb) n
1419	    const char *fmtstr = "%s:%d";
(gdb) n
1420	    int host_len, usocket = 0, err = 0;
(gdb) n
1422	    int tcp_flag = 1;
(gdb) n
1426	    zend_string *estr = NULL;
(gdb) n
1429	    if (redis_sock->stream != NULL) {
(gdb) n
1433	    tv.tv_sec  = (time_t)redis_sock->timeout;
(gdb) n
1434	    tv.tv_usec = (int)((redis_sock->timeout - tv.tv_sec) * 1000000);
(gdb) n
1435	    if(tv.tv_sec != 0 || tv.tv_usec != 0) {
(gdb) n
1439	    read_tv.tv_sec  = (time_t)redis_sock->read_timeout;
(gdb) n
1440	    read_tv.tv_usec = (int)((redis_sock->read_timeout-read_tv.tv_sec)*1000000);
(gdb) n
1442	    if (ZSTR_VAL(redis_sock->host)[0] == '/' && redis_sock->port < 1) {
(gdb) n
1446	        if(redis_sock->port == 0)
(gdb) n
1452	        if (strchr(ZSTR_VAL(redis_sock->host), ':') != NULL) {
(gdb) n
1456	        host_len = snprintf(host, sizeof(host), fmtstr, ZSTR_VAL(redis_sock->host), redis_sock->port);
(gdb) n
1459	    if (redis_sock->persistent) {
(gdb) n
1469	    redis_sock->stream = php_stream_xport_create(host, host_len,
(gdb) n
1473	    if (persistent_id) {
(gdb) n
1477	    if (!redis_sock->stream) {
(gdb) n
1491	    sock = (php_netstream_data_t*)redis_sock->stream->abstract;
(gdb) p persistent_id
$1 = 0x0
(gdb) n
1492	    if (!usocket) {
(gdb) p usocket
$2 = 0
(gdb) n
1493		printf("open keepalive");
(gdb) n
1494	        err = setsockopt(sock->socket, IPPROTO_TCP, TCP_NODELAY, (char*) &tcp_flag, sizeof(tcp_flag));
(gdb) n
1496	        err = setsockopt(sock->socket, SOL_SOCKET, SO_KEEPALIVE, (char*) &redis_sock->tcp_keepalive, sizeof(redis_sock->tcp_keepalive));
(gdb) p redis_sock->tcp_keepalive
$5 = 0

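The gdb session above shows that:

  1. usocket is 0, so docker is not playing any "tricks" and connecting by host works as expected.
  2. At connect time, tcp_keepalive defaults to 0

Next, set a breakpoint on redis_setoption_handler and run again:
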
(gdb) b redis_setoption_handler
Function "redis_setoption_handler" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (redis_setoption_handler) pending.
(gdb) run /data/webapp/test/test.php
Starting program: /usr/local/php7.1.30/bin/php /data/webapp/test/test.php
6
open keepalive
Breakpoint 1, redis_setoption_handler (execute_data=0x7ffff6814160, return_value=0x7fffffffb000, redis_sock=0x7ffff687e0e0, c=0x0)
    at /data/tools/redis-4.0.2/redis_commands.c:3089
3089	{
(gdb) n
3095	    int tcp_keepalive = 0;
(gdb) n
3098	    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ls", &option,
(gdb) n
3104	    switch(option) {
(gdb) n
3150	            if (ZSTR_VAL(redis_sock->host)[0] == '/' && redis_sock->port < 1) {
(gdb) p option
$1 = 6
(gdb) n
3153	            tcp_keepalive = atol(val_str) > 0 ? 1 : 0;
(gdb) p val_str
$2 = 0x7ffff6802bd8 "10"
(gdb) n
3154	            if (redis_sock->tcp_keepalive == tcp_keepalive) {
(gdb) p tcp_keepalive
$3 = 1
(gdb) p redis_sock->tcp_keepalive
$4 = 0
(gdb) n
3157	            if (redis_sock->stream) {
(gdb) n
3159	                sock = (php_netstream_data_t*)redis_sock->stream->abstract;
(gdb) n
3160	                if (setsockopt(sock->socket, SOL_SOCKET, SO_KEEPALIVE, (char*)&tcp_keepalive,
(gdb) n
3164	                redis_sock->tcp_keepalive = tcp_keepalive;
(gdb) p redis_sock->tcp_keepalive
$5 = 0
(gdb) p tcp_keepalive
$6 = 1
(gdb) n
3166	            RETURN_TRUE;

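The session above shows that:

  • During the setOption call, tcp_keepalive is successfully set to 1

Questions

So far we have simulated with docker and stepped through with gdb. To summarize:

  1. Version: at first I suspected that phpredis had no TCP_KEEPALIVE option, but the source shows that every version from 4.0 on supports it.
  2. Environment: the gdb breakpoints show that the host is fine and unix domain socket mode is not used, so simulating under docker is not the problem.
  3. Logic: the gdb breakpoints show that sock->tcp_keepalive defaults to 0 at connect time and is set to 1 by setOption, so the logic is fine too

By now, everything in the code "seems" fine, so this line of investigation is a dead end; the only option is to go back and look for a missed detail. Earlier, in the setOption call, I set OPT_TCP_KEEPALIVE to 10 and said that this sets the time to 10s, because I took it for granted that the value meant tcp_keepalive_time: I wanted keepalive packets to be sent to the server within 10 seconds of the network dropping. The source, however, says: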

tcp_keepalive = zval_get_long(val) > 0 ? 1 : 0;

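The value passed in is used differently: any positive integer sets tcp_keepalive to 1, and anything else sets it to 0. In other words, the option does not provide tcp_keepalive_time at all; it is only a switch!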
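
The "switch only" behaviour can be reproduced outside phpredis. The following C sketch applies the same normalisation to a plain TCP socket and then reads back the idle time the kernel will actually use. All three function names are hypothetical, and TCP_KEEPIDLE is Linux-specific.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Same normalisation as the REDIS_OPT_TCP_KEEPALIVE case:
 * any positive value collapses to 1, everything else to 0. */
int normalize_keepalive(long value)
{
    return value > 0 ? 1 : 0;
}

/* Apply only the on/off flag. The idle time, interval and probe count
 * stay at the system defaults. Returns 0 on success, -1 on failure. */
int set_keepalive_switch(int fd, long value)
{
    int flag = normalize_keepalive(value);
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &flag, sizeof(flag));
}

/* Idle seconds before the first probe on this socket, or -1 on error. */
int keepalive_idle_secs(int fd)
{
    int idle = 0;
    socklen_t len = sizeof(idle);

    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, &len) < 0) return -1;
    return idle;
}
```

After set_keepalive_switch(fd, 10), keepalive_idle_secs(fd) should report the same value as before the call. On a system where net.ipv4.tcp_keepalive_time is 7200, that means 7200 rather than 10.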

However, I could not find any API that sets the time…

Setting the system-wide TCP_KEEPALIVE defaults

As seen earlier, the system has global TCP_KEEPALIVE defaults

sysctl -a | grep keepalive

net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200

With this configuration the first probe is sent only after two hours (7200s). Let's shorten these settings

sysctl -w net.ipv4.tcp_keepalive_time=15 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10
  • net.ipv4.tcp_keepalive_time: 15
  • net.ipv4.tcp_keepalive_probes: 3
  • net.ipv4.tcp_keepalive_intvl: 10

Run the code again, disconnect the server's network, and watch the packets with tcpdump.

15:38:24.862503 IP web_docker_php.web_docker_web_network.42480 > ce6e2fa39930.6379: Flags [.], ack 13, win 229, options [nop,nop,TS val 35239808 ecr 35238270], length 0
15:38:24.862592 IP ce6e2fa39930.6379 > web_docker_php.web_docker_web_network.42480: Flags [.], ack 41, win 227, options [nop,nop,TS val 35239808 ecr 35238275], length 0
15:38:39.866247 IP web_docker_php.web_docker_web_network.42480 > ce6e2fa39930.6379: Flags [.], ack 13, win 229, options [nop,nop,TS val 35241312 ecr 35239808], length 0
15:38:39.866290 IP ce6e2fa39930.6379 > web_docker_php.web_docker_web_network.42480: Flags [.], ack 41, win 227, options [nop,nop,TS val 35241312 ecr 35238275], length 0
15:38:54.907073 IP web_docker_php.web_docker_web_network.42480 > ce6e2fa39930.6379: Flags [.], ack 13, win 229, options [nop,nop,TS val 35242816 ecr 35241312], length 0
15:38:54.907178 IP ce6e2fa39930.6379 > web_docker_php.web_docker_web_network.42480: Flags [.], ack 41, win 227, options [nop,nop,TS val 35242816 ecr 35238275], length 0

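Trying again, it works! A keepalive packet is indeed sent every 15 seconds. So I had misunderstood how phpredis's TCP_KEEPALIVE option works all along, assuming from the start that the value was tcp_keepalive_time. The earlier program was fine the whole time; the system default is simply so long that the program just sat there blocked, which is why I thought the option had not been applied correctly.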
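
The three sysctl values also determine how long a dead peer can go unnoticed. As a rough worked example (the function name is an assumption), the worst case is one idle period followed by all probes going unanswered:

```c
/* Worst-case seconds between the last traffic and the kernel declaring the
 * peer dead: one idle period, then `probes` unanswered probes sent
 * `intvl` seconds apart. */
long keepalive_detection_secs(long idle, long intvl, long probes)
{
    return idle + intvl * probes;
}
```

With the Linux defaults this gives keepalive_detection_secs(7200, 75, 9) = 7875 seconds, a little over two hours. With the shortened settings it gives keepalive_detection_secs(15, 10, 3) = 45 seconds.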

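A simpler solution?

Earlier we covered the busy-reconnect approach for brpop under network jitter. Now that we understand how OPT_TCP_KEEPALIVE works, is there a simpler approach? If the phpredis client sent keepalive packets periodically, a network outage could raise an exception directly, which we would catch and then reconnect. Wouldn't that be better?

In testing (with test.php), however, once the network dropped the client stopped sending keepalive packets. netstat showed that the client tore down its side of the connection after a short while, and no exception was raised :( One likely reason is that test.php only spins in an empty loop after the ping and never reads from or writes to the socket again, so there is no call through which the error could surface.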

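Summary

  1. Used nc and netcat-keepalive to review the TCP keepalive mechanism
  2. Sorted out the timeout-related APIs in phpredis and their priority when combined
  3. Sorted out how keepalive works in the phpredis client: it does not expose the three key TCP keepalive parameters; the option is only a switch, and the system-wide parameters are used
  4. Treat network failures as the norm and build more robust long-lived-connection checks at the application layer

Finally

This article used Redis brpop to fetch messages, which is only one case. Other network APIs, such as subscribe, also need long-lived connections. Does the same solution apply to them? That is left for a follow-up analysis.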
