[轉載]TCP鏈接主動關閉不發fin包奇怪行爲分析

以前在寫網絡通訊模塊的時候,抓包時候也發現這個問題:“TCP鏈接主動關閉不發fin包”;

今天看了以下文章,Mark. 轉自:http://rdc.taobao.com/blog/cs/?p=1055


我們在1 8 . 2節中看到終止一個連接的正常方式是一方發送F I N。有時這也稱爲有序釋放
(orderly release),因爲在所有排隊數據都已發送之後才發送F I N,正常情況下沒有任何數據
丟失。但也有可能發送一個復位報文段而不是F I N來中途釋放一個連接。有時稱這爲異常釋放
(abortive release)。
異常終止一個連接對應用程序來說有兩個優點:(1)丟棄任何待發數據並立即發送復位
報文段;(2)R S T的接收方會區分另一端執行的是異常關閉還是正常關閉。應用程序使用的
A P I必須提供產生異常關閉而不是正常關閉的手段。

問題描述:
多隆同學在做網絡框架的時候,發現一條tcp鏈接在close的時候,對端會收到econnrest,而不是正常的fin包. 通過抓包發現close系統調用的時候,我端發出rst報文, 而不是正常的fin。這個問題比較有意思,我們來演示下:


$ erl
Erlang R14B03 (erts-5.8.4) [64-bit] [smp:16:16] [rq:16] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.8.4 (abort with ^G)
1> {ok,Sock} = gen_tcp:connect("baidu.com", 80, [{active,false}]).
{ok,#Port<0.582>}
2> gen_tcp:send(Sock, "GET / HTTP/1.1\r\n\r\n").
ok
3> gen_tcp:close(Sock).
ok

我們往baidu的首頁發了個http請求,百度會給我們迴應報文的,我們send完立即調用close.

然後我們在另外一個終端開tcpdump抓包確認:

$ sudo tcpdump port 80 -i bond0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 96 bytes
17:22:38.246507 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: S 2228211568:2228211568(0) win 5840 
17:22:38.284602 IP 220.181.111.86.http > my031089.sqa.cm4.tbsite.net.19500: S 3250338304:3250338304(0) ack 2228211569 win 8190 
17:22:38.284624 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: . ack 1 win 5840
17:22:52.748468 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: P 1:19(18) ack 1 win 5840
17:22:52.786855 IP 220.181.111.86.http > my031089.sqa.cm4.tbsite.net.19500: . ack 19 win 5840
17:22:52.787194 IP 220.181.111.86.http > my031089.sqa.cm4.tbsite.net.19500: P 1:179(178) ack 19 win 5840
17:22:52.787203 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: . ack 179 win 6432
17:22:52.787209 IP 220.181.111.86.http > my031089.sqa.cm4.tbsite.net.19500: P 179:486(307) ack 19 win 5840
17:22:52.787214 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: . ack 486 win 7504
17:23:01.564358 IP my031089.sqa.cm4.tbsite.net.19500 > 220.181.111.86.http: R 19:19(0) ack 486 win 7504
...

我們可以清楚的看到 R 19:19(0) ack 486 win 7504,發了個rst包,通過strace系統調用也確認erlang確實調用了close系統調用。
那爲什麼呢? @淘寶雕樑,tcp協議棧專家回答了這個問題:

在net/ipv4/tcp.c:1900附近

...
/* As outlined in RFC 2525, section 2.17, we send a RST here because
* data was lost. To witness the awful effects of the old behavior of
* always doing a FIN, run an older 2.1.x kernel or 2.0.x, start a bulk
* GET in an FTP client, suspend the process, wait for the client to
* advertise a zero window, then kill -9 the FTP client, wheee...
* Note: timeout is always zero in such a case.
*/
if (data_was_unread) {
/* Unread data was tossed, zap the connection. */
NET_INC_STATS_USER(sock_net(sk), LINUX_MIB_TCPABORTONCLOSE);
tcp_set_state(sk, TCP_CLOSE);
tcp_send_active_reset(sk, sk->sk_allocation);
..

代碼裏面寫的很清楚,如果你的接收緩衝去還有數據,協議棧就會發rst代替fin.
我們再來驗證一下:


$ erl
Erlang R14B03 (erts-5.8.4) [64-bit] [smp:16:16] [rq:16] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.8.4 (abort with ^G)
1> {ok,Sock} = gen_tcp:connect("baidu.com", 80, [{active,false}]).
{ok,#Port<0.582>}
2> gen_tcp:send(Sock, "GET / HTTP/1.1\r\n\r\n").
ok
3> gen_tcp:recv(Sock,0).
{ok,"HTTP/1.1 400 Bad Request\r\nDate: Fri, 01 Jul 2011 09:24:37 GMT\r\nServer: Apache\r\nConnection: Keep-Alive\r\nTransfer-Encoding: chunked\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n127\r\n\n\n\n\n

Bad Request

\nYour browser sent a request that this server could not understand.

\nclient sent HTTP/1.1 request without hostname (see RFC2616 section 14.23): /

\n\n\r\n0\r\n\r\n"}
4> gen_tcp:close(Sock).
ok
5>

這次我們把接收緩衝區裏的東西拉乾淨了。

再看下tcpdump:

...
17:36:07.236627 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: S 3086473299:3086473299(0) win 5840 
17:36:07.274661 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: S 738551248:738551248(0) ack 3086473300 win 8190 
17:36:07.274685 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: . ack 1 win 5840
17:36:10.295795 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: P 1:19(18) ack 1 win 5840
17:36:10.334280 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: . ack 19 win 5840
17:36:10.334547 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: P 1:179(178) ack 19 win 5840
17:36:10.334554 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: . ack 179 win 6432
17:36:10.334563 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: P 179:486(307) ack 19 win 5840
17:36:10.334566 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: . ack 486 win 7504
17:36:19.671374 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: F 19:19(0) ack 486 win 7504
17:36:19.709619 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: . ack 20 win 5840
17:36:19.709643 IP 123.125.114.144.http > my031089.sqa.cm4.tbsite.net.9405: F 486:486(0) ack 20 win 5840
17:36:19.709652 IP my031089.sqa.cm4.tbsite.net.9405 > 123.125.114.144.http: . ack 487 win 7504
...

這次是發fin包了。

多隆同學再進一步,找出來之前squid client代碼中不能理解的一句話:
client_side.c

...
/* prevent those nasty RST packets */
{
char buf[SQUID_TCP_SO_RCVBUF];
while (FD_READ_METHOD(fd, buf, SQUID_TCP_SO_RCVBUF) > 0);
}
...

總算明白了這句話的意思了!

小結:認真學習協議棧太重要了。

玩得開心!

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章