Author: henrystark [email protected]
Blog: http://henrystark.blog.chinaunix.net/
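Date: 20140419
This article is licensed under Creative Commons Attribution-NonCommercial-NoDerivs 2.5 (https://creativecommons.org/licenses/by-nc-nd/2.5/cn/). You may copy and repost it freely, but please keep the document intact and credit the original author and the original link. If you find any errors, please point them out.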
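Linux TCP: the shutdown and close system calls
0. Purpose
In an interview I was asked how system calls are implemented, which is hard to answer well. Going deep, it involves the implementation of interrupt vectors such as NR…… [Ref 3]; staying shallow, it is simply how the interfaces the system provides are implemented in kernel code. I started out with the relationship between printf and the write system call, but got stuck halfway [Note 1]. So I switched to talking about the two system calls shutdown and close.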
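1. Introduction
These two system calls are used all the time in socket network programming. The main difference is that shutdown acts on the socket itself, no matter how many descriptors still refer to it, while close only decrements the reference count and the connection is torn down only when the count reaches zero.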
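The difference can be observed from user space. The sketch below is only an illustration under stated assumptions: it uses an AF_UNIX socketpair instead of a real TCP connection so that it stays self-contained, and the helper names (peer_sees_eof, make_pair_with_dup, eof_after_close_of_one_copy, eof_after_shutdown) are made up for this example. One end is duplicated with dup; closing one copy should leave the peer without EOF, while shutdown on one copy should deliver EOF even though another descriptor still refers to the socket.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* 1: peer sees EOF now; 0: nothing to read yet (would block); -1: anything else. */
static int peer_sees_eof(int peer)
{
    char c;
    ssize_t n = recv(peer, &c, 1, MSG_DONTWAIT);

    if (n == 0)
        return 1;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}

/* Hypothetical helper: a connected pair plus a dup of sv[0]; 0 on success. */
static int make_pair_with_dup(int sv[2], int *copy)
{
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    *copy = dup(sv[0]);
    if (*copy < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    return 0;
}

/* close() on one of two descriptors: only the reference count drops. */
int eof_after_close_of_one_copy(void)
{
    int sv[2], copy, seen;

    if (make_pair_with_dup(sv, &copy) < 0)
        return -1;
    close(copy);                 /* sv[0] still refers to the socket */
    seen = peer_sees_eof(sv[1]);
    close(sv[0]);
    close(sv[1]);
    return seen;
}

/* shutdown() on one of two descriptors: acts on the socket itself. */
int eof_after_shutdown(void)
{
    int sv[2], copy, seen;

    if (make_pair_with_dup(sv, &copy) < 0)
        return -1;
    shutdown(copy, SHUT_WR);     /* write side closed for every descriptor */
    seen = peer_sees_eof(sv[1]);
    close(copy);
    close(sv[0]);
    close(sv[1]);
    return seen;
}
```

With a real TCP connection the same experiment needs a listener and a connect/accept step, and the EOF arrives as a FIN, so a blocking read on the peer is the more reliable check there.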
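2. Functionality and code
2.1 shutdown
The precise definition is given in [Ref 1]. The function supports three modes: shut down reading only, shut down writing only, or shut down both. The call sequence for shutdown processing is described in [Ref 2]. shutdown ignores the reference count and acts on the socket directly. The source is as follows: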
linux/net/ipv4/tcp.c

/*
 *  Shutdown the sending side of a connection. Much like close except
 *  that we don't receive shut down or sock_set_flag(sk, SOCK_DEAD).
 */
void tcp_shutdown(struct sock *sk, int how)
{
    /*  We need to grab some memory, and put together a FIN,
     *  and then put it into the queue to be sent.
     *      Tim MacKenzie([email protected]) 4 Dec '92.
     */
    if (!(how & SEND_SHUTDOWN))
        return;

    /* If we've already sent a FIN, or it's a closed state, skip this. */
    if ((1 << sk->sk_state) &
        (TCPF_ESTABLISHED | TCPF_SYN_SENT |
         TCPF_SYN_RECV | TCPF_CLOSE_WAIT)) {
        /* Clear out any half completed packets.  FIN if needed. */
        if (tcp_close_state(sk))
            tcp_send_fin(sk);
    }
}
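As the comment says, this function is mainly responsible for shutting down the sending (write) side of the socket. Note that the bitwise AND against SEND_SHUTDOWN only works because the how variable has already been transformed before it gets here; see inet_shutdown, the socket-layer implementation of the shutdown system call.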
linux/net/ipv4/af_inet.c

int inet_shutdown(struct socket *sock, int how)
{
    struct sock *sk = sock->sk;
    int err = 0;

    /* This should really check to make sure
     * the socket is a TCP socket. (WHY AC...)
     */
    how++; /* maps 0->1 has the advantage of making bit 1 rcvs and
                   1->2 bit 2 snds.
                   2->3 */
    if ((how & ~SHUTDOWN_MASK) || !how) /* MAXINT->0 */
        return -EINVAL;
    ………………………………………………………………………………

linux/include/net/sock.h

#define SHUTDOWN_MASK 3
#define RCV_SHUTDOWN 1
#define SEND_SHUTDOWN 2
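The how++ trick is easier to see with concrete values. The following is a user-space sketch, not kernel code: how_to_mask is a hypothetical name, the three macros simply mirror the values quoted from sock.h, and the increment is done in unsigned arithmetic (an assumption made here to avoid signed overflow in portable C). It relies on Linux defining SHUT_RD, SHUT_WR and SHUT_RDWR as 0, 1 and 2.

```c
#include <errno.h>
#include <sys/socket.h>

/* Mirror of the kernel values quoted above (redefined here for illustration). */
#define SHUTDOWN_MASK 3u
#define RCV_SHUTDOWN  1u
#define SEND_SHUTDOWN 2u

/*
 * Map the user-visible how argument to the internal bit mask the way
 * inet_shutdown does: add one, then reject anything outside bits 0..1.
 * Returns the mask (1, 2 or 3) or -EINVAL.
 */
int how_to_mask(int how)
{
    unsigned int m = (unsigned int)how + 1u;

    if ((m & ~SHUTDOWN_MASK) || !m)
        return -EINVAL;
    return (int)m;
}
```

After the mapping, SHUT_RD becomes RCV_SHUTDOWN, SHUT_WR becomes SEND_SHUTDOWN, and SHUT_RDWR becomes both bits set, which is why tcp_shutdown can test how & SEND_SHUTDOWN directly.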
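The question is: how is the read side shut down? tcp_shutdown returns immediately when only the read side is requested, so no segment is sent. Conceptually, shutting down the read side means the process gives up data it has not yet read and data that arrives later; in the kernel this is handled in other TCP receive-path functions such as tcp_poll and tcp_recvmsg.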
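The user-visible effect of a read-side shutdown can be probed with a non-blocking recv. This is a sketch under assumptions: recv_probe is a hypothetical helper, and an AF_UNIX socketpair stands in for a TCP connection to keep the example self-contained; as far as I know a TCP socket on Linux gives the same result through tcp_recvmsg, but that is not verified here.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a connected pair, optionally shut down the read side of one
 * end, then try a non-blocking read on that end with nothing queued.
 * Returns 0 for end-of-file, -EAGAIN if the read would block,
 * or another negative errno value on failure.
 */
int recv_probe(int do_shut_rd)
{
    int sv[2], result;
    char c;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -errno;
    if (do_shut_rd && shutdown(sv[0], SHUT_RD) < 0) {
        result = -errno;
        close(sv[0]);
        close(sv[1]);
        return result;
    }
    n = recv(sv[0], &c, 1, MSG_DONTWAIT);
    result = (n < 0) ? -errno : (int)n;
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Without the shutdown the read reports that it would block; with it, the read returns end-of-file at once, even though the peer has not closed anything.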
2.2 close
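For the close system call, the reference count is decremented at the file layer. Only when the last reference is dropped is the socket's release function (inet_release) called, and that function finally calls the protocol's close function (tcp_close for TCP) to deal with pending data and send the FIN.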
linux/net/ipv4/af_inet.c

int inet_release(struct socket *sock)
{
    struct sock *sk = sock->sk;

    if (sk) {
        long timeout;

        // The next two calls are cleanup (reset the RPS flow entry, leave
        // multicast groups); they are not the reference-count decrement,
        // which has already happened by the time we get here.
        sock_rps_reset_flow(sk);

        /* Applications forget to leave groups before exiting */
        ip_mc_drop_socket(sk);

        /* If linger is set, we don't return until the close
         * is complete.  Otherwise we return immediately. The
         * actually closing is done the same either way.
         *
         * If the close is due to the process exiting, we never
         * linger..
         */
        timeout = 0;
        if (sock_flag(sk, SOCK_LINGER) &&
            !(current->flags & PF_EXITING))
            timeout = sk->sk_lingertime;
        sock->sk = NULL;
        sk->sk_prot->close(sk, timeout); // for TCP this calls tcp_close()
    }
    return 0;
}

linux/net/ipv4/tcp.c

void tcp_close(struct sock *sk, long timeout)
{
    ………………………………………………………………………………
    if (data_was_unread) {
        /* Unread data was tossed, zap the connection. */
        NET_INC_STATS_USER(sock_net(sk), LINUX_MIB_TCPABORTONCLOSE);
        tcp_set_state(sk, TCP_CLOSE);
        tcp_send_active_reset(sk, sk->sk_allocation);
    } else if (sock_flag(sk, SOCK_LINGER) && !sk->sk_lingertime) {
        /* Check zero linger _after_ checking for unread data. */
        sk->sk_prot->disconnect(sk, 0);
        NET_INC_STATS_USER(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
    } else if (tcp_close_state(sk)) {
        tcp_send_fin(sk); // the FIN is sent here
    }

    sk_stream_wait_close(sk, timeout);

adjudge_to_death:
    state = sk->sk_state;
    sock_hold(sk);
    sock_orphan(sk);

    /* It is the last release_sock in its life. It will remove backlog. */
    release_sock(sk);
    ………………………………………………………………………………
}
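As shown, both system calls, shutdown and close, ultimately call tcp_send_fin to terminate the connection on the normal path. In tcp_close, unread data or a zero linger time leads to a reset or a disconnect instead of a FIN.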
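The zero-linger branch in tcp_close (SOCK_LINGER set and sk_lingertime equal to zero) corresponds, as far as I know, to the SO_LINGER socket option with l_onoff = 1 and l_linger = 0. The sketch below only sets and reads back that option on an unconnected TCP socket; it does not observe the resulting abortive close on the wire. The function names are hypothetical.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Request linger on with a linger time of 0 seconds. 0 on success, -1 on error. */
int set_zero_linger(int fd)
{
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };

    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* 1 if linger is on with time 0, 0 if not, -1 on error. */
int has_zero_linger(int fd)
{
    struct linger lg;
    socklen_t len = sizeof(lg);

    if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, &len) < 0)
        return -1;
    return lg.l_onoff != 0 && lg.l_linger == 0;
}

/* 1 if a fresh TCP socket starts without linger and reports zero linger after the set. */
int zero_linger_roundtrip(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int before, after;

    if (fd < 0)
        return -1;
    before = has_zero_linger(fd);
    if (set_zero_linger(fd) < 0) {
        close(fd);
        return -1;
    }
    after = has_zero_linger(fd);
    close(fd);
    return before == 0 && after == 1;
}
```

On a connected socket configured this way, close would take the disconnect branch quoted above rather than the tcp_send_fin branch, provided no unread data is pending (that case is checked first).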
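3. How system calls are implemented
[Ref 3] describes the implementation of system calls in detail. Each system call is assigned a number in the kernel; the application uses a software interrupt to tell the system to switch to kernel mode and passes the arguments along.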
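The role of the system call number can be seen from user space with the generic syscall() wrapper from the C library, which takes the number explicitly. This is a minimal sketch assuming Linux with glibc; getpid_by_number is a hypothetical name, and getpid is used only because it takes no arguments.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for the syscall() declaration */
#endif
#include <sys/syscall.h>     /* SYS_getpid: the number assigned to getpid */
#include <unistd.h>

/* Invoke getpid by its system call number instead of the named wrapper. */
long getpid_by_number(void)
{
    return syscall(SYS_getpid);
}
```

The named wrapper getpid() and the numbered call return the same process ID; the wrapper just hides the number and the trap into the kernel.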
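References:
[1] Description of the shutdown function. http://pubs.opengroup.org/onlinepubs/007908799/xns/shutdown.html
[2] shutdown call sequence; the parameter definitions differ slightly. http://www.ibm.com/developerworks/cn/aix/library/au-tcpsystemcalls/#shutdown
[3] How system calls are implemented. http://blog.chinaunix.net/uid-20321537-id-1966859.html
Notes:
[1] printf is a library function and write is a system call. The difference between system calls and library functions is also complicated; [Ref 3] covers part of it. For the implementation details of printf, see http://blog.csdn.net/dog250/article/details/23000909.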