Posting Multiple WSASend Calls with IOCP

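I have long wondered whether it is safe to post several sends back to back on the same socket under IOCP. The main questions: does WSASend guarantee that the data is sent in full, or can a send complete having transferred only part of it?

Most advice online is to use WSASend serially: keep a send buffer, and post the next send only after the previous one has completed. So under what circumstances does WSASend actually produce a partial send?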
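The serial pattern described above can be sketched as a small queue that keeps at most one send outstanding. This is only an illustration under stated assumptions: `SendQueue`, `sq_enqueue`, `sq_on_complete`, `QUEUE_CAP` and the `post_send_fn` hook are hypothetical names, and a real program would have the hook call WSASend and would call `sq_on_complete` when the completion packet is dequeued from the completion port.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16  /* hypothetical fixed capacity for this sketch */

/* Hypothetical hook: posts one overlapped send (would wrap WSASend). */
typedef void (*post_send_fn)(const char *data, size_t len, void *ctx);

typedef struct {
    const char  *data[QUEUE_CAP];
    size_t       len[QUEUE_CAP];
    size_t       head, count;  /* ring buffer of queued messages */
    int          in_flight;    /* 1 while a send is outstanding */
    post_send_fn post;
    void        *ctx;
} SendQueue;

void sq_init(SendQueue *q, post_send_fn post, void *ctx) {
    q->head = 0;
    q->count = 0;
    q->in_flight = 0;
    q->post = post;
    q->ctx = ctx;
}

/* Post the message at the head, but only if nothing is outstanding. */
void sq_pump(SendQueue *q) {
    if (q->in_flight || q->count == 0) return;
    q->in_flight = 1;
    q->post(q->data[q->head], q->len[q->head], q->ctx);
}

/* Queue a message; returns 0 on success, -1 if the queue is full. */
int sq_enqueue(SendQueue *q, const char *data, size_t len) {
    if (q->count == QUEUE_CAP) return -1;
    size_t tail = (q->head + q->count) % QUEUE_CAP;
    q->data[tail] = data;
    q->len[tail] = len;
    q->count++;
    sq_pump(q);
    return 0;
}

/* Call when the completion for the outstanding send arrives. */
void sq_on_complete(SendQueue *q) {
    assert(q->in_flight && q->count > 0);
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    q->in_flight = 0;
    sq_pump(q);  /* start the next queued send, if any */
}

/* Demonstration hook: counts posted sends instead of calling WSASend. */
void count_posts(const char *data, size_t len, void *ctx) {
    (void)data;
    (void)len;
    ++*(int *)ctx;
}
```

With `count_posts` as the hook, enqueueing two messages posts only the first; the second goes out when `sq_on_complete` is called. The sketch ignores locking, which a multi-threaded IOCP server would need around the queue.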

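The IOCP sample in MSDN does check the number of bytes reported as sent, and the description of WSASend contains this sentence: Note The successful completion of a WSASend does not indicate that the data was successfully delivered.

My first thought was that WSASend might return after a partial send when the send buffer runs out of space. I ran an experiment: send 10 MB of data twice in a row (certainly more than the buffer holds). The first call returned success immediately (the peer had not called recv), and the second returned WSA_IO_PENDING. So that is not what happens. I then looked in Network Programming for Microsoft Windows, which has this passage: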

When an application makes a send call, if there is sufficient buffer space, the data is copied into the socket's send buffers, the call completes immediately with success, and the completion is posted. On the other hand, if the socket's send buffer is full, then the application's send buffer is locked and the send call fails with WSA_IO_PENDING. After the data in the send buffer is processed (for example, handed down to TCP for processing), then Winsock will process the locked buffer directly. That is, the data is handed directly to TCP from the application's buffer and the socket's send buffer is completely bypassed.

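When the send buffer is full, the application's memory is locked. Once I called recv on the other end and received the completion notification for the WSASend, the number of bytes sent equaled the number of bytes requested; there was no partial send.

Below is a post I found online; I have forgotten where it came from.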

---------------------------------------------------------------------------------------------------------

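I have always had some doubts about how to use WSASend. They do not affect development much, but they have always bothered me.

Doubt 1:

According to MSDN:
1) You do not have to wait for a WSASend to complete; you can call WSASend repeatedly to send data.
2) You can give WSASend an array of buffers and send several non-contiguous buffers in one call.
3) After WSASend completes successfully, there is no guarantee that all of the supplied data has been sent.

Doesn't that create the following problem?
Suppose I post five WSASend calls back to back. If the data of the third is not sent completely while the fourth is accepted, the stream is corrupted, because the system cannot know when my fourth WSASend should go out relative to the remainder of the third.

And if the second parameter of the third send is a buffer array, would I have to work out how many buffers were actually sent in order to send the remaining data?

To be safe, my project never posts WSASend calls back to back and never uses multiple buffers. Instead, after each WSASend completes I dutifully check whether all the data was sent; if not, I keep sending the remainder, and only when one packet has gone out completely do I send the next.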
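If a completion ever did report fewer bytes than were supplied in a buffer array, the remainder could be located with a helper along the following lines. This is a sketch, not something taken from MSDN: `Buf` is a stand-in with the same two fields as WSABUF so that the code compiles anywhere, and `advance_buffers` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for WSABUF (same two fields); an assumption for portability. */
typedef struct {
    unsigned long len;
    char         *buf;
} Buf;

/*
 * Skip `sent` bytes from the front of bufs[0..count).
 * Returns the index of the first buffer that still holds unsent data
 * (or count if everything was sent) and trims that buffer in place.
 */
size_t advance_buffers(Buf *bufs, size_t count, unsigned long sent) {
    size_t i = 0;
    while (i < count && sent >= bufs[i].len) {
        sent -= bufs[i].len;   /* buffer i went out completely */
        i++;
    }
    if (i < count) {
        bufs[i].buf += sent;   /* partially sent: keep only the tail */
        bufs[i].len -= sent;
    } else {
        assert(sent == 0);     /* cannot send more than was supplied */
    }
    return i;
}
```

After a call such as `first = advance_buffers(bufs, count, bytes_sent)`, the remaining data is described by `bufs + first` with `count - first` entries, which is what would be passed to the next send.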

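Doubt 2:

What "the data was sent successfully" means (WSASend returns STATUS_SUCCESS, or the completion routine is called, or a completion packet is dequeued from the completion port). The possibilities:

1) The data has been handed to the TDI client (AFD), and that counts as sent.
2) The data has been handed to the TDI server (e.g. TCP) and placed on TCP's send queue, and that counts as sent.
3) The data has been placed in the network card's transmit buffer, and that counts as sent.
4) The data has been transmitted by the network card, and that counts as sent.
5) The data has been received by the peer and the acknowledgement has come back, and that counts as sent.
Which of these is it? According to MSDN, the send is considered successful once the send request has been consumed by the transport layer. I am not sure how others read that sentence.

The web offers no definitive answer to either question. It seems they cannot be settled without digging into the Windows source.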

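First, the call path of WSASend (based on the NT4 source; I will not paste the source, to avoid trouble from MS):

WSASend->WSPSend->NtDeviceIoControlFile->AFDSend [TDI client]->TcpSendData [TDI server]->TdiSend->TcpSend->IPTransmit [network layer]->SendIPPacket->from here it enters the link layer, for which I found no source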

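NtDeviceIoControlFile:
Wraps the send request and the completion routine into an IRP and sends it to "device/afd"

AFDSend:
Builds an MDL chain from the buffer array
If TDI does not support data buffering, buffers the data here
Calls TdiBuildSend to construct the send-request IRP for TDI
Sends the newly built IRP to "device/tcp"
AFDSend either submits the complete data to TDI or fails; nothing here can cause a partial send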

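TcpSendData:
Constructs a TdiRequest and calls TdiSend to handle it; no data buffering

TdiSend:
Constructs a TcpRequest and links it into the send queue of the TCB (TCP's transmission control block)
Calls TCPSend for further processing
Returns TDI_PENDING
This part cannot cause an incomplete send either.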

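TCPSend:
Examines the send queue in the TCB and decides whether to start a transmission; if the conditions for sending are not met, it simply returns
If the conditions are met, it builds TCP segments and sends the data; this process is fairly complex and is mostly TCP protocol detail
As you can see, WSASend generally returns once it reaches the beginning of TCPSend. TCPSend itself has no return value; TdiSend simply returns pending after calling it.

Judging from the source, unless the number of bytes to send is 0, WSASend never returns STATUS_SUCCESS; barring an error, it always returns a pending status.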

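But when does the application receive the send-completion notification?
As we know, the completion routine pointer is stored in the topmost IRP, and it is invoked when IoCompleteRequest runs. Skipping the details, a search of the source shows two places that eventually lead to IoCompleteRequest being called. One is when the link layer calls the IP layer's completion routine, which cascades up layer by layer until the completion routine of the topmost IRP is called. The other is when TcpReceive processes an ACK, which may also complete some send requests.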

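Conclusions:
Looking through the NT4 source, I found no way for WSASend to send only part of the data (there may be one that I missed).
Given that WSASend does not send partial data, WSASend calls can indeed be overlapped (the send requests are linked into the TCB's send queue in the order they were posted). They need not be serialized, and this can indeed improve efficiency to some degree.
So-called send completion should be the point at which the link layer calls the upper layer's completion routine, but when exactly the link layer does so I cannot tell, for lack of source. If you know, please enlighten me!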
--------------------------------------------------------------------------------------

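So it seems WSASend returns as soon as the request has been placed on the TCB queue. I searched some English-language sites and found similar answers: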

Actually, a partial overlapped send guarantees all subsequently
scheduled sends will completely fail (assuming a TCP socket).
A partial overlapped send will never happen in practice, however,
due to the implementation of the socket buffer (partial sends don't
exist in Microsoft's WinSock implementations so far). There's an
exception if you set the send buffer size to zero - then your overlapped
buffers replace the socket buffer and the socket may break part way
through an overlapped buffer.

So posting consecutive WSASend calls appears to be feasible.

Now look at this passage from MSDN:

For non-overlapped sockets, the last two parameters (lpOverlapped, lpCompletionRoutine) are ignored and WSASend adopts the same blocking semantics as send. Data is copied from the buffer(s) into the transport's buffer. If the socket is non-blocking and stream-oriented, and there is not sufficient space in the transport's buffer, WSASend will return with only part of the application's buffers having been consumed. Given the same buffer situation and a blocking socket, WSASend will block until all of the application buffer contents have been consumed.

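My earlier understanding was wrong; this passage has to be read as a whole. For a socket that does not use overlapped I/O, the last two parameters are ignored and WSASend behaves just like send: the data is copied into the send buffer. In that case (WSASend acting like send), if the socket is stream-oriented and non-blocking and the send buffer lacks space, WSASend returns the number of bytes it copied into the send buffer; if the socket is blocking, WSASend does not return until everything has been sent. (This matches the behavior of send; in other words, this is the only case in which WSASend sends partially.)

What is a non-overlapped socket? Until now I had treated non-overlapped sockets as blocking sockets.

Back to Network Programming for Microsoft Windows: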

Blocking sockets cause concern because any Winsock API call on a blocking socket can do just that—block for some period of time. Most Winsock applications follow a producer-consumer model in which the application reads (or writes) a specified number of bytes and performs some computation on that data.

Once a socket is placed in non-blocking mode, Winsock API calls that deal with sending and receiving data or connection management return immediately. In most cases, these calls fail with the error WSAEWOULDBLOCK, which means that the requested operation did not have time to complete during the call. For example, a call to recv returns WSAEWOULDBLOCK if no data is pending in the system's input buffer. Often additional calls to the same function are required until it encounters a successful return code.

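A non-blocking socket is not the same thing as a non-overlapped socket.

My earlier understanding of overlapped I/O still had some misconceptions.

My current understanding is that overlapped is an asynchronous way of doing I/O, and it is a different concept from blocking versus non-blocking.

Overlapped I/O is not limited to sockets; it also covers ReadFile and so on. Likewise, IOCP is not tied to non-blocking sockets but to overlapped I/O.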

Overlapped sockets merely means that you can use the sockets for overlapped IO. It doesn't mean that your socket is non-blocking.

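None of the completion-port or overlapped I/O samples require the socket to be non-blocking.

A non-blocking socket returns immediately when the operation cannot proceed right now; when it can proceed, the call still returns only after the operation is done, for example after the send buffer has been filled.

An overlapped socket, by contrast, returns pending whenever the operation cannot finish immediately; that is, it has already returned before the send buffer is filled.

In my own test program I set both the send and receive buffer sizes to 0, so every send returns pending. The system locks the application's send buffer, and the completion notification arrives once the send has finished.

So overlapped I/O and completion-port I/O are a separate matter from blocking versus non-blocking sockets. Next I will test what difference a blocking versus a non-blocking socket makes on IOCP; my feeling for now is that there should be none. (After testing: IOCP still works with a blocking socket, but not with a non-overlapped socket. A socket created by the default socket() supports overlapped I/O.)

With the current system, using non-blocking sockets requires overlapped mode.

I did not expect to come back and update this post a year after writing it. It is a bit messy; please bear with me.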
