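Background
The OP_WRITE event fires when the free space in the socket send buffer is greater than or equal to its low-water mark, SO_SNDLOWAT. Under normal conditions a socket is almost always writable, so code usually does not register for write events and simply writes in a loop: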
while (bb.hasRemaining()) {
    int len = socketChannel.write(bb);
    if (len < 0) {
        throw new EOFException();
    }
}
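This works most of the time. Under high concurrency on a poor network, however, the send buffer can fill up. On a non-blocking channel write() then returns 0, so the loop spins without making progress until space frees up, and CPU utilization can climb to 100%. The rest of this post looks at how several NIO-based frameworks handle the problem.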
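A minimal sketch of the alternative to the busy loop: stop as soon as write() returns 0 and register OP_WRITE so a selector reports when there is room again. The class name WriteInterestSketch, the helper writeOrRegister, and the use of a java.nio.channels.Pipe as a stand-in for a congested socket are assumptions made for illustration and are not taken from any framework discussed here.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.WritableByteChannel;

public class WriteInterestSketch {

    /**
     * Writes what the channel accepts right now. Returns true when bb is
     * fully written; returns false after registering OP_WRITE when the
     * channel accepts no more data, so the caller can go back to select().
     * The channel must already be in non-blocking mode.
     */
    static <C extends SelectableChannel & WritableByteChannel> boolean writeOrRegister(
            C ch, ByteBuffer bb, Selector selector) throws IOException {
        while (bb.hasRemaining()) {
            if (ch.write(bb) == 0) {
                // Send buffer full: stop spinning and wait for OP_WRITE.
                ch.register(selector, SelectionKey.OP_WRITE);
                return false;
            }
        }
        return true;
    }

    /** Hypothetical demo: a non-blocking Pipe plays the congested socket. */
    static boolean demo() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            sink.configureBlocking(false);
            source.configureBlocking(false);

            // Far larger than a pipe buffer, so the write cannot complete.
            ByteBuffer bb = ByteBuffer.allocate(16 * 1024 * 1024);
            boolean done = writeOrRegister(sink, bb, selector);

            // The "peer" drains its side, which frees space for the writer.
            ByteBuffer drain = ByteBuffer.allocate(64 * 1024);
            while (source.read(drain) > 0) {
                drain.clear();
            }

            // The selector should now report the sink as writable.
            int ready = selector.select(1000);
            return !done && bb.hasRemaining() && ready > 0;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("registered OP_WRITE and woke up: " + demo());
    }
}
```

In a real event loop the OP_WRITE interest would be removed again once the buffer is flushed. A socket is writable almost all the time, so leaving the interest set would make select() return immediately on every pass.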
How Spymemcached handles it:
private void handleWrites(SelectionKey sk, MemcachedNode qa)
        throws IOException {
    // Fill the write buffer
    qa.fillWriteBuffer(shouldOptimize);
    boolean canWriteMore = qa.getBytesRemainingToWrite() > 0;
    while (canWriteMore) {
        int wrote = qa.writeSome();
        qa.fillWriteBuffer(shouldOptimize);
        // If wrote is zero, nothing was written: stop trying and let the
        // next pass of the thread's outer loop register the write event
        canWriteMore = wrote > 0 && qa.getBytesRemainingToWrite() > 0;
    }
}

public final int writeSome() throws IOException {
    int wrote = channel.write(wbuf);
    // Subtract the number of bytes written from toWrite
    toWrite -= wrote;
    return wrote;
}

public final int getSelectionOps() {
    int rv = 0;
    if (getChannel().isConnected()) {
        if (hasReadOp()) {
            rv |= SelectionKey.OP_READ;
        }
        // toWrite > 0 means the previous write did not complete for some
        // reason; hasWriteOp() checks whether the write queue still has
        // elements. In both cases the write event must be registered.
        // This article is about the toWrite > 0 case.
        if (toWrite > 0 || hasWriteOp()) {
            rv |= SelectionKey.OP_WRITE;
        }
    } else {
        rv = SelectionKey.OP_CONNECT;
    }
    return rv;
}
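Note: Spymemcached is single-threaded, so it must never block. When the channel turns out not to be writable, it does not hold the thread; it returns immediately and lets the next iteration of the main loop register the write event.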
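The pattern can be reduced to a small, self-contained sketch. This is not Spymemcached's code: the class NonBlockingWriterSketch and the limitedChannel helper (a fake channel that accepts only a fixed number of bytes) are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.WritableByteChannel;

public class NonBlockingWriterSketch {
    private final WritableByteChannel channel;
    private final ByteBuffer wbuf;

    NonBlockingWriterSketch(WritableByteChannel channel, ByteBuffer pending) {
        this.channel = channel;
        this.wbuf = pending;
    }

    /** Writes until done or until the channel stops accepting bytes; never waits. */
    void handleWrites() throws IOException {
        boolean canWriteMore = wbuf.hasRemaining();
        while (canWriteMore) {
            int wrote = channel.write(wbuf);
            // Zero progress means the send buffer is full: give up for now.
            canWriteMore = wrote > 0 && wbuf.hasRemaining();
        }
    }

    /** Interest set the event loop should apply before its next select(). */
    int selectionOps() {
        int ops = SelectionKey.OP_READ;
        if (wbuf.hasRemaining()) {
            ops |= SelectionKey.OP_WRITE; // leftover bytes: ask for writability
        }
        return ops;
    }

    /** Hypothetical stand-in for a socket whose send buffer has room for `capacity` bytes. */
    static WritableByteChannel limitedChannel(final int capacity) {
        return new WritableByteChannel() {
            private int room = capacity;

            @Override
            public int write(ByteBuffer src) {
                int n = Math.min(room, src.remaining());
                src.position(src.position() + n);
                room -= n;
                return n;
            }

            @Override
            public boolean isOpen() {
                return true;
            }

            @Override
            public void close() {
            }
        };
    }

    public static void main(String[] args) throws IOException {
        NonBlockingWriterSketch w =
                new NonBlockingWriterSketch(limitedChannel(10), ByteBuffer.allocate(25));
        w.handleWrites(); // returns after 10 bytes instead of spinning
        System.out.println((w.selectionOps() & SelectionKey.OP_WRITE) != 0); // prints "true"
    }
}
```

The write loop exits on the first zero-byte write, and the leftover bytes show up only as an OP_WRITE bit in the interest set, which is the same split of responsibilities as handleWrites and getSelectionOps above.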
How Netty handles it:
protected void write0(AbstractNioChannel<?> channel) {
    boolean open = true;
    boolean addOpWrite = false;
    boolean removeOpWrite = false;
    boolean iothread = isIoThread(channel);
    long writtenBytes = 0;
    final SocketSendBufferPool sendBufferPool = this.sendBufferPool;
    final WritableByteChannel ch = channel.channel;
    final Queue<MessageEvent> writeBuffer = channel.writeBufferQueue;
    final int writeSpinCount = channel.getConfig().getWriteSpinCount();
    List<Throwable> causes = null;
    synchronized (channel.writeLock) {
        channel.inWriteNowLoop = true;
        for (;;) {
            MessageEvent evt = channel.currentWriteEvent;
            SendBuffer buf = null;
            ChannelFuture future = null;
            try {
                if (evt == null) {
                    if ((channel.currentWriteEvent = evt = writeBuffer.poll()) == null) {
                        // Nothing left to write: the write-event registration must be removed
                        removeOpWrite = true;
                        channel.writeSuspended = false;
                        break;
                    }
                    future = evt.getFuture();
                    channel.currentWriteBuffer = buf = sendBufferPool.acquire(evt.getMessage());
                } else {
                    future = evt.getFuture();
                    buf = channel.currentWriteBuffer;
                }
                long localWrittenBytes = 0;
                // writeSpinCount bounds the number of write attempts; if the data
                // still cannot be written, the write event is registered
                for (int i = writeSpinCount; i > 0; i --) {
                    // Write the data
                    localWrittenBytes = buf.transferTo(ch);
                    // A non-zero count means the write made progress: leave the loop
                    if (localWrittenBytes != 0) {
                        writtenBytes += localWrittenBytes;
                        break;
                    }
                    // If all of buf has been written, leave the loop
                    if (buf.finished()) {
                        break;
                    }
                }
                if (buf.finished()) {
                    // Successful write - proceed to the next message.
                    buf.release();
                    channel.currentWriteEvent = null;
                    channel.currentWriteBuffer = null;
                    // Mark the event object for garbage collection.
                    //noinspection UnusedAssignment
                    evt = null;
                    buf = null;
                    future.setSuccess();
                } else {
                    // Not written fully - perhaps the kernel buffer is full.
                    addOpWrite = true;
                    channel.writeSuspended = true;
                    if (writtenBytes > 0) {
                        // Notify progress listeners if necessary.
                        future.setProgress(
                                localWrittenBytes,
                                buf.writtenBytes(), buf.totalBytes());
                    }
                    break;
                }
            } // the catch blocks are not shown in this excerpt
        }
        channel.inWriteNowLoop = false;
        if (open) {
            if (addOpWrite) {
                // Register the write event
                setOpWrite(channel);
            } else if (removeOpWrite) {
                // Remove the write event
                clearOpWrite(channel);
            }
        }
    }
}
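Note: Netty is multithreaded, so it can afford to keep a thread busy for a short while waiting for the channel to become writable. The wait is a bounded spin: it retries the write up to writeSpinCount times and, if the buffer is still not fully written, registers OP_WRITE.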
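The bounded spin can be isolated in a few lines. The sketch below is not Netty's implementation: WriteSpinSketch, needsOpWrite, and the stallingChannel test double (a channel that returns 0 for a given number of calls before accepting data) are hypothetical names used only to illustrate the idea.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

public class WriteSpinSketch {

    /**
     * Attempts the write up to writeSpinCount times, stopping at the first
     * attempt that makes progress. Returns true if data is still pending,
     * in which case the caller should register OP_WRITE.
     */
    static boolean needsOpWrite(WritableByteChannel ch, ByteBuffer buf, int writeSpinCount)
            throws IOException {
        for (int i = writeSpinCount; i > 0; i--) {
            int written = ch.write(buf);
            // Stop spinning once a write made progress or nothing is left.
            if (written != 0 || !buf.hasRemaining()) {
                break;
            }
        }
        return buf.hasRemaining();
    }

    /** Hypothetical channel that returns 0 for the first `stalls` writes, then accepts everything. */
    static WritableByteChannel stallingChannel(final int stalls) {
        return new WritableByteChannel() {
            private int left = stalls;

            @Override
            public int write(ByteBuffer src) {
                if (left > 0) {
                    left--;
                    return 0;
                }
                int n = src.remaining();
                src.position(src.limit());
                return n;
            }

            @Override
            public boolean isOpen() {
                return true;
            }

            @Override
            public void close() {
            }
        };
    }

    public static void main(String[] args) throws IOException {
        // Recovers within the spin budget: no OP_WRITE needed.
        System.out.println(needsOpWrite(stallingChannel(3), ByteBuffer.allocate(8), 16));   // prints "false"
        // Stays full longer than the budget: fall back to OP_WRITE.
        System.out.println(needsOpWrite(stallingChannel(100), ByteBuffer.allocate(8), 16)); // prints "true"
    }
}
```

The spin count trades a little CPU for avoiding a selector round trip when the send buffer frees up almost immediately. Because the number of attempts is capped, a channel that stays full cannot pin the thread the way the unbounded loop does.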
How Grizzly (which originated in the GlassFish project) handles it:
public static long flushChannel(SocketChannel socketChannel, ByteBuffer bb, long writeTimeout)
        throws IOException {
    SelectionKey key = null;
    Selector writeSelector = null;
    int attempts = 0;
    int bytesProduced = 0;
    try {
        while (bb.hasRemaining()) {
            int len = socketChannel.write(bb);
            // Similar to Netty's spinCount
            attempts++;
            if (len < 0) {
                throw new EOFException();
            }
            bytesProduced += len;
            if (len == 0) {
                if (writeSelector == null) {
                    // Obtain a new selector
                    writeSelector = SelectorFactory.getSelector();
                    if (writeSelector == null) {
                        // Continue using the main one
                        continue;
                    }
                }
                // Register the write event on the new selector, not on the main selector
                key = socketChannel.register(writeSelector, SelectionKey.OP_WRITE);
                // writeSelector.select() blocks the current thread until the channel
                // becomes writable; the total wait is at most 3 * writeTimeout
                if (writeSelector.select(writeTimeout) == 0) {
                    if (attempts > 2)
                        throw new IOException("Client disconnected");
                } else {
                    attempts--;
                }
            } else {
                attempts = 0;
            }
        }
    } finally {
        // The excerpt omits the cleanup; this block is an assumption about
        // what it looks like: cancel the key and return the selector.
        if (key != null) {
            key.cancel();
        }
        if (writeSelector != null) {
            writeSelector.selectNow();
            SelectorFactory.returnSelector(writeSelector);
        }
    }
    return bytesProduced;
}
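Note: Grizzly is multithreaded, so it can block for a reasonable time. It does not register the write event on the main selector; it registers it on a separately obtained selector and calls select() with a timeout to block until the channel becomes writable.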
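Why do it this way? The Grizzly author's response:
1. The temporary Selector is there to reduce thread switching. The main Selector normally handles OP_ACCEPT and OP_READ, so using a temporary Selector lightens its load. Registering on the main Selector would require a thread switch and cause unnecessary system calls. Avoiding that frequent switching between threads improves performance.
2. Although writeSelector.select(writeTimeout) blocks, this only happens in a few extreme situations. Most clients do not hit it often, so few threads are blocked at any one time.
3. The blocking call is also used to detect client connections that were dropped abnormally.
4. Stress tests showed that this implementation performs very well.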
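The temporary-selector technique can be tried in isolation with the sketch below. It is a simplified illustration rather than Grizzly's code: TemporarySelectorSketch and flush are hypothetical names, the selector is opened and closed on demand instead of being borrowed from a pool, a single timeout aborts the write, and a java.nio.channels.Pipe with a reader thread stands in for a slow client.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.WritableByteChannel;

public class TemporarySelectorSketch {

    /** Flushes bb, blocking the calling thread on a private selector whenever the channel is full. */
    static <C extends SelectableChannel & WritableByteChannel> long flush(
            C ch, ByteBuffer bb, long timeoutMillis) throws IOException {
        long total = 0;
        Selector writeSelector = null;
        SelectionKey key = null;
        try {
            while (bb.hasRemaining()) {
                int len = ch.write(bb);
                total += len;
                if (len == 0) {
                    if (writeSelector == null) {
                        // Simplification: opened on demand instead of taken from a pool.
                        writeSelector = Selector.open();
                        key = ch.register(writeSelector, SelectionKey.OP_WRITE);
                    }
                    if (writeSelector.select(timeoutMillis) == 0) {
                        throw new IOException("write timed out");
                    }
                    // Clear the selected set so the next select() reports readiness again.
                    writeSelector.selectedKeys().clear();
                }
            }
            return total;
        } finally {
            if (key != null) {
                key.cancel();
            }
            if (writeSelector != null) {
                writeSelector.close();
            }
        }
    }

    /** Hypothetical demo: pushes 1 MiB through a pipe while another thread reads it. */
    static long demo() throws Exception {
        final int size = 1 << 20;
        Pipe pipe = Pipe.open();
        final Pipe.SourceChannel source = pipe.source();
        Thread reader = new Thread(() -> {
            ByteBuffer in = ByteBuffer.allocate(8 * 1024);
            try {
                long seen = 0;
                while (seen < size) {
                    int n = source.read(in);
                    if (n < 0) {
                        break; // writer closed early
                    }
                    seen += n;
                    in.clear();
                }
            } catch (IOException ignored) {
                // demo only: a read failure just ends the reader
            }
        });
        reader.start();
        try (Pipe.SinkChannel sink = pipe.sink()) {
            sink.configureBlocking(false); // required before register()
            return flush(sink, ByteBuffer.allocate(size), 1000);
        } finally {
            reader.join();
            source.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("flushed " + demo() + " bytes");
    }
}
```

Only the thread that is writing blocks, and it blocks on a selector nobody else uses, so the main selector's key set and its thread are left untouched.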