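Memory leaks caused by send-queue backlog
The previous article simulated high-concurrency message sending and analyzed the memory leak that occurs when no high water mark is set. Backlog is not limited to high concurrency, however; it can build up in other scenarios as well.
Other factors that can cause send-queue backlog
Even when traffic is light, messages can still pile up. Possible scenarios include:
- Network bottleneck: when the send rate exceeds what the network link can handle, the send queue backs up.
- Slow peer: when the peer reads more slowly than the local side sends, the local TCP send buffer fills up, writes frequently return 0 bytes, and pending messages queue up in Netty's send queue.
Once a large number of messages are queued, Netty's direct memory can easily leak. The example code is modified below to simulate such a direct-memory leak.
Client-side changes
The client sends one message every 1 ms and the server does not read from the network, so the client's send queue backs up.
The modified client code is as follows: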
import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.util.ReferenceCountUtil;

import java.util.concurrent.TimeUnit;

public class LoadRunnerSleepClientHandler extends ChannelInboundHandlerAdapter {
    private final ByteBuf firstMessage;
    Runnable loadRunner;
    static final int SIZE = Integer.parseInt(System.getProperty("size", "10240"));

    public LoadRunnerSleepClientHandler() {
        firstMessage = Unpooled.buffer(SIZE);
        for (int i = 0; i < firstMessage.capacity(); i++) {
            firstMessage.writeByte((byte) i);
        }
    }

    @Override
    public void channelActive(final ChannelHandlerContext ctx) {
        loadRunner = new Runnable() {
            @Override
            public void run() {
                try {
                    // Wait 30 s before starting to send
                    TimeUnit.SECONDS.sleep(30);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                ByteBuf msg = null;
                while (true) {
                    byte[] body = new byte[SIZE];
                    msg = Unpooled.wrappedBuffer(body);
                    // Channel writability is not checked here, so messages keep queueing
                    ctx.writeAndFlush(msg);
                    try {
                        // Simulate sending one message every 1 ms
                        TimeUnit.MILLISECONDS.sleep(1);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }
        };
        new Thread(loadRunner, "LoadRunner-Thread").start();
    }

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) {
        ReferenceCountUtil.release(msg);
    }

    @Override
    public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
        cause.printStackTrace();
        ctx.close();
    }
}
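import io.netty.bootstrap.Bootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioSocketChannel;

public class LoadRunnerClient {
    static final String HOST = System.getProperty("host", "127.0.0.1");
    static final int PORT = Integer.parseInt(System.getProperty("port", "8080"));

    @SuppressWarnings({"unchecked", "deprecation"})
    public static void main(String[] args) throws Exception {
        EventLoopGroup group = new NioEventLoopGroup();
        try {
            Bootstrap b = new Bootstrap();
            b.group(group)
                    .channel(NioSocketChannel.class)
                    .option(ChannelOption.TCP_NODELAY, true)
                    // Set the write-buffer high water mark to 10 MB
                    // (this option is deprecated in Netty 4.1 in favor of WRITE_BUFFER_WATER_MARK)
                    .option(ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK, 10 * 1024 * 1024)
                    .handler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        public void initChannel(SocketChannel ch) throws Exception {
                            ChannelPipeline p = ch.pipeline();
                            p.addLast(new LoadRunnerSleepClientHandler());
                        }
                    });
            ChannelFuture f = b.connect(HOST, PORT).sync();
            f.channel().closeFuture().sync();
        } finally {
            group.shutdownGracefully();
        }
    }
}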
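Server code
A breakpoint is set where the server receives messages, to simulate a slow server that does not read from the network. Because the server does not read, the client's send queue backs up.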
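import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.logging.LogLevel;
import io.netty.handler.logging.LoggingHandler;

public final class LoadRunnerServer {
    static final int PORT = Integer.parseInt(System.getProperty("port", "8080"));

    public static void main(String[] args) throws Exception {
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
                    .channel(NioServerSocketChannel.class)
                    .option(ChannelOption.SO_BACKLOG, 100)
                    .handler(new LoggingHandler(LogLevel.INFO))
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        public void initChannel(SocketChannel ch) throws Exception {
                            ChannelPipeline p = ch.pipeline();
                            p.addLast(new EchoServerHandler());
                        }
                    });
            ChannelFuture f = b.bind(PORT).sync();
            f.channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}

class EchoServerHandler extends ChannelInboundHandlerAdapter {
    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) {
        // Set the breakpoint here to stall the NioEventLoop thread
        ctx.write(msg);
    }

    @Override
    public void channelReadComplete(ChannelHandlerContext ctx) {
        ctx.flush();
    }

    @Override
    public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
        // Close the connection when an exception occurs
        cause.printStackTrace();
        ctx.close();
    }
}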
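Set a breakpoint in the server's channelRead to simulate a blocked NioEventLoop thread. Because Netty converts heap buffers to direct buffers when sending, heap monitoring does not directly show how direct memory is allocated and used. After the client has run for a while, set a breakpoint in the write method of AbstractChannel.AbstractUnsafe on the client and inspect how much has piled up in the send queue, ChannelOutboundBuffer.
Commands such as netstat -ano show the backlog in the TCP receive and send queues of a given port. A large backlog in the local send queue means that message sending or receiving has hit a bottleneck, which must be resolved promptly so that the Netty send queue does not back up and leak memory. In day-to-day monitoring, metrics such as the number of Netty connections and the network read/write rates should be fed into the monitoring system, with alerts raised as soon as a problem is detected.
How Netty sends messages: mechanism and source analysis
After business code calls write, the message passes through the ChannelPipeline chain of responsibility and is placed in the send buffer to await sending. Calling flush triggers the actual send: underneath, Netty performs a non-blocking write on the Java NIO SocketChannel to put the message on the network.
Message sending in Netty has to take the following into account, which makes the implementation fairly complex:
- thread switching
- the message queue
- high/low water marks and partial (half-packet) writes
WriteAndFlushTask: principle and source analysis
To maximize performance, Netty uses a serial, lock-free design: operations run serially inside the I/O thread, which avoids the performance loss caused by multi-thread contention. On the surface, a serial design seems to use the CPU poorly and to lack concurrency. However, by tuning the thread count of the NIO thread pool, several serial threads can run in parallel, and this locally lock-free serial-thread design performs better than a "one queue, many worker threads" model.
When a user thread initiates a write, Netty checks whether the caller is the NioEventLoop (I/O) thread. If it is not, the message is wrapped in a WriteTask and put on the NioEventLoop's task queue, to be executed by the NioEventLoop thread. The code is in the AbstractChannelHandlerContext class: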
private void write(Object msg, boolean flush, ChannelPromise promise) {
    AbstractChannelHandlerContext next = findContextOutbound();
    final Object m = pipeline.touch(msg, next);
    EventExecutor executor = next.executor();
    if (executor.inEventLoop()) {
        // Already on the I/O thread: invoke the next handler directly
        if (flush) {
            next.invokeWriteAndFlush(m, promise);
        } else {
            next.invokeWrite(m, promise);
        }
    } else {
        // Called from another thread: create a write task
        AbstractWriteTask task;
        if (flush) {
            task = WriteAndFlushTask.newInstance(next, m, promise);
        } else {
            task = WriteTask.newInstance(next, m, promise);
        }
        // Hand the task to NioEventLoop.execute. A submission from an external thread wakes up the
        // blocked selector; the first call also starts the thread that backs the NioEventLoop.
        safeExecute(executor, task, promise, m);
    }
}
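The branch above can be imitated with nothing but the JDK. The following is a minimal sketch, not Netty code: MiniEventLoop, write, snapshot and demo are hypothetical names, and a single-threaded executor stands in for the NioEventLoop. A write issued on the loop thread is applied directly; a write from any other thread is wrapped in a task and queued, so the pending list is only ever touched by one thread and needs no lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for an event loop: one thread owns the pending list.
class MiniEventLoop {
    private volatile Thread loopThread;
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "mini-event-loop");
        t.setDaemon(true);   // do not keep the JVM alive for the demo
        loopThread = t;
        return t;
    });
    // Only accessed from the loop thread, so no synchronization is needed.
    private final List<String> pending = new ArrayList<>();

    boolean inEventLoop() {
        return Thread.currentThread() == loopThread;
    }

    // Simplified version of the inEventLoop() branch shown above.
    void write(String msg) {
        if (inEventLoop()) {
            pending.add(msg);                          // on the loop thread: run inline
        } else {
            executor.execute(() -> pending.add(msg));  // other thread: queue a task
        }
    }

    // Copies the list on the loop thread so callers never read it directly.
    List<String> snapshot() {
        Callable<List<String>> copy = () -> new ArrayList<>(pending);
        try {
            return executor.submit(copy).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Four producer threads write 100 messages each; returns how many arrived.
    static int demo() {
        MiniEventLoop loop = new MiniEventLoop();
        Thread[] producers = new Thread[4];
        for (int i = 0; i < producers.length; i++) {
            final int id = i;
            producers[i] = new Thread(() -> {
                for (int n = 0; n < 100; n++) {
                    loop.write("t" + id + "-" + n);
                }
            });
            producers[i].start();
        }
        for (Thread t : producers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return loop.snapshot().size();
    }

    public static void main(String[] args) {
        System.out.println("messages queued: " + demo());
    }
}
```

The sketch also shows the downside discussed in this article: if producers submit faster than the single loop thread can drain, the executor's task queue grows without bound.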
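The NioEventLoop thread maintains an internal Queue<Runnable> taskQueue. Besides handling network I/O reads and writes, it also executes the tasks related to them. The code is in the SingleThreadEventExecutor class: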
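// Only enqueues the task; it is run later by the event loop thread
public void execute(Runnable task) {
    if (task == null) {
        throw new NullPointerException("task");
    }
    // Is the caller the event loop thread itself?
    boolean inEventLoop = inEventLoop();
    // Add the task to the task queue
    addTask(task);
    // Called from another thread: make sure the event loop thread has been started
    if (!inEventLoop) {
        startThread();
        if (isShutdown() && removeTask(task)) {
            reject();
        }
    }
    // Wake up the event loop if adding a task does not wake it by itself
    if (!addTaskWakesUp && wakesUpForTask(task)) {
        wakeup(inEventLoop);
    }
}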
The NioEventLoop iterates over taskQueue and runs the send tasks. The code is in the AbstractWriteTask class:
@Override
public final void run() {
    try {
        // Subtract the task's estimated size, which was added to the pending outbound bytes on submit
        if (ESTIMATE_TASK_SIZE_ON_SUBMIT) {
            ctx.pipeline.decrementPendingOutboundBytes(size);
        }
        write(ctx, msg, promise);
    } finally {
        // Null out the references so the GC can reclaim them early (in the young generation)
        ctx = null;
        msg = null;
        promise = null;
        handle.recycle(this);
    }
}
The write is then handled by the write method of AbstractChannel.AbstractUnsafe:
public final void write(Object msg, ChannelPromise promise) {
    assertEventLoop();
    // ChannelOutboundBuffer holds the data waiting to be sent
    ChannelOutboundBuffer outboundBuffer = this.outboundBuffer;
    if (outboundBuffer == null) {
        // If the outboundBuffer is null we know the channel was closed and so
        // need to fail the future right away. If it is not null the handling of the rest
        // will be done in flush0()
        // See https://github.com/netty/netty/issues/2362
        safeSetFailure(promise, WRITE_CLOSED_CHANNEL_EXCEPTION);
        // release message now to prevent resource-leak
        ReferenceCountUtil.release(msg);
        return;
    }
    int size;
    try {
        // A ByteBuf is converted to a direct ByteBuf where possible; a FileRegion passes through
        // unchanged; any other message type is rejected
        msg = filterOutboundMessage(msg);
        // Estimate the size of the data to be sent
        size = pipeline.estimatorHandle().size(msg);
        // Treat a negative size as 0
        if (size < 0) {
            size = 0;
        }
    } catch (Throwable t) {
        safeSetFailure(promise, t);
        ReferenceCountUtil.release(msg);
        return;
    }
    // Append msg to the linked list inside outboundBuffer
    outboundBuffer.addMessage(msg, size, promise);
}
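After some internal processing, the addMessage method of ChannelOutboundBuffer is eventually called and the message is appended to the linked list that forms the send queue. From the analysis above we can conclude:
- Concurrent calls to the write methods from multiple business threads are thread-safe: Netty wraps each message in a task that the I/O thread executes asynchronously.
- A Channel is served by a single NioEventLoop thread, so if concurrent writes to one Channel exceed what that thread can process, WriteTasks pile up.
- The NioEventLoop thread handles both network reads and writes and the tasks registered on it. The two affect each other: heavy network I/O or too many registered tasks delays the other and causes performance problems.
Source analysis of writing and sending
The details of ChannelOutboundBuffer are left for the reader to explore.
Limiting the number of write attempts
When the SocketChannel cannot write all pending ByteBuf/ByteBuffer data to the network in one go, Netty has to decide whether to register OP_WRITE and continue on the next Selector poll, or to keep looping in place until everything has been sent. Registering too often hurts performance. On the other hand, if the TCP send buffer is full and the connection is merely being kept alive, nothing can be sent; without a cap on the number of loop iterations, the Reactor thread would stay stuck sending and could not read other messages or run queued tasks in time. Netty therefore takes a middle path.
If the number of bytes written in an attempt is greater than 0 but the message is not yet fully sent, Netty keeps looping. As soon as a write returns 0 bytes, the TCP buffer is full and further attempts are pointless, so Netty registers SelectionKey.OP_WRITE, exits the loop, and continues sending in the next Selector polling cycle.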
// NioSocketChannel
protected void doWrite(ChannelOutboundBuffer in) throws Exception {
    SocketChannel ch = javaChannel();
    int writeSpinCount = config().getWriteSpinCount();
    do {
        if (in.isEmpty()) {
            // All written so clear OP_WRITE
            clearOpWrite();
            // Directly return here so incompleteWrite(...) is not called.
            return;
        }
        // Ensure the pending writes are made of ByteBufs only.
        int maxBytesPerGatheringWrite = ((NioSocketChannelConfig) config).getMaxBytesPerGatheringWrite();
        ByteBuffer[] nioBuffers = in.nioBuffers(1024, maxBytesPerGatheringWrite);
        // Number of ByteBuffers waiting to be sent
        int nioBufferCnt = in.nioBufferCount();
        // Always use nioBuffers() to workaround data-corruption.
        // See https://github.com/netty/netty/issues/2761
        switch (nioBufferCnt) {
            case 0:
                // We have something else beside ByteBuffers to write so fallback to normal writes.
                writeSpinCount -= doWrite0(in);
                break;
            case 1: {
                // Only one ByteBuf so use non-gathering write
                // Zero length buffers are not added to nioBuffers by ChannelOutboundBuffer, so there is no need
                // to check if the total size of all the buffers is non-zero.
                // Take the pending message directly from nioBuffers[0]
                ByteBuffer buffer = nioBuffers[0];
                int attemptedBytes = buffer.remaining();
                // Perform the write
                final int localWrittenBytes = ch.write(buffer);
                if (localWrittenBytes <= 0) {
                    // Nothing was written: the send buffer is full, so register OP_WRITE
                    incompleteWrite(true);
                    return;
                }
                adjustMaxBytesPerGatheringWrite(attemptedBytes, localWrittenBytes, maxBytesPerGatheringWrite);
                in.removeBytes(localWrittenBytes);
                --writeSpinCount;
                break;
            }
            default: {
                // Zero length buffers are not added to nioBuffers by ChannelOutboundBuffer, so there is no need
                // to check if the total size of all the buffers is non-zero.
                // We limit the max amount to int above so cast is safe
                long attemptedBytes = in.nioBufferSize();
                final long localWrittenBytes = ch.write(nioBuffers, 0, nioBufferCnt);
                if (localWrittenBytes <= 0) {
                    incompleteWrite(true);
                    return;
                }
                // Casting to int is safe because we limit the total amount of data in the nioBuffers to int above.
                adjustMaxBytesPerGatheringWrite((int) attemptedBytes, (int) localWrittenBytes,
                        maxBytesPerGatheringWrite);
                in.removeBytes(localWrittenBytes);
                --writeSpinCount;
                break;
            }
        }
        // Keep looping while the spin budget is not used up
    } while (writeSpinCount > 0);
    // A negative count means doWrite0 reported a full send buffer, so OP_WRITE is registered;
    // otherwise the budget simply ran out and the flush is rescheduled as a task
    incompleteWrite(writeSpinCount < 0);
}
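The control flow of this loop is easier to see without the NIO details. Below is a minimal sketch in plain Java, assuming a hypothetical FakeSocket whose send buffer has a fixed amount of free space and never drains; WriteSpinSketch, Outcome and queueOf are illustrative names as well, not Netty types. The three outcomes correspond to the three ways doWrite can end: the queue is empty, a write returns 0 bytes, or the spin budget is used up.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

class WriteSpinSketch {
    enum Outcome { ALL_WRITTEN, SEND_BUFFER_FULL, SPIN_BUDGET_EXHAUSTED }

    // Hypothetical socket: its send buffer has `free` bytes of room and never drains.
    static final class FakeSocket {
        private int free;

        FakeSocket(int free) {
            this.free = free;
        }

        // Consumes as much of src as fits, like a non-blocking write; may return 0.
        int write(ByteBuffer src) {
            int n = Math.min(src.remaining(), free);
            src.position(src.position() + n);
            free -= n;
            return n;
        }
    }

    static Outcome doWrite(Deque<ByteBuffer> queue, FakeSocket socket, int writeSpinCount) {
        do {
            ByteBuffer head = queue.peek();
            if (head == null) {
                return Outcome.ALL_WRITTEN;          // nothing left to send
            }
            int written = socket.write(head);
            if (written <= 0) {
                return Outcome.SEND_BUFFER_FULL;     // wait for OP_WRITE instead of spinning
            }
            if (!head.hasRemaining()) {
                queue.poll();                        // fully sent: drop it from the queue
            }
            --writeSpinCount;
        } while (writeSpinCount > 0);
        return Outcome.SPIN_BUDGET_EXHAUSTED;        // yield so other work can run
    }

    // Builds a queue of `count` buffers, each with `size` bytes waiting to be sent.
    static Deque<ByteBuffer> queueOf(int count, int size) {
        Deque<ByteBuffer> queue = new ArrayDeque<>();
        for (int i = 0; i < count; i++) {
            queue.add(ByteBuffer.allocate(size));
        }
        return queue;
    }

    public static void main(String[] args) {
        System.out.println(doWrite(queueOf(3, 10), new FakeSocket(100), 16));
        System.out.println(doWrite(queueOf(3, 10), new FakeSocket(15), 16));
        System.out.println(doWrite(queueOf(5, 10), new FakeSocket(1000), 2));
    }
}
```

In the second call the socket accepts only 15 of the 30 bytes, so one buffer is removed, one is left half sent, and the loop stops on the first zero-byte write rather than burning the remaining spin budget.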
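Message sending strategies
There are three sending strategies:
- If exactly one ByteBuffer is pending, it is taken from nioBuffers[0] and sent directly through the JDK SocketChannel (case 1 in the code above).
- If more than one ByteBuffer is pending, the SocketChannel's gathering-write interface is called to write the nioBuffers array to the TCP send buffer (the default branch in the code above).
- If the pending messages contain no JDK ByteBuffer at all, the doWrite0 method of the parent class AbstractNioByteChannel is called to send the Netty ByteBuf (or FileRegion) to the TCP buffer (case 0 in the code above).
Releasing the memory of sent messages
Once a message has been sent successfully, Netty releases its memory. The release strategy depends on the kind of object sent:
- If the object sent is a JDK ByteBuffer, the number of sent objects to release is calculated from the number of bytes written. The code is in ChannelOutboundBuffer: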
public void removeBytes(long writtenBytes) {
    for (;;) {
        Object msg = current();
        if (!(msg instanceof ByteBuf)) {
            assert writtenBytes == 0;
            break;
        }
        final ByteBuf buf = (ByteBuf) msg;
        // Start of the readable region
        final int readerIndex = buf.readerIndex();
        // Number of readable bytes
        final int readableBytes = buf.writerIndex() - readerIndex;
        // Written bytes cover all readable bytes: this buffer has been sent completely
        if (readableBytes <= writtenBytes) {
            if (writtenBytes != 0) {
                // Update the send progress of ChannelOutboundBuffer
                progress(readableBytes);
                // Subtract this message's bytes and move on to the next message,
                // until every fully sent message has been removed
                writtenBytes -= readableBytes;
            }
            remove();
        } else { // readableBytes > writtenBytes
            // Partially sent: advance the reader index and keep the buffer in the queue
            if (writtenBytes != 0) {
                buf.readerIndex(readerIndex + (int) writtenBytes);
                progress(writtenBytes);
            }
            break;
        }
    }
    clearNioBuffers();
}
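- If the object sent is a Netty ByteBuf, the result of the send is determined from the ByteBuf's isReadable method. Once the send is complete, the remove method of ChannelOutboundBuffer is called to delete and release the ByteBuf. The code is in the AbstractNioByteChannel class: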
private int doWriteInternal(ChannelOutboundBuffer in, Object msg) throws Exception {
    // The message is a Netty ByteBuf
    if (msg instanceof ByteBuf) {
        ByteBuf buf = (ByteBuf) msg;
        // Nothing left to read means the buffer has already been sent
        if (!buf.isReadable()) {
            // Remove and release it
            in.remove();
            return 0;
        }
        final int localFlushedAmount = doWriteBytes(buf);
        if (localFlushedAmount > 0) {
            in.progress(localFlushedAmount);
            if (!buf.isReadable()) {
                in.remove();
            }
            return 1;
        }
    } else if (msg instanceof FileRegion) {
        FileRegion region = (FileRegion) msg;
        if (region.transferred() >= region.count()) {
            in.remove();
            return 0;
        }
        // Write the file region to the socket
        long localFlushedAmount = doWriteFileRegion(region);
        if (localFlushedAmount > 0) {
            in.progress(localFlushedAmount);
            if (region.transferred() >= region.count()) {
                in.remove();
            }
            return 1;
        }
    } else {
        // Should not reach here.
        throw new Error();
    }
    // Nothing could be written: report that the send buffer is full
    return WRITE_STATUS_SNDBUF_FULL;
}
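The bookkeeping in removeBytes can be tried out with JDK buffers alone. The sketch below is an illustration under stated assumptions, not Netty's implementation: RemoveBytesSketch is a hypothetical class, a Deque of java.nio.ByteBuffer stands in for the ChannelOutboundBuffer linked list, and the buffer's position plays the role of ByteBuf's readerIndex. Given the number of bytes the socket accepted, fully sent buffers are dropped from the head and a partially sent one has its position advanced.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

class RemoveBytesSketch {
    // Removes fully written buffers from the head of the queue and advances the
    // position of a partially written one. Returns how many buffers were removed.
    static int removeBytes(Deque<ByteBuffer> queue, long writtenBytes) {
        int removed = 0;
        while (!queue.isEmpty()) {
            ByteBuffer buf = queue.peek();
            int readableBytes = buf.remaining();
            if (readableBytes <= writtenBytes) {
                // The whole buffer went out: account for it and drop it
                writtenBytes -= readableBytes;
                queue.poll();
                removed++;
            } else {
                // Only part of the buffer went out: skip the bytes already sent
                buf.position(buf.position() + (int) writtenBytes);
                break;
            }
        }
        return removed;
    }

    // Builds a queue of `count` buffers, each with `size` bytes waiting to be sent.
    static Deque<ByteBuffer> queueOf(int count, int size) {
        Deque<ByteBuffer> queue = new ArrayDeque<>();
        for (int i = 0; i < count; i++) {
            queue.add(ByteBuffer.allocate(size));
        }
        return queue;
    }

    public static void main(String[] args) {
        Deque<ByteBuffer> queue = queueOf(3, 4);
        int removed = removeBytes(queue, 6);
        System.out.println("removed=" + removed + ", left=" + queue.size()
                + ", head remaining=" + queue.peek().remaining());
    }
}
```

With three 4-byte buffers and 6 bytes written, the first buffer is removed and the second keeps its last 2 bytes for the next write attempt.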
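Partial writes (write half-packet)
If the pending messages cannot all be written to the TCP buffer in one attempt, and either writeSpinCount loop iterations have still not finished the job or a TCP write returns 0 bytes, Netty enters "write half-packet" mode. The aim is to avoid spinning in the send loop when sending is slow, which would block the NioEventLoop thread. When a write returns 0 bytes, Netty registers SelectionKey.OP_WRITE with the corresponding Selector, exits the loop, and continues the write on the next Selector poll. In the doWrite version shown above, merely exhausting the spin budget ends in incompleteWrite(false), which reschedules the flush as a task instead of registering OP_WRITE. See NioSocketChannel.doWrite above: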
// Attempt the write inside the loop
final int localWrittenBytes = ch.write(buffer);
if (localWrittenBytes <= 0) {
    incompleteWrite(true);
    return;
}
// Rest of the writeSpinCount loop omitted
--writeSpinCount;

// AbstractNioByteChannel.setOpWrite: registers SelectionKey.OP_WRITE
protected final void setOpWrite() {
    final SelectionKey key = selectionKey();
    // Check first if the key is still valid as it may be canceled as part of the deregistration
    // from the EventLoop
    // See https://github.com/netty/netty/issues/2104
    if (!key.isValid()) {
        return;
    }
    final int interestOps = key.interestOps();
    // Only touch the key if OP_WRITE is not already set
    if ((interestOps & SelectionKey.OP_WRITE) == 0) {
        key.interestOps(interestOps | SelectionKey.OP_WRITE);
    }
}
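The interest set manipulated by setOpWrite is a plain bitmask, which the following small sketch makes explicit. InterestOpsSketch and its methods are hypothetical helpers that work on an int rather than a live SelectionKey; only the SelectionKey.OP_WRITE constant comes from the JDK.

```java
import java.nio.channels.SelectionKey;

class InterestOpsSketch {
    // Adds OP_WRITE to an interest set; a no-op if the bit is already present.
    static int setOpWrite(int interestOps) {
        return interestOps | SelectionKey.OP_WRITE;
    }

    // Removes OP_WRITE, as is done once the send queue has been drained.
    static int clearOpWrite(int interestOps) {
        return interestOps & ~SelectionKey.OP_WRITE;
    }

    static boolean hasOpWrite(int interestOps) {
        return (interestOps & SelectionKey.OP_WRITE) != 0;
    }

    public static void main(String[] args) {
        int ops = SelectionKey.OP_READ;
        ops = setOpWrite(ops);
        System.out.println("write interest after set: " + hasOpWrite(ops));
        ops = clearOpWrite(ops);
        System.out.println("write interest after clear: " + hasOpWrite(ops));
    }
}
```

Clearing the bit matters as much as setting it: a socket is writable most of the time, so leaving OP_WRITE registered would make the Selector return immediately on every poll.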
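High water mark control for message sending
To control the send rate and the amount of backlog, Netty provides a high/low water mark mechanism. When the total number of bytes of pending messages in the queue exceeds the high water mark, the Channel's state is changed to not writable.
The code is in ChannelOutboundBuffer.incrementPendingOutboundBytes: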
private void incrementPendingOutboundBytes(long size, boolean invokeLater) {
    if (size == 0) {
        return;
    }
    long newWriteBufferSize = TOTAL_PENDING_SIZE_UPDATER.addAndGet(this, size);
    // Compare with the high water mark, which the client code above set to 10 MB
    if (newWriteBufferSize > channel.config().getWriteBufferHighWaterMark()) {
        // Mark the Channel as not writable
        setUnwritable(invokeLater);
    }
}
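After changing the Channel's state, Netty fires a notification event through the ChannelPipeline. Business code can listen for this event and query whether the link is writable. The code is in ChannelOutboundBuffer.fireChannelWritabilityChanged: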
private void fireChannelWritabilityChanged(boolean invokeLater) {
    final ChannelPipeline pipeline = channel.pipeline();
    if (invokeLater) {
        Runnable task = fireChannelWritabilityChangedTask;
        if (task == null) {
            fireChannelWritabilityChangedTask = task = new Runnable() {
                @Override
                public void run() {
                    // Fire the writability-changed event
                    pipeline.fireChannelWritabilityChanged();
                }
            };
        }
        // Fire the event later, as a task on the event loop
        channel.eventLoop().execute(task);
    } else {
        // Fire the writability-changed event immediately
        pipeline.fireChannelWritabilityChanged();
    }
}
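After messages have been sent, the low water mark is checked: if the number of pending bytes has dropped below the low water mark, the Channel's state is changed back to writable and a notification event is fired. The code is in ChannelOutboundBuffer.decrementPendingOutboundBytes: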
private void decrementPendingOutboundBytes(long size, boolean invokeLater, boolean notifyWritability) {
    if (size == 0) {
        return;
    }
    long newWriteBufferSize = TOTAL_PENDING_SIZE_UPDATER.addAndGet(this, -size);
    // Pending bytes have dropped below the low water mark: mark the Channel writable again
    if (notifyWritability && newWriteBufferSize < channel.config().getWriteBufferLowWaterMark()) {
        setWritable(invokeLater);
    }
}
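The interplay of the two thresholds can be modeled in a few lines. The following is a simplified, single-threaded sketch and not Netty's implementation: WatermarkSketch and all of its members are hypothetical names, and the 32 and 64 byte marks in main are illustrative values only. It also includes a trySend method showing what a well-behaved sender does, namely refusing to queue more data while the channel is not writable.

```java
// Hypothetical, single-threaded model of high/low water mark bookkeeping.
class WatermarkSketch {
    private final long lowWaterMark;
    private final long highWaterMark;
    private long pendingBytes;
    private boolean writable = true;
    private int writabilityChanges;   // how often a "writability changed" event would fire

    WatermarkSketch(long lowWaterMark, long highWaterMark) {
        if (lowWaterMark < 0 || highWaterMark < lowWaterMark) {
            throw new IllegalArgumentException("need 0 <= low <= high");
        }
        this.lowWaterMark = lowWaterMark;
        this.highWaterMark = highWaterMark;
    }

    // Called when a message is queued for sending.
    void increment(long size) {
        pendingBytes += size;
        if (pendingBytes > highWaterMark) {
            setWritable(false);
        }
    }

    // Called when queued bytes have been written to the socket.
    void decrement(long size) {
        pendingBytes -= size;
        if (pendingBytes < lowWaterMark) {
            setWritable(true);
        }
    }

    private void setWritable(boolean value) {
        if (writable != value) {
            writable = value;
            writabilityChanges++;
        }
    }

    // Queues the message only if the channel is currently writable.
    boolean trySend(long size) {
        if (!writable) {
            return false;
        }
        increment(size);
        return true;
    }

    boolean isWritable() { return writable; }
    long pendingBytes() { return pendingBytes; }
    int writabilityChanges() { return writabilityChanges; }

    public static void main(String[] args) {
        WatermarkSketch channel = new WatermarkSketch(32, 64);
        channel.trySend(65);
        System.out.println("writable after 65 bytes queued: " + channel.isWritable());
        channel.decrement(34);
        System.out.println("writable after draining to 31: " + channel.isWritable());
    }
}
```

The gap between the two marks gives hysteresis: the channel does not flip back to writable the moment it dips under the high mark, so the writability event does not fire for every single message.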
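With Netty's high/low water mark mechanism, business code that checks the Channel's writable state before writing can avoid sending more messages while the send queue is above the high water mark, which prevents backlog and even memory leaks. Using the high water mark sensibly in business code improves system reliability.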