The Kafka client version used for this source analysis is 0.10.0.1.
NetworkClient is a general-purpose network client implementation; both the Kafka producer and the consumer use the NetworkClient component to communicate with the brokers on the server side.
public class NetworkClient implements KafkaClient {
private static final Logger log = LoggerFactory.getLogger(NetworkClient.class);
/* the selector used to perform network i/o */
//network I/O: sending and receiving messages
private final Selectable selector;
......
}
Network I/O inside NetworkClient goes through the Selectable selector interface; the rest of this article focuses on Selector, the implementation of the Selectable interface.
The Selector class
The Selector class (in the org.apache.kafka.common.network package) wraps Java NIO underneath; with a single thread it can manage connect, read, and write operations across multiple network connections. Its core fields are as follows:
Core fields and their roles
//java.nio.channels.Selector, used to listen for network I/O events.
private final java.nio.channels.Selector nioSelector;
//Maps NodeId to KafkaChannel, representing the client's network connections to the individual Nodes.
//KafkaChannel is a further wrapper around SocketChannel. Its Send and NetworkReceive fields hold the data being written and read, both backed by ByteBuffer.
//TransportLayer wraps the SocketChannel and its SelectionKey; different TransportLayer subclasses exist for different network protocols, while exposing a uniform interface to KafkaChannel.
private final Map<String, KafkaChannel> channels;
//Records requests that have been fully sent
private final List<Send> completedSends;
//Records responses that have been fully received
private final List<NetworkReceive> completedReceives;
//Buffers messages read from the connections.
//After an OP_READ event has been fully processed, the requests in the stagedReceives collection are moved into the completedReceives collection.
private final Map<KafkaChannel, Deque<NetworkReceive>> stagedReceives;
//Records the SelectionKeys of connections that completed immediately when connect() returned; OP_CONNECT will never fire for these, so they are handled separately
private final Set<SelectionKey> immediatelyConnectedKeys;
//Records connections found to be disconnected during one poll
private final List<String> disconnected;
//Records connections newly established during one poll
private final List<String> connected;
//Records the Nodes to which sends have failed
private final List<String> failedSends;
//Builder used to create KafkaChannels: depending on the configuration it creates the appropriate TransportLayer subclass, then the KafkaChannel.
private final ChannelBuilder channelBuilder;
//A LinkedHashMap recording when each connection was last used; connections idle longer than connectionsMaxIdleNanos are closed based on it
private final Map<String, Long> lruConnections;
//Maximum idle time for a connection, in nanoseconds
private final long connectionsMaxIdleNanos;
//Maximum size of a received message
private final int maxReceiveSize;
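The lruConnections field relies on LinkedHashMap's access-order mode. As a minimal sketch (plain Java, not Kafka code; the class and method names here are made up for illustration), with accessOrder = true the first entry in iteration order is always the least recently used connection, which is exactly what the idle-connection reaper needs:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruSketch {
    // Mirrors how Selector's lruConnections is built: accessOrder = true
    // makes iteration order follow access recency, so the head entry is
    // always the least recently used connection.
    public static String leastRecentlyUsed() {
        Map<String, Long> lru = new LinkedHashMap<>(16, .75F, true);
        lru.put("node-1", 1L);
        lru.put("node-2", 2L);
        lru.put("node-3", 3L);
        lru.get("node-1"); // touching node-1 moves it to the tail
        return lru.keySet().iterator().next(); // head = least recently used
    }

    public static void main(String[] args) {
        System.out.println(leastRecentlyUsed()); // node-2
    }
}
```

Because node-1 was accessed last, node-2 becomes the head of the map and would be the first candidate for idle-connection closing.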
Next, let's look at the commonly used methods of the Selector class
Constructor
public Selector(int maxReceiveSize, long connectionMaxIdleMs, Metrics metrics, Time time, String metricGrpPrefix, Map<String, String> metricTags, boolean metricsPerConnection, ChannelBuilder channelBuilder) {
try {
this.nioSelector = java.nio.channels.Selector.open(); //open a new nioSelector
} catch (IOException e) {
throw new KafkaException(e);
}
this.maxReceiveSize = maxReceiveSize;
this.connectionsMaxIdleNanos = connectionMaxIdleMs * 1000 * 1000;
this.time = time;
this.metricGrpPrefix = metricGrpPrefix;
this.metricTags = metricTags;
this.channels = new HashMap<>();
this.completedSends = new ArrayList<>();
this.completedReceives = new ArrayList<>();
this.stagedReceives = new HashMap<>();
this.immediatelyConnectedKeys = new HashSet<>();
this.connected = new ArrayList<>();
this.disconnected = new ArrayList<>();
this.failedSends = new ArrayList<>();
this.sensors = new SelectorMetrics(metrics);
this.channelBuilder = channelBuilder;
// initial capacity and load factor are default, we set them explicitly because we want to set accessOrder = true
this.lruConnections = new LinkedHashMap<>(16, .75F, true);
currentTimeNanos = time.nanoseconds();
nextIdleCloseCheckTime = currentTimeNanos + connectionsMaxIdleNanos;
this.metricsPerConnection = metricsPerConnection;
}
The constructor mainly initializes nioSelector and the other fields.
The connect method
public void connect(String id, InetSocketAddress address, int sendBufferSize, int receiveBufferSize) throws IOException {
if (this.channels.containsKey(id)) //throw if a connection for this node id already exists
throw new IllegalStateException("There is already a connection for id " + id);
//Create the SocketChannel
SocketChannel socketChannel = SocketChannel.open();
socketChannel.configureBlocking(false); //configure non-blocking mode
Socket socket = socketChannel.socket();
socket.setKeepAlive(true); //enable TCP keepalive to keep the connection long-lived
if (sendBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
socket.setSendBufferSize(sendBufferSize); //set the SO_SNDBUF size
if (receiveBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
socket.setReceiveBufferSize(receiveBufferSize);//set the SO_RCVBUF size
socket.setTcpNoDelay(true); //disable Nagle's algorithm (TCP_NODELAY)
boolean connected;
try {
/*
* In non-blocking mode, SocketChannel.connect() only initiates a connection
* and may return before the connection is formally established; later,
* finishConnect() is used to confirm whether the connection
* has actually been established.
* */
connected = socketChannel.connect(address); //initiate the connection to the Kafka server
} catch (UnresolvedAddressException e) {
socketChannel.close();
throw new IOException("Can't resolve address: " + address, e);
} catch (IOException e) {
socketChannel.close();
throw e;
}
//Register the socketChannel with the nioSelector and watch for OP_CONNECT events
SelectionKey key = socketChannel.register(nioSelector, SelectionKey.OP_CONNECT);
//Create the KafkaChannel
KafkaChannel channel = channelBuilder.buildChannel(id, key, maxReceiveSize);
key.attach(channel); //attach the KafkaChannel to the key
this.channels.put(id, channel);//bind the NodeId to the KafkaChannel and track it in channels
if (connected) {
// OP_CONNECT won't trigger for immediately connected channels
log.debug("Immediately connected to node {}", channel.id());
immediatelyConnectedKeys.add(key);
key.interestOps(0);
}
}
Create a SocketChannel and configure it as non-blocking, enable keepalive on the associated Socket, initiate the connection to the remote address, and register the SocketChannel with the NIO Selector for the SelectionKey.OP_CONNECT event, obtaining a SelectionKey.
Create a KafkaChannel object and attach it to the key, so that when events later fire, the KafkaChannel can be retrieved from the key. The KafkaChannel keeps a reference to the current key and performs the actual reads and writes.
Because Java NIO connects asynchronously, connect() may return true when the connection completes immediately; OP_CONNECT will never fire for such a connection, so its key is added to immediatelyConnectedKeys to be handled later.
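The connect-then-finishConnect handshake can be sketched with plain Java NIO (the ConnectSketch class below is a hypothetical stand-in, not Kafka code). It connects to a local server on an ephemeral port, so connect() may return either true immediately or false until OP_CONNECT fires:

```java
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class ConnectSketch {
    // Sketch of the non-blocking connect pattern used by Selector.connect():
    // connect() only initiates the connection, and finishConnect() confirms
    // it once the selector reports OP_CONNECT.
    public static boolean connectAndFinish() throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
        Selector selector = Selector.open();
        SocketChannel client = SocketChannel.open();
        client.configureBlocking(false);
        boolean connected = client.connect(server.getLocalAddress());
        if (!connected) {
            SelectionKey key = client.register(selector, SelectionKey.OP_CONNECT);
            selector.select(1000);              // wait for OP_CONNECT to fire
            connected = client.finishConnect(); // confirm the connection
        }
        client.close(); server.close(); selector.close();
        return connected;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(connectAndFinish());
    }
}
```

The `if (!connected)` branch is the normal remote-broker path; the true branch corresponds to the immediatelyConnectedKeys case in the Kafka code above.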
The select method
private int select(long ms) throws IOException {
if (ms < 0L)
throw new IllegalArgumentException("timeout should be >= 0");
if (ms == 0L)
return this.nioSelector.selectNow();
else
return this.nioSelector.select(ms);
}
The select method simply delegates to the Java NIO Selector to fetch the ready events:
selectNow() never blocks; it returns immediately even when no events are ready
select(ms) blocks until an event occurs or the timeout expires
The poll method
public void poll(long timeout) throws IOException {
if (timeout < 0)
throw new IllegalArgumentException("timeout should be >= 0");
clear(); //discard all results from the previous poll()
if (hasStagedReceives() || !immediatelyConnectedKeys.isEmpty())
timeout = 0;
/* check ready keys */
long startSelect = time.nanoseconds();
//call nioSelector.select() and wait for I/O events to occur
int readyKeys = select(timeout);
long endSelect = time.nanoseconds();
currentTimeNanos = endSelect;
this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds()); //record how long select blocked
//events are ready, or the immediatelyConnectedKeys collection is non-empty
if (readyKeys > 0 || !immediatelyConnectedKeys.isEmpty()) {
//process the I/O events
pollSelectionKeys(this.nioSelector.selectedKeys(), false);
pollSelectionKeys(immediatelyConnectedKeys, true);
}
//move the stagedReceives into the completedReceives collection
addToCompletedReceives();
long endIo = time.nanoseconds();
this.sensors.ioTime.record(endIo - endSelect, time.milliseconds());
maybeCloseOldestConnection(); //close connections that have been idle too long
}
First, all results from the previous poll() are cleared, and the timeout argument determines the maximum time to block. The Java NIO Selector's select method is then called to obtain the number of ready SelectionKeys, readyKeys.
If events are ready, or the immediatelyConnectedKeys collection (SelectionKeys of freshly created connections that completed immediately and will never receive OP_CONNECT) is non-empty, pollSelectionKeys is called to process them.
The pollSelectionKeys method
pollSelectionKeys handles both the ready keys and the keys in the immediatelyConnectedKeys collection
private void pollSelectionKeys(Iterable<SelectionKey> selectionKeys, boolean isImmediatelyConnected) {
Iterator<SelectionKey> iterator = selectionKeys.iterator();
while (iterator.hasNext()) {
SelectionKey key = iterator.next();
iterator.remove();
//the KafkaChannel was attached to the key when the connection was created, precisely so it can be retrieved here
KafkaChannel channel = channel(key);
// register all per-connection metrics at once
sensors.maybeRegisterConnectionMetrics(channel.id());
lruConnections.put(channel.id(), currentTimeNanos); //update the LRU information
try {
/* complete any connections that have finished their handshake (either normally or immediately) */
//handle the case where connect() returned true or an OP_CONNECT event fired
if (isImmediatelyConnected || key.isConnectable()) {
//finishConnect first checks whether the socketChannel's connection is established;
//once it is, it cancels interest in OP_CONNECT and starts watching OP_READ
if (channel.finishConnect()) {
this.connected.add(channel.id());//add to the "connected" collection
this.sensors.connectionCreated.record();
} else
continue; //the connection is not complete yet, so skip further processing of this channel
}
/* if channel is not ready finish prepare */
//call KafkaChannel.prepare() to perform authentication
if (channel.isConnected() && !channel.ready())
channel.prepare();
/* if channel is ready read from any connections that have readable data */
if (channel.ready() && key.isReadable() && !hasStagedReceive(channel)) {
//OP_READ event handling
NetworkReceive networkReceive;
while ((networkReceive = channel.read()) != null)
/*
* If channel.read() above reads a complete NetworkReceive, it is added to stagedReceives.
* If a complete NetworkReceive cannot be read yet, read() returns null, and reading
* resumes on the next OP_READ event until a complete NetworkReceive is assembled.
* The messages read are recorded in stagedReceives.
* */
addToStagedReceives(channel, networkReceive);
}
/* if channel is ready write to any sockets that have space in their buffer and for which we have data */
if (channel.ready() && key.isWritable()) { //OP_WRITE event handling
//channel.write() below sends the KafkaChannel.send field; it returns null if the send is not yet complete.
//Once the send completes it returns the Send, which is added to the completedSends collection for later processing.
Send send = channel.write();
if (send != null) {
this.completedSends.add(send); //add to the completedSends collection
this.sensors.recordBytesSent(channel.id(), send.size());
}
}
/*
* completedSends and completedReceives hold the requests that have been sent and received on the Selector side;
* they are processed by the various handleCompleteXXX() methods after NetworkClient's poll() returns.
* */
/* cancel any defunct sockets */
if (!key.isValid()) {
close(channel);
this.disconnected.add(channel.id());
}
/*
* The connection's state is judged from the return value of isValid() and from whether an
* exception was thrown during processing; broken connections are collected into the
* disconnected collection and reconnected in later operations.
* */
} catch (Exception e) {
//if an exception is thrown, the connection is considered closed and the corresponding NodeId is added to the disconnected collection
String desc = channel.socketDescription();
if (e instanceof IOException)
log.debug("Connection with {} disconnected", desc, e);
else
log.warn("Unexpected error from {}; closing connection", desc, e);
close(channel);
this.disconnected.add(channel.id());
}
}
}
Iterate over the ready SelectionKeys, first removing the current key from the collection, then call the channel method to obtain the KafkaChannel associated with the key.
private KafkaChannel channel(SelectionKey key) {
return (KafkaChannel) key.attachment();
}
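The attach/attachment round trip can be sketched in isolation (a hypothetical AttachmentSketch class, not Kafka code): whatever object is attached at registration time is exactly what is retrieved later when an event fires, which is how Selector recovers the KafkaChannel from a key. A Pipe stands in for the socket here:

```java
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class AttachmentSketch {
    // Sketch of the attach/attachment pattern: the object attached to a
    // SelectionKey at registration time is recovered later via attachment().
    public static Object roundTrip() throws Exception {
        Selector selector = Selector.open();
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
        key.attach("my-channel");          // stand-in for a KafkaChannel
        Object attached = key.attachment(); // what channel(key) would return
        selector.close(); pipe.sink().close(); pipe.source().close();
        return attached;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());
    }
}
```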
Next, the OP_CONNECT, OP_READ, and OP_WRITE events are handled in turn.
Handling the OP_CONNECT event
KafkaChannel's finishConnect method is called to determine whether the connection has completed.
finishConnect delegates to the finishConnect of KafkaChannel's transport layer, TransportLayer. Below is the finishConnect method of one TransportLayer implementation, PlaintextTransportLayer:
public boolean finishConnect() throws IOException {
boolean connected = socketChannel.finishConnect();
if (connected)
key.interestOps(key.interestOps() & ~SelectionKey.OP_CONNECT | SelectionKey.OP_READ);
return connected;
}
KafkaChannel's transport layer, TransportLayer, wraps the SocketChannel. It first calls socketChannel.finishConnect() to determine whether the underlying Socket connection has completed; if so, it cancels interest in OP_CONNECT and starts watching OP_READ.
If the connection has completed, the KafkaChannel's Node Id is added to the connected collection.
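The bit expression in finishConnect above is worth unpacking; the sketch below (a hypothetical helper class, not Kafka code) shows that it clears the OP_CONNECT bit and sets the OP_READ bit in a single expression:

```java
import java.nio.channels.SelectionKey;

public class InterestOpsSketch {
    // Same bit manipulation as PlaintextTransportLayer.finishConnect():
    // & ~OP_CONNECT clears the connect bit, | OP_READ sets the read bit.
    // (& binds tighter than | in Java, so no parentheses are needed.)
    public static int afterFinishConnect(int currentOps) {
        return currentOps & ~SelectionKey.OP_CONNECT | SelectionKey.OP_READ;
    }

    public static void main(String[] args) {
        // Starting from OP_CONNECT (8), the result is OP_READ (1).
        System.out.println(afterFinishConnect(SelectionKey.OP_CONNECT));
    }
}
```

Any other interest bits (such as OP_WRITE) already present in currentOps are preserved, since only the OP_CONNECT bit is masked out.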
Handling the OP_READ event
Complete messages that have been read are written into stagedReceives.
Handling the OP_WRITE event
If a message is buffered and waiting to be sent, send it. The pending message is cached in the KafkaChannel's Send send field. Once the message has been fully sent, the completed Send is added to the completedSends collection.
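The reason a Send may need several OP_WRITE rounds is that a single non-blocking write() can drain only part of the buffer. The pattern behind it can be sketched as follows (a hypothetical WriteLoopSketch class using a Pipe instead of a real socket; in Kafka the loop is spread across successive poll() calls rather than a tight while):

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class WriteLoopSketch {
    // Sketch of the idea behind KafkaChannel.write(): a write() call may
    // send only part of the buffer, so the Send counts as completed only
    // once nothing remains in the buffer.
    public static int writeFully(byte[] payload) throws Exception {
        Pipe pipe = Pipe.open();
        ByteBuffer buf = ByteBuffer.wrap(payload);
        int total = 0;
        while (buf.hasRemaining())          // not done until the buffer drains
            total += pipe.sink().write(buf);
        pipe.sink().close(); pipe.source().close();
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writeFully(new byte[1024])); // 1024
    }
}
```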
The send method
The send method exposes message sending to callers
//no network I/O happens here
public void send(Send send) {
KafkaChannel channel = channelOrFail(send.destination());
try {
channel.setSend(send);
} catch (CancelledKeyException e) {
this.failedSends.add(send.destination());
close(channel);
}
}
The previously created RequestSend object is cached in the KafkaChannel's send field, and the connection starts watching OP_WRITE events.
The wakeup method
Wakes up a thread blocked on I/O
public void wakeup() {
this.nioSelector.wakeup();
}
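The effect of wakeup() can be sketched in isolation (a hypothetical WakeupSketch class, not Kafka code): a wakeup() issued before select() makes the next select() return immediately rather than blocking for the full timeout, which is how another thread can unblock the I/O thread:

```java
import java.nio.channels.Selector;

public class WakeupSketch {
    // Sketch: a pending wakeup() causes the next select() to return at once
    // instead of blocking for the full timeout.
    public static long selectAfterWakeup() throws Exception {
        Selector selector = Selector.open();
        selector.wakeup();                  // set the pending wakeup
        long start = System.nanoTime();
        selector.select(5000);              // returns immediately, not after 5s
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        selector.close();
        return elapsedMs;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(selectAfterWakeup() < 5000); // true
    }
}
```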
The remaining methods are not covered here; the above are the main methods in Selector that work with Java NIO. Note that the TransportLayer object inside KafkaChannel holds the SocketChannel socketChannel, and the read/write handling in the pollSelectionKeys method above ultimately goes through that socketChannel; interested readers can explore it on their own.