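Inner class | Purpose |
Call | Holds a request sent by a client |
Listener | Listens for incoming client connections. When the listener accepts a connection, it hands it to its inner class Listener.Reader, which reads the client's requests |
Responder | Sends RPC responses: once a request has been processed, the Responder writes the result back to the requesting client |
Connection | Represents a client connection; the actual logic for reading a client request lives in this class |
Handler | Processes requests: loops, blocking on callQueue to take Call objects, and handles each one |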
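Server is an abstract class with exactly one abstract method, call. Server provides the framework; the actual functionality is supplied by a concrete subclass, which does so by implementing call.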
/** Called for each call. */
public abstract Writable call(Class<?> protocol, Writable param, long receiveTime) throws IOException;
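Server.Call is similar to Client.Call. A Server.Call represents one request, and its id and param fields mean the same as in Client.Call. The differences are:
connection is the connection the Call arrived on. When the request has been processed, the result is sent back to the client over that same connection.
timestamp is the time the request arrived. If a request goes unprocessed for a long time, its connection is closed, which tells the client that something went wrong.
response is the result of processing the request: either a serialized Writable or a serialized exception.
Server.Connection maintains a socket connection from one client.
It handles version checking, reads requests and passes them to the request-handling threads, and receives the results and sends them back to the client.
Hadoop's Server uses Java NIO, so it does not need a dedicated thread per socket connection to read data from the socket.
A single thread, the Listener, accepts new connections. In the version shown here it hands each accepted socket to one of a small, fixed number of Reader threads, which read the data from the sockets.
Server.Handler is the request-handling thread; there are usually several of them.
A Handler's run method loops: it takes a Server.Call from the queue, invokes Server.call, collects and serializes the result, and then puts the result on the Responder's queue.
Once a request has been processed, the result has to be written back. Again thanks to NIO, one thread is enough for this; the logic lives in Responder.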
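The hand-off between the reading side, the Handler threads, and the Responder can be sketched with plain java.util.concurrent types. This is a simplified illustration under stated assumptions, not Hadoop's code: the class name HandlerPipelineSketch, the String payloads, and the upper-casing call() are made-up stand-ins, and real responses go through NIO rather than a second queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class HandlerPipelineSketch {
    // Hypothetical stand-in for Server.call(): the work done for one request.
    static String call(String param) {
        return param.toUpperCase();
    }

    // Pushes the requests through handlerCount handler threads and returns the sorted results.
    static List<String> process(List<String> requests, int handlerCount) {
        final BlockingQueue<String> callQueue = new LinkedBlockingQueue<>(100);  // bounded queue of pending calls
        final BlockingQueue<String> responseQueue = new LinkedBlockingQueue<>(); // drained by a single consumer
        Thread[] handlers = new Thread[handlerCount];
        for (int i = 0; i < handlerCount; i++) {
            handlers[i] = new Thread(() -> {
                try {
                    while (true) {
                        String param = callQueue.take();   // blocks until a call is queued
                        responseQueue.put(call(param));    // hand the result to the responder side
                    }
                } catch (InterruptedException e) {
                    // interruption is this sketch's shutdown signal
                }
            }, "handler-" + i);
            handlers[i].setDaemon(true);
            handlers[i].start();
        }
        List<String> responses = new ArrayList<>();
        try {
            for (String request : requests) {
                callQueue.put(request);                    // what the reading side does after parsing a request
            }
            for (int i = 0; i < requests.size(); i++) {
                responses.add(responseQueue.take());       // what the single responder would send back
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            for (Thread handler : handlers) {
                handler.interrupt();
            }
        }
        Collections.sort(responses);                       // completion order is nondeterministic
        return responses;
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList("a", "b", "c"), 2)); // prints [A, B, C]
    }
}
```

The results are sorted before being returned only because the order in which handlers finish is not deterministic.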
Call
/** A call queued for handling. */
private static class Call {
private int id; // the client's call id
private Writable param; // the parameter passed by the client's RPC call
private Connection connection;// connection to client; holding it on the server side identifies which client the call came from
private long timestamp; // the time received when response is null; the time served when response is not null
private ByteBuffer response; // the response for this call
public Call(int id, Writable param, Connection connection) {
this.id = id;
this.param = param;
this.connection = connection;
this.timestamp = System.currentTimeMillis();
this.response = null;
}
public void setResponse(ByteBuffer response) {
this.response = response;
}
}
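Server initialization and startup
ipc.Server is abstract and cannot be instantiated, so what gets instantiated at system startup is its concrete subclass, ipc.RPC.Server. In other words, starting the server means starting RPC.Server.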
public class NameNode implements ClientProtocol, DatanodeProtocol, NamenodeProtocol, FSConstants,
RefreshAuthorizationPolicyProtocol, RefreshUserMappingsProtocol {
/** RPC server */
private Server server;
/** RPC server for HDFS Services communication.
* BackupNode, Datanodes and all other services should be connecting to this server if it is configured. Clients should only go to NameNode#server */
private Server serviceRpcServer;
/** RPC server address */
private InetSocketAddress serverAddress = null;
/** RPC server for DN address */
protected InetSocketAddress serviceRPCAddress = null;
/** Initialize name-node. */
private void initialize(Configuration conf) throws IOException {
InetSocketAddress socAddr = NameNode.getAddress(conf);
// create rpc server
InetSocketAddress dnSocketAddr = getServiceRpcServerAddress(conf);
if (dnSocketAddr != null) {
this.serviceRpcServer = RPC.getServer(this, dnSocketAddr.getHostName(), dnSocketAddr.getPort(),
serviceHandlerCount, false, conf, namesystem.getDelegationTokenSecretManager());
this.serviceRPCAddress = this.serviceRpcServer.getListenerAddress();
setRpcServiceServerAddress(conf);
}
this.server = RPC.getServer(this, socAddr.getHostName(), socAddr.getPort(), handlerCount, false, conf,
namesystem.getDelegationTokenSecretManager());
// The rpc-server port can be ephemeral... ensure we have the correct info
this.serverAddress = this.server.getListenerAddress();
startHttpServer(conf);
this.server.start(); //start RPC server
if (serviceRpcServer != null) {
serviceRpcServer.start();
}
startTrashEmptier(conf);
}
}
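The comment "The rpc-server port can be ephemeral" explains why initialize() asks the server for its listener address after construction: if the configured port is 0, the operating system picks a free port at bind time. A minimal sketch of that behavior with plain NIO follows; the class and method names are illustrative and not part of Hadoop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

public class EphemeralPortSketch {
    // Binds to port 0 on the loopback address and returns the port the OS actually assigned.
    static int bindAndReportPort() {
        try (ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.socket().bind(new InetSocketAddress("127.0.0.1", 0)); // 0 means "any free port"
            return channel.socket().getLocalPort();                       // the real port is known only after bind
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bound to port " + bindAndReportPort());
    }
}
```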
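RPC.getServer()
/** Construct a server for a protocol implementation instance listening on a port and address, with a secret manager.
* The caller, NameNode, implements a set of protocol interfaces and passes itself in as the instance.
* Taking an HDFS file read as an example: the client sends the RPC call getBlockLocations() to the NameNode server, and the NameNode performs the actual method invocation.
*/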
public static Server getServer(final Object instance, final String bindAddress, final int port, final int numHandlers,
final boolean verbose, Configuration conf, SecretManager<? extends TokenIdentifier> secretManager) {
return new Server(instance, conf, bindAddress, port, numHandlers, verbose, secretManager);
}
/** An RPC Server. */
public static class Server extends org.apache.hadoop.ipc.Server {
/** Construct an RPC server.
* @param instance the instance whose methods will be called
* @param conf the configuration to use
* @param bindAddress the address to bind on to listen for connection
* @param port the port to listen for connections on
* @param numHandlers the number of method handler threads to run
* @param verbose whether each call should be logged
*/
public Server(Object instance, Configuration conf, String bindAddress, int port, int numHandlers, boolean verbose,
SecretManager<? extends TokenIdentifier> secretManager) {
super(bindAddress, port, Invocation.class, numHandlers, conf,
classNameBase(instance.getClass().getName()), secretManager); // call the constructor of the abstract parent class ipc.Server
this.instance = instance;
this.verbose = verbose;
}
}
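The constructor stores instance so that the server can later invoke the requested method on it. That step is not shown in this excerpt; the following is a hedged sketch of how such a dispatch can be done with java.lang.reflect. GreeterProtocol, Greeter, and dispatch are hypothetical names, and a real implementation would also deal with Writable wrapping, logging, and error propagation, all omitted here.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class ReflectiveDispatchSketch {
    // Hypothetical protocol interface, standing in for something like ClientProtocol.
    public interface GreeterProtocol {
        String greet(String name);
    }

    // Hypothetical implementation, standing in for the NameNode instance.
    public static class Greeter implements GreeterProtocol {
        public String greet(String name) {
            return "hello, " + name;
        }
    }

    // Looks the method up on the protocol interface and invokes it on the instance.
    static Object dispatch(Class<?> protocol, Object instance, String methodName,
                           Class<?>[] parameterTypes, Object[] parameters) {
        try {
            Method method = protocol.getMethod(methodName, parameterTypes);
            return method.invoke(instance, parameters);
        } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("call failed: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        Object result = dispatch(GreeterProtocol.class, new Greeter(), "greet",
                new Class<?>[] {String.class}, new Object[] {"hdfs"});
        System.out.println(result); // prints hello, hdfs
    }
}
```

Looking the method up on the protocol interface rather than on the instance's class restricts callers to methods the protocol actually declares.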
ipc.Server constructor
/** An abstract IPC service. IPC calls take a single Writable as a parameter, and return a Writable as their value.
* A service runs on a port and is defined by a parameter class and a value class.
*/
public abstract class Server {
private String bindAddress;
private int port; // port we listen on
private int handlerCount; // number of handler threads
private int readThreads; // number of read threads
private Class<? extends Writable> paramClass; // class of call parameters
private int maxIdleTime; // the maximum idle time after which a client may be disconnected
private int thresholdIdleConnections; // the number of idle connections after which we will start cleaning up idle connections
int maxConnectionsToNuke; // the max number of connections to nuke during a cleanup
protected RpcInstrumentation rpcMetrics;
private Configuration conf;
private SecretManager<TokenIdentifier> secretManager;
private int maxQueueSize;
private final int maxRespSize;
private int socketSendBufferSize;
private final boolean tcpNoDelay; // if T then disable Nagle's Algorithm
volatile private boolean running = true;// true while server runs
private BlockingQueue<Call> callQueue;// queued calls
private List<Connection> connectionList = Collections.synchronizedList(new LinkedList<Connection>()); //maintain a list of client connections
private Listener listener = null;// server-side listener
private Responder responder = null;// writes responses back to clients
private int numConnections = 0;// number of connected clients
private Handler[] handlers = null;// request handler threads
/** Constructs a server listening on the named port and address.
* Parameters passed must be of the named class.
* The handlerCount determines the number of handler threads that will be used to process calls. */
protected Server(String bindAddress, int port, Class<? extends Writable> paramClass, int handlerCount,
Configuration conf, String serverName, SecretManager<? extends TokenIdentifier> secretManager) {
this.bindAddress = bindAddress;
this.conf = conf;
this.port = port;
this.paramClass = paramClass;
this.handlerCount = handlerCount;
this.socketSendBufferSize = 0;
this.maxQueueSize = handlerCount * conf.getInt(IPC_SERVER_HANDLER_QUEUE_SIZE_KEY, IPC_SERVER_HANDLER_QUEUE_SIZE_DEFAULT);
this.maxRespSize = conf.getInt(IPC_SERVER_RPC_MAX_RESPONSE_SIZE_KEY, IPC_SERVER_RPC_MAX_RESPONSE_SIZE_DEFAULT);
this.readThreads = conf.getInt(IPC_SERVER_RPC_READ_THREADS_KEY, IPC_SERVER_RPC_READ_THREADS_DEFAULT);
this.callQueue = new LinkedBlockingQueue<Call>(maxQueueSize);
this.maxIdleTime = 2*conf.getInt("ipc.client.connection.maxidletime", 1000);
this.maxConnectionsToNuke = conf.getInt("ipc.client.kill.max", 10);
this.thresholdIdleConnections = conf.getInt("ipc.client.idlethreshold", 4000);
this.secretManager = (SecretManager<TokenIdentifier>) secretManager;
this.authorize = conf.getBoolean(HADOOP_SECURITY_AUTHORIZATION, false);
this.isSecurityEnabled = UserGroupInformation.isSecurityEnabled();
// Start the listener here and let it bind to the port
listener = new Listener();
this.port = listener.getAddress().getPort();
this.rpcMetrics = RpcInstrumentation.create(serverName, this.port);
this.tcpNoDelay = conf.getBoolean("ipc.server.tcpnodelay", false);
responder = new Responder(); // Create the responder here
if (isSecurityEnabled) {
SaslRpcServer.init(conf);
}
}
private void closeConnection(Connection connection) {
synchronized (connectionList) {
if (connectionList.remove(connection))
numConnections--;
}
try {
connection.close();
} catch (IOException e) {
}
}
// After obtaining the Server, NameNode calls server.start() to start it. responder, listener, and each handler are all threads, and start() is called on each of them
/** Starts the service. Must be called before any calls will be handled. */
public synchronized void start() {
responder.start();
listener.start();
handlers = new Handler[handlerCount];
for (int i = 0; i < handlerCount; i++) {
handlers[i] = new Handler(i);
handlers[i].start();
}
}
}
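The constructor sizes callQueue as handlerCount times a per-handler queue size and creates it as a bounded LinkedBlockingQueue. The sketch below shows what that bound means in practice. The numbers 3 and 2 are arbitrary illustration values, not Hadoop defaults, and the class name is made up.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class CallQueueSizingSketch {
    // Builds a queue of capacity handlerCount * queueSizePerHandler and counts accepted offers.
    static int acceptedOffers(int handlerCount, int queueSizePerHandler, int attempts) {
        int maxQueueSize = handlerCount * queueSizePerHandler;
        BlockingQueue<Integer> callQueue = new LinkedBlockingQueue<>(maxQueueSize);
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            if (callQueue.offer(i)) { // offer() returns false instead of blocking when the queue is full
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(acceptedOffers(3, 2, 10)); // prints 6
    }
}
```

With put() instead of offer(), the producer would block on a full queue until a consumer takes an element, which is how a bounded queue applies back-pressure.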
ipc.Server.Listener
On the client side, the low-level communication uses blocking I/O directly; on the server side, a Listener listens for client connections.
private static final ThreadLocal<Server> SERVER = new ThreadLocal<Server>();
/** Returns the server instance called under or null. May be called under #call(Writable, long) implementations,
* and under Writable methods of parameters and return values. Permits applications to access the server context.*/
public static Server get() {
return SERVER.get();
}
// maintain a list of client connections; Connection here is Server.Connection
private List<Connection> connectionList = Collections.synchronizedList(new LinkedList<Connection>());
/** Listens on the socket. Creates jobs for the handler threads */
private class Listener extends Thread {
private ServerSocketChannel acceptChannel = null; //the accept channel
private Selector selector = null; //the selector that we use for the server
private Reader[] readers = null;
private int currentReader = 0;
private InetSocketAddress address; //the address we bind at
private Random rand = new Random();
private long lastCleanupRunTime = 0; //the last time when a cleanup connection (for idle connections) ran
private long cleanupInterval = 10000; //the minimum interval between two cleanup runs
private int backlogLength = conf.getInt("ipc.server.listen.queue.size", 128);
private ExecutorService readPool; // thread pool that runs the Reader tasks
public Listener() throws IOException {
address = new InetSocketAddress(bindAddress, port);
acceptChannel = ServerSocketChannel.open(); // Create a new server socket
acceptChannel.configureBlocking(false); // and set to non blocking mode
bind(acceptChannel.socket(), address, backlogLength); // Bind the server socket to the local host and port
port = acceptChannel.socket().getLocalPort(); // Could be an ephemeral port
selector= Selector.open(); // create a selector for the listener
readers = new Reader[readThreads];// array of Reader tasks
readPool = Executors.newFixedThreadPool(readThreads);// run several reader threads so that responses are not delayed when many requests arrive
for (int i = 0; i < readThreads; i++) {
Selector readSelector = Selector.open();// each reader gets its own Selector
Reader reader = new Reader(readSelector);
readers[i] = reader;
readPool.execute(reader);
}
acceptChannel.register(selector, SelectionKey.OP_ACCEPT); // ① Register accepts on the server socket with the selector
this.setName("IPC Server listener on " + port);
this.setDaemon(true);
}
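// Once the Listener thread has been started with listener.start(), the server keeps waiting for client connections
public void run() {
SERVER.set(Server.this); // store the current Server in the ThreadLocal so that code running on this thread can retrieve it via Server.get()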
while (running) {
SelectionKey key = null;
try {
selector.select();
Iterator<SelectionKey> iter = selector.selectedKeys().iterator();
while (iter.hasNext()) {
key = iter.next();
iter.remove();
if (key.isValid()) {
if (key.isAcceptable())
doAccept(key); // ② accept the client connection
}
key = null;
}
} catch (Exception e) {
closeCurrentConnection(key, e);
}
cleanupConnections(false);
}
// The listener has stopped listening for client connections: close the channel, the selector, and all connections
synchronized (this) {
acceptChannel.close();
selector.close();
selector= null;
acceptChannel= null;
while (!connectionList.isEmpty()) { // clean up all connections
closeConnection(connectionList.remove(0));
}
}
}
void doAccept(SelectionKey key) throws IOException, OutOfMemoryError { //②
Connection c = null;
ServerSocketChannel server = (ServerSocketChannel) key.channel();
SocketChannel channel;
while ((channel = server.accept()) != null) {// server.accept() establishes the connection
channel.configureBlocking(false);
channel.socket().setTcpNoDelay(tcpNoDelay);
Reader reader = getReader(); // pick a reader from the readers array
try {
reader.startAdd();// wake up readSelector and set adding to true
SelectionKey readKey = reader.registerChannel(channel); // ③ register interest in read events
c = new Connection(readKey, channel, System.currentTimeMillis());// create a Connection object
readKey.attach(c);// attach the Connection to readKey
synchronized (connectionList) {
connectionList.add(numConnections, c);
numConnections++;
}
} finally {
reader.finishAdd(); // set adding to false and notify() the reader, which is parked in wait() for as long as adding is true
} // once woken, the reader resumes its loop and calls doRead() for keys that become readable
}
}
void doRead(SelectionKey key) throws InterruptedException { //④
int count = 0;
Connection c = (Connection)key.attachment();
if (c == null) {
return;
}
c.setLastContact(System.currentTimeMillis());
try {
count = c.readAndProcess();
} catch (InterruptedException ieo) {
throw ieo;
} catch (Exception e) {
count = -1; //so that the (count < 0) block is executed
}
if (count < 0) {
closeConnection(c);
c = null;
} else {
c.setLastContact(System.currentTimeMillis());
}
}
synchronized void doStop() {
if (selector != null) {
selector.wakeup();
Thread.yield();
}
if (acceptChannel != null) {
try {
acceptChannel.socket().close();
} catch (IOException e) {
LOG.info(getName() + ":Exception in closing listener socket. " + e);
}
}
readPool.shutdown();
}
// The method that will return the next reader to work with
// Simplistic implementation of round robin for now
Reader getReader() {
currentReader = (currentReader + 1) % readers.length;
return readers[currentReader];
}
}
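getReader() is a plain round-robin over the readers array. Because it advances the index before returning, the first connection goes to index 1 rather than index 0. The sketch below reproduces that arithmetic with strings standing in for Reader objects; the class name and the helper firstPicks are illustrative only.

```java
public class RoundRobinSketch {
    private final String[] readers;
    private int currentReader = 0;

    RoundRobinSketch(String... readers) {
        this.readers = readers;
    }

    // Same arithmetic as getReader(): advance first, then return.
    String getReader() {
        currentReader = (currentReader + 1) % readers.length;
        return readers[currentReader];
    }

    // Returns the first n picks joined by commas.
    static String firstPicks(int n, String... readers) {
        RoundRobinSketch sketch = new RoundRobinSketch(readers);
        StringBuilder picks = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                picks.append(',');
            }
            picks.append(sketch.getReader());
        }
        return picks.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstPicks(4, "r0", "r1", "r2")); // prints r1,r2,r0,r1
    }
}
```

The unsynchronized counter is safe in the Listener because getReader() is only called from doAccept(), which runs on the single Listener thread.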
Listener.Reader
private class Reader implements Runnable {
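private volatile boolean adding = false; // true while a channel is being registered with this reader; the reader then waits, re-checking at least once a second
private Selector readSelector = null; // this reader's own Selector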
Reader(Selector readSelector) {
this.readSelector = readSelector;
}
public void run() {
synchronized (this) {
while (running) {
SelectionKey key = null;
readSelector.select();
while (adding) {
this.wait(1000);
}
Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
while (iter.hasNext()) {
key = iter.next();
iter.remove();
if (key.isValid()) {
if (key.isReadable()) {
doRead(key); //④
}
}
key = null;
}
}
}
}
/**
* This gets reader into the state that waits for the new channel to be registered with readSelector.
* If it was waiting in select() the thread will be woken up, otherwise whenever select() is called
* it will return even if there is nothing to read and wait in while(adding) for finishAdd call
*/
public void startAdd() {
adding = true;
readSelector.wakeup();
}
public synchronized SelectionKey registerChannel(SocketChannel channel) {
return channel.register(readSelector, SelectionKey.OP_READ); //③
}
public synchronized void finishAdd() {
adding = false;
this.notify();
}
}
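NIO communication flow
① When the server is initialized it creates the listener. The listener's constructor creates the ServerSocketChannel, a selector, and several reader threads, and registers the OP_ACCEPT operation on the server channel:
acceptChannel.register(selector, SelectionKey.OP_ACCEPT);
Starting the server calls listener.start(); listener is a thread, so its run() method is executed.
The listener keeps listening for clients: its Selector is polled for events of interest (here, the accept event registered above).
② When a client connects to the server, the Selector picks up the event. Because the ServerSocketChannel registered interest in OP_ACCEPT, the server accepts the client's connection request.
doAccept(SelectionKey key)
Once the client has connected and the server has accepted the connection, the listener picks a reader from the reader pool and delegates the connection to it instead of handling it itself.
Accepting yields a SocketChannel, not a ServerSocketChannel. (The ServerSocketChannel is created only once in the whole exchange, when the server starts.)
SocketChannel channel = server.accept();
③ Register interest in OP_READ on the newly created SocketChannel. The selector that receives read events is no longer the listener's; it is the reader's selector.
SelectionKey readKey = reader.registerChannel(channel);
channel.register(readSelector, SelectionKey.OP_READ); // register interest in OP_READ with readSelector; the reader's selector now polls for data written by the client
At the same time a Connection is built from (readKey, SocketChannel, current time) and attached to readKey.
This Connection object represents the connection between client and server. Once the connection is established, subsequent writes from the client are handled through the same Connection.
Connection c = new Connection(readKey, channel, System.currentTimeMillis());
readKey.attach(c); // to keep an object around during communication, attach it to the SelectionKey
Once the read operation is registered, the listener's reader thread can read the data sent by the client.
④ The client starts writing data to the server, and the Reader's selector picks up the event.
Because the reader registered interest in OP_READ, it can read what the client wrote.
doRead(SelectionKey key)
The read handler obtains the Connection from the SelectionKey; it is the very Connection that was attached to readKey when the connection was established.
The actual reading of client data happens in that Connection's readAndProcess method:
connection.readAndProcess();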
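The four steps can be reproduced in a few lines of plain NIO. The sketch below runs everything on one thread to stay short, which differs from the server described here, where accepting and reading happen on different threads. The class name is made up, and a ByteBuffer is attached to the key where the server attaches a Connection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class AcceptThenReadSketch {
    // Sends message from a client to a server on the loopback interface and returns what the server read.
    static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try (ServerSocketChannel acceptChannel = ServerSocketChannel.open();
             Selector acceptSelector = Selector.open();
             Selector readSelector = Selector.open()) {
            // (1) Non-blocking server channel with OP_ACCEPT registered on the accept selector.
            acceptChannel.configureBlocking(false);
            acceptChannel.socket().bind(new InetSocketAddress("127.0.0.1", 0));
            acceptChannel.register(acceptSelector, SelectionKey.OP_ACCEPT);
            int port = acceptChannel.socket().getLocalPort();

            try (SocketChannel client = SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
                client.write(ByteBuffer.wrap(payload));

                // (2) The accept selector reports the pending connection; accept() yields a SocketChannel.
                acceptSelector.select();
                acceptSelector.selectedKeys().clear();
                try (SocketChannel channel = acceptChannel.accept()) {
                    channel.configureBlocking(false);

                    // (3) Register OP_READ on a separate selector and attach per-connection state.
                    SelectionKey readKey = channel.register(readSelector, SelectionKey.OP_READ);
                    readKey.attach(ByteBuffer.allocate(payload.length));

                    // (4) Read whenever the read selector reports data, using the attachment.
                    ByteBuffer buffer = (ByteBuffer) readKey.attachment();
                    while (buffer.hasRemaining()) {
                        readSelector.select();
                        readSelector.selectedKeys().clear();
                        if (channel.read(buffer) < 0) {
                            break; // client closed before sending everything
                        }
                    }
                    return new String(buffer.array(), 0, buffer.position(), StandardCharsets.UTF_8);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping")); // prints ping
    }
}
```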
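Threading model of Listener.Reader
How Listener.doAccept() accepts a connection:
Pick a reader from the readers array
reader.startAdd(): wakes up readSelector and sets adding to true --> the reader thread, which watches for client data, sees adding=true (a channel is being added) and waits, for up to one second at a time
Register interest in read events
Create a Connection object
reader.finishAdd(): sets adding to false and calls notify() to wake the reader parked in wait()
Registering the read interest is sandwiched between setting the Reader's adding flag and calling notify(). This ensures that the reader thread only resumes selecting after the read interest has been registered. It also keeps the reader out of select() during the registration, which matters because on the JDKs of that era register() could block while another thread was inside select() on the same selector.
The NIO event model uses selectors to poll for events of interest. Whenever an operation of interest occurs, the selector picks it up for processing; if none occurs, nothing happens.
So the moments at which the server accepts a client connection and reads a client's data cannot be predicted in advance.
Likewise, there is no fixed order in which the selector that watches for connections and the selectors of the several readers pick up their events.
But reading a client's data must happen after that client has connected to the server.
If the client has not connected, no OP_READ interest has been registered with a reader, because OP_READ is registered inside doAccept(), when the connection is accepted.
Each Reader started during Listener initialization gets its own new selector. A Reader's adding field defaults to false:
Selector readSelector = Selector.open(); // each reader gets its own Selector
Reader reader = new Reader(readSelector);
At initialization adding=false, so run() does not execute this.wait(1000); but since no client connection has registered OP_READ yet, the selector has nothing to pick up.
When a client connects and the server accepts it, doAccept() registers OP_READ with the reader's selector in three steps:
1. Before: set adding=true and wake up the reader's selector. At this point the reader's selector cannot pick up a read event, because OP_READ has not been registered yet.
So when the reader's run() sees adding=true, it knows the OP_READ interest on the SocketChannel is not registered yet, and it re-checks at most one second later.
2. Register OP_READ on the newly accepted SocketChannel.
3. After: set adding=false and notify the Reader that it need not wait any longer. run() sees adding=false, and the selector goes back to polling for data written by the client.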
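The three-step handshake can be tried out in isolation. The following is a trimmed-down sketch, not the Hadoop class: Reader keeps the startAdd / registerChannel / finishAdd trio from the listing, while the received queue, the stop() method, the blocking accept, and the ASCII-only payload are simplifications introduced for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class ReaderHandshakeSketch {
    static class Reader implements Runnable {
        private volatile boolean adding = false;
        private volatile boolean running = true;
        private final Selector readSelector;
        final BlockingQueue<String> received = new LinkedBlockingQueue<>(); // hypothetical: lets the demo observe reads
        Reader(Selector readSelector) {
            this.readSelector = readSelector;
        }
        public void run() {
            synchronized (this) {
                while (running) {
                    try {
                        readSelector.select();
                        while (adding) {
                            this.wait(1000); // releases the monitor so registerChannel() can proceed
                        }
                        Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
                        while (iter.hasNext()) {
                            SelectionKey key = iter.next();
                            iter.remove();
                            if (key.isValid() && key.isReadable()) {
                                ByteBuffer buffer = ByteBuffer.allocate(64);
                                int count = ((SocketChannel) key.channel()).read(buffer);
                                if (count > 0) {
                                    received.add(new String(buffer.array(), 0, count, StandardCharsets.US_ASCII));
                                } else if (count < 0) {
                                    key.channel().close();
                                }
                            }
                        }
                    } catch (IOException | InterruptedException e) {
                        running = false; // the sketch simply stops on any failure
                    }
                }
            }
        }
        void startAdd() { adding = true; readSelector.wakeup(); }
        synchronized SelectionKey registerChannel(SocketChannel channel) throws IOException {
            return channel.register(readSelector, SelectionKey.OP_READ);
        }
        synchronized void finishAdd() { adding = false; this.notify(); }
        void stop() { running = false; readSelector.wakeup(); }
    }

    // Sends an ASCII message through the handshake and returns what the reader thread received.
    static String demo(String message) {
        try (ServerSocketChannel server = ServerSocketChannel.open();
             Selector readSelector = Selector.open()) {
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
            Reader reader = new Reader(readSelector);
            Thread readerThread = new Thread(reader, "reader");
            readerThread.setDaemon(true);
            readerThread.start();
            try (SocketChannel client = SocketChannel.open(
                     new InetSocketAddress("127.0.0.1", server.socket().getLocalPort()));
                 SocketChannel channel = server.accept()) { // blocking accept keeps the sketch short
                channel.configureBlocking(false);
                reader.startAdd();                       // 1. before: flag + wakeup
                try {
                    reader.registerChannel(channel);     // 2. register OP_READ
                } finally {
                    reader.finishAdd();                  // 3. after: clear flag + notify
                }
                client.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.US_ASCII)));
                StringBuilder result = new StringBuilder();
                while (result.length() < message.length()) {
                    String chunk = reader.received.poll(5, TimeUnit.SECONDS);
                    if (chunk == null) break;            // give up rather than hang
                    result.append(chunk);
                }
                return result.toString();
            } finally {
                reader.stop();
                readerThread.join(2000);
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) { System.out.println(demo("ping")); }
}
```

Note that run() holds the Reader's monitor for its whole loop, so the synchronized registerChannel() can only get in while the reader is parked in wait(); startAdd() is deliberately not synchronized so that it can flip the flag and wake the selector at any time.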
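Selectors of the Listener and the Readers
The Listener has a single selector. The Listener has several Readers, each with its own selector.
The Listener's selector watches for client connections. When a client connects, the Listener picks a Reader and registers the read operation with that Reader's selector.
The actual reading is thus handed over to the Reader. Because there are several Readers, when several clients connect and send data, several Readers can read at the same time.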