Thrift Java Servers Compared (Translation)

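This post is my translation of the article Thrift Java Servers Compared. For ease of reading, the original text is reproduced here. In addition, so that the figures stay viewable if the original link ever goes dead, I have saved the original article's images to this site's server. I do not know whether GitHub or the original author permits this, but my only purpose in translating the article is to spread knowledge. My sincere thanks to the original author and to GitHub for sharing such a good article.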
       
       
       
Thrift Java Servers Compared
This article covers only the Java servers. See this page if you are interested in the C++ servers.

Thrift is a cross-language serialization/RPC framework with three major components: protocol, transport, and server. The protocol defines how messages are serialized. The transport defines how messages are communicated between client and server. The server receives serialized messages from the transport, deserializes them according to the protocol, invokes user-defined message handlers, then serializes the handlers' responses and writes them back to the transport. Thrift's modular architecture allows it to offer a choice of servers. Here is the list of servers available for Java:
· TSimpleServer
· TNonblockingServer
· THsHaServer
· TThreadedSelectorServer
· TThreadPoolServer
Having choices is great, but which server is right for you? In this article, I'll describe the differences among these servers and show benchmark results to illustrate their performance characteristics (the details of the benchmark are explained in Appendix B). Let's start with the simplest one: TSimpleServer.
Source: http://www.codelast.com/

TSimpleServer
TSimpleServer accepts a connection, processes requests from that connection until the client closes it, and then goes back to accept a new connection. Since everything is done in a single thread with blocking I/O, it can serve only one client connection at a time, and all other clients have to wait until they are accepted. TSimpleServer is mainly used for testing. Don't use it in production!
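
The single-threaded blocking model described above can be sketched with plain java.net sockets. This is a minimal illustration under stated assumptions, not Thrift's TSimpleServer code: the class BlockingEchoSketch, its line-based echo protocol, and the helper names are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration of the single-threaded blocking model; not Thrift code.
class BlockingEchoSketch {
    // Serves `connections` clients strictly one after another on the calling thread.
    static void serve(ServerSocket listener, int connections) throws IOException {
        for (int i = 0; i < connections; i++) {
            try (Socket client = listener.accept(); // blocks until a client connects
                 BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                String line;
                // Stays with this one client until it closes the connection.
                while ((line = in.readLine()) != null) {
                    out.println("echo: " + line);
                }
            }
        }
    }

    // Opens a connection, sends one line, and returns the server's reply.
    static String call(int port, String message) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            out.println(message);
            return in.readLine();
        }
    }

    // Runs the server on a background thread and makes two sequential calls.
    static String[] demo() {
        try (ServerSocket listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread server = new Thread(() -> {
                try {
                    serve(listener, 2);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            server.start();
            String first = call(listener.getLocalPort(), "first");
            String second = call(listener.getLocalPort(), "second");
            server.join();
            return new String[] {first, second};
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String reply : demo()) {
            System.out.println(reply);
        }
    }
}
```

In this sketch the second client is only accepted after the first one has closed its socket, which is the behavior that makes this model unsuitable for production.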

TNonblockingServer vs. THsHaServer
TNonblockingServer solves TSimpleServer's problem of one client blocking all the other clients by using non-blocking I/O. It uses java.nio.channels.Selector, which allows you to block on multiple connections instead of a single connection by calling select(). The select() call returns when one or more connections are ready to be accepted, read, or written. TNonblockingServer handles each of those connections by accepting it, reading data from it, or writing data to it, and then calls select() again to wait for the next available connections. This way, multiple clients can be served without one client starving the others.
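
The select() loop described above looks roughly like the following when written directly against java.nio. This is a sketch under stated assumptions rather than TNonblockingServer's actual implementation: SelectorEchoSketch and its methods are hypothetical names, the server simply echoes bytes back, and for brevity it writes replies in a tight loop instead of registering for write readiness as a real server would.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical illustration of a single-threaded select() loop; not Thrift code.
class SelectorEchoSketch {
    // Echoes bytes for all connected clients until the selector is closed.
    static void serve(Selector selector, ServerSocketChannel listener) throws IOException {
        listener.configureBlocking(false);
        listener.register(selector, SelectionKey.OP_ACCEPT);
        ByteBuffer buf = ByteBuffer.allocate(1024);
        try {
            while (true) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = listener.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buf.clear();
                        if (client.read(buf) < 0) { // client closed the connection
                            client.close();
                            continue;
                        }
                        buf.flip();
                        while (buf.hasRemaining()) { // simplification: no OP_WRITE handling
                            client.write(buf);
                        }
                    }
                }
            }
        } catch (ClosedSelectorException | CancelledKeyException stop) {
            // The selector was closed by another thread: leave the loop.
        }
    }

    // Sends a message on a blocking client channel and reads back the same number of bytes.
    static String roundTrip(SocketChannel client, String message) throws IOException {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        client.write(ByteBuffer.wrap(bytes));
        ByteBuffer reply = ByteBuffer.allocate(bytes.length);
        while (reply.hasRemaining() && client.read(reply) >= 0) {
            // keep reading until the full echo has arrived
        }
        return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
    }

    // Keeps two clients connected at once; the idle one does not block the other.
    static String[] demo() {
        try (Selector selector = Selector.open();
             ServerSocketChannel listener = ServerSocketChannel.open()) {
            listener.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            Thread server = new Thread(() -> {
                try {
                    serve(selector, listener);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            server.start();
            try (SocketChannel a = SocketChannel.open(listener.getLocalAddress());
                 SocketChannel b = SocketChannel.open(listener.getLocalAddress())) {
                String fromB = roundTrip(b, "second"); // served while a is still open and idle
                String fromA = roundTrip(a, "first");
                return new String[] {fromA, fromB};
            } finally {
                selector.close();
                server.join();
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling SelectorEchoSketch.demo() keeps client a connected and idle while client b gets its reply, which a single blocking accept-and-read loop could not do.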

There is a catch, however. Messages are processed by the same thread that calls select(). Let's say there are 10 clients, and each message takes 100 ms to process. What would the latency and throughput be? While one message is being processed, the other 9 clients are waiting to be selected, so it takes 1 second for a client to get its response back from the server, and throughput will be 10 requests/second. Wouldn't it be great if multiple messages could be processed simultaneously?

This is where THsHaServer (Half-Sync/Half-Async server) comes into the picture. It uses a single thread for network I/O and a separate pool of worker threads for message processing. This way, a message gets processed immediately if there is an idle worker thread, and multiple messages can be processed concurrently. Using the example above, the latency is now 100 ms and throughput is 100 requests/second.
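
The arithmetic in the 10-client example can be written out as a small model. This is only a back-of-the-envelope sketch: it assumes every client always has exactly one request outstanding, that network I/O costs nothing, and that each request takes a fixed processing time. The class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope model of the 10-client example; not a benchmark.
class ServerLatencyModel {
    // Time until the last of `clients` simultaneous requests is answered,
    // when `workers` threads each process one request at a time.
    static long worstCaseLatencyMillis(int clients, int workers, long serviceMillis) {
        long rounds = (clients + workers - 1) / workers; // ceil(clients / workers)
        return rounds * serviceMillis;
    }

    // Requests completed per second once the server is saturated.
    static double throughputPerSecond(int clients, int workers, long serviceMillis) {
        return Math.min(clients, workers) * 1000.0 / serviceMillis;
    }

    public static void main(String[] args) {
        // One processing thread (the TNonblockingServer case in the text).
        System.out.println(worstCaseLatencyMillis(10, 1, 100) + " ms, "
                + throughputPerSecond(10, 1, 100) + " req/s");
        // Ten worker threads (the THsHaServer case in the text).
        System.out.println(worstCaseLatencyMillis(10, 10, 100) + " ms, "
                + throughputPerSecond(10, 10, 100) + " req/s");
    }
}
```

With one processing thread the model gives 1000 ms and 10 requests/second; with 10 workers it gives 100 ms and 100 requests/second, matching the figures quoted in the text.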

To demonstrate this, I ran a benchmark with 10 clients and a modified message handler that simply sleeps for 100 ms before returning. I used THsHaServer with 10 worker threads. The handler looks something like this:

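public ResponseCode sleep() throws TException
{
    try {
        // Simulate 100 ms of work per request.
        Thread.sleep(100);
    } catch (InterruptedException ex) {
        // Restore the interrupt flag instead of silently swallowing it.
        Thread.currentThread().interrupt();
    }
    return ResponseCode.Success;
}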
[Figure: benchmark results for TNonblockingServer vs. THsHaServer with 10 clients and a 100 ms handler]

The results are as expected. THsHaServer is able to process all the requests concurrently, while TNonblockingServer processes requests one at a time.

THsHaServer vs. TThreadedSelectorServer

Thrift 0.8 introduced yet another server, TThreadedSelectorServer. The main difference between TThreadedSelectorServer and THsHaServer is that TThreadedSelectorServer allows you to have multiple threads for network I/O. It maintains 2 thread pools, one for handling network I/O and one for handling request processing. TThreadedSelectorServer performs better than THsHaServer when network I/O is the bottleneck. To show the difference, I ran a benchmark with a handler that returns immediately without doing anything, and measured the average latency and throughput with a varying number of clients. I used 32 worker threads for THsHaServer, and 16 worker threads plus 16 selector threads for TThreadedSelectorServer.

[Figure: average latency and throughput of THsHaServer vs. TThreadedSelectorServer for a varying number of clients]

The results show that TThreadedSelectorServer has much higher throughput than THsHaServer while maintaining lower latency.

TThreadedSelectorServer vs. TThreadPoolServer

Finally, there is TThreadPoolServer. TThreadPoolServer differs from the other 3 servers in that:
· There is a dedicated thread for accepting connections.
· Once a connection is accepted, it is scheduled to be processed by a worker thread in a ThreadPoolExecutor.
· The worker thread is tied to that specific client connection until the connection is closed. Once the connection is closed, the worker thread goes back to the thread pool.
· You can configure both the minimum and maximum number of threads in the thread pool. The default values are 5 and Integer.MAX_VALUE, respectively.

This means that if there are 10000 concurrent client connections, you need to run 10000 threads. As such, it is not as resource-friendly as the other servers. Also, if the number of clients exceeds the maximum number of threads in the thread pool, requests will be blocked until a worker thread becomes available.
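
The one-thread-per-connection behavior can be reproduced with a plain java.util.concurrent.ThreadPoolExecutor. This is a sketch, not TThreadPoolServer's code: the minimum of 5 and maximum of Integer.MAX_VALUE come from the defaults quoted above, while the SynchronousQueue hand-off, the 60-second keep-alive, and the class and method names are assumptions made for illustration. Each simulated "connection" is a task that blocks until all connections are closed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of thread-per-connection scheduling; not Thrift code.
class ThreadPerConnectionSketch {
    // Returns how many pool threads exist while `openConnections` connections are held open.
    static int threadsUsedFor(int openConnections) {
        // Assumed configuration: direct hand-off queue, so a busy pool grows instead of queueing.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                5, Integer.MAX_VALUE, 60, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch closeAll = new CountDownLatch(1);
        try {
            for (int i = 0; i < openConnections; i++) {
                // Each task stands in for a worker tied to one open client connection.
                pool.execute(() -> {
                    try {
                        closeAll.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            closeAll.countDown(); // "close" every connection so the workers finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(threadsUsedFor(20));
    }
}
```

Because every worker stays occupied for the lifetime of its connection, the pool in this sketch grows to one thread per open connection, which is the resource cost the paragraph above describes.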

Having said that, TThreadPoolServer performs very well; on the box I'm using, it is able to support 10000 concurrent clients without any problem. If you know in advance how many clients will connect to your server and you don't mind running a lot of threads, TThreadPoolServer might be a good choice for you.

[Figure: benchmark results for TThreadedSelectorServer vs. TThreadPoolServer]


Conclusion

I hope this article helps you decide which Thrift server is right for you. I think TThreadedSelectorServer is a safe choice for most use cases. You might also want to consider TThreadPoolServer if you can afford to run lots of concurrent threads. Feel free to email me or post your comments here if you have any questions.
