Translation of the official WebSocket article — HTML5 Web Sockets: A Quantum Leap in Scalability for the Web

HTML5 Web Sockets:A Quantum Leap in Scalability for the Web

By Peter Lubbers & Frank Greco, Kaazing Corporation

(This article has also been translated into Bulgarian.)

Lately there has been a lot of buzz around HTML5 Web Sockets, which defines a full-duplex communication channel that operates through a single socket over the Web. HTML5 Web Sockets is not just another incremental enhancement to conventional HTTP communications; it represents a colossal advance, especially for real-time, event-driven web applications.

HTML5 Web Sockets provides such a dramatic improvement over the old, convoluted "hacks" that are used to simulate a full-duplex connection in a browser that it prompted Google's Ian Hickson—the HTML5 specification lead—to say:

"Reducing kilobytes of data to 2 bytes…and reducing latency from 150ms to 50ms is far more than marginal. In fact, these two factors alone are enough to make Web Sockets seriously interesting to Google."

Let's take a look at how HTML5 Web Sockets can offer such an incredibly dramatic reduction of unnecessary network traffic and latency by comparing it to conventional solutions.

Polling, Long-Polling, and Streaming—Headache 2.0

Normally when a browser visits a web page, an HTTP request is sent to the web server that hosts that page.  The web server acknowledges this request and sends back the response.  In many cases—for example, for stock prices, news reports, ticket sales, traffic patterns, medical device readings, and so on—the response could be stale by the time the browser renders the page. If you want to get the most up-to-date "real-time" information, you can constantly refresh that page manually, but that's obviously not a great solution.

Current attempts to provide real-time web applications largely revolve around polling and other server-side push technologies, the most notable of which is Comet, which delays the completion of an HTTP response to deliver messages to the client. Comet-based push is generally implemented in JavaScript and uses connection strategies such as long-polling or streaming.

With polling, the browser sends HTTP requests at regular intervals and immediately receives a response.  This technique was the first attempt for the browser to deliver real-time information. Obviously, this is a good solution if the exact interval of message delivery is known, because you can synchronize the client request to occur only when information is available on the server. However, real-time data is often not that predictable, making unnecessary requests inevitable and as a result, many connections are opened and closed needlessly in low-message-rate situations.

With long-polling, the browser sends a request to the server and the server keeps the request open for a set period. If a notification is received within that period, a response containing the message is sent to the client. If a notification is not received within the set time period, the server sends a response to terminate the open request. It is important to understand, however, that when you have a high message volume, long-polling does not provide any substantial performance improvements over traditional polling.  In fact, it could be worse, because the long-polling might spin out of control into an unthrottled, continuous loop of immediate polls.

With streaming, the browser sends a complete request, but the server sends and maintains an open response that is continuously updated and kept open indefinitely (or for a set period of time). The response is then updated whenever a message is ready to be sent, but the server never signals to complete the response, thus keeping the connection open to deliver future messages. However, since streaming is still encapsulated in HTTP, intervening firewalls and proxy servers may choose to buffer the response, increasing the latency of the message delivery. Therefore, many streaming Comet solutions fall back to long-polling in case a buffering proxy server is detected. Alternatively, TLS (SSL) connections can be used to shield the response from being buffered, but in that case the setup and tear down of each connection taxes the available server resources more heavily.

Ultimately, all of these methods for providing real-time data involve HTTP request and response headers, which contain lots of additional, unnecessary header data and introduce latency. On top of that, full-duplex connectivity requires more than just the downstream connection from server to client. In an effort to simulate full-duplex communication over half-duplex HTTP, many of today's solutions use two connections: one for the downstream and one for the upstream. The maintenance and coordination of these two connections introduces significant overhead in terms of resource consumption and adds lots of complexity. Simply put, HTTP wasn't designed for real-time, full-duplex communication as you can see in the following figure, which shows the complexities associated with building a Comet web application that displays real-time data from a back-end data source using a publish/subscribe model over half-duplex HTTP.

Figure 1—The complexity of Comet applications

It gets even worse when you try to scale out those Comet solutions to the masses. Simulating bi-directional browser communication over HTTP is error-prone and complex and all that complexity does not scale. Even though your end users might be enjoying something that looks like a real-time web application, this "real-time" experience has an outrageously high price tag. It's a price that you will pay in additional latency, unnecessary network traffic and a drag on CPU performance.

HTML5 Web Sockets to the Rescue!

Defined in the Communications section of the HTML5 specification, HTML5 Web Sockets represents the next evolution of web communications—a full-duplex, bidirectional communications channel that operates through a single socket over the Web. HTML5 Web Sockets provides a true standard that you can use to build scalable, real-time web applications. In addition, since it provides a socket that is native to the browser, it eliminates many of the problems Comet solutions are prone to. Web Sockets removes the overhead and dramatically reduces complexity.

To establish a WebSocket connection, the client and server upgrade from the HTTP protocol to the WebSocket protocol during their initial handshake, as shown in the following example:

Example 1—The WebSocket handshake (browser request and server response)

GET /text HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade
Host: www.websocket.org
…

HTTP/1.1 101 WebSocket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
…
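
As a sanity check on Example 1, the following sketch assembles the two header blocks (each line terminated by CRLF on the wire) and verifies the 101 upgrade. This reflects the early draft the article quotes; the final RFC 6455 handshake additionally requires Sec-WebSocket-Key and Sec-WebSocket-Accept fields.

```python
CRLF = "\r\n"

# Browser request from Example 1 (abbreviated, as in the article).
request = CRLF.join([
    "GET /text HTTP/1.1",
    "Upgrade: WebSocket",
    "Connection: Upgrade",
    "Host: www.websocket.org",
]) + CRLF + CRLF

# Server response completing the upgrade.
response = CRLF.join([
    "HTTP/1.1 101 WebSocket Protocol Handshake",
    "Upgrade: WebSocket",
    "Connection: Upgrade",
]) + CRLF + CRLF

def parse_headers(raw: str):
    lines = raw.rstrip().split(CRLF)
    start_line, fields = lines[0], lines[1:]
    return start_line, dict(line.split(": ", 1) for line in fields)

status, headers = parse_headers(response)
upgraded = status.startswith("HTTP/1.1 101") and headers.get("Upgrade") == "WebSocket"
print(upgraded)  # True: both ends now speak WebSocket on the same socket
```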

Once established, WebSocket data frames can be sent back and forth between the client and the server in full-duplex mode. Both text and binary frames can be sent full-duplex, in either direction at the same time. The data is minimally framed with just two bytes. In the case of text frames, each frame starts with a 0x00 byte, ends with a 0xFF byte, and contains UTF-8 data in between. WebSocket text frames use a terminator, while binary frames use a length prefix.

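The draft-era text framing described above is simple enough to sketch directly (RFC 6455 later replaced this scheme with a different frame format, so treat this as a model of the article's numbers, not of modern WebSocket):

```python
def encode_text_frame(text: str) -> bytes:
    # 0x00 start byte, UTF-8 payload, 0xFF terminator.
    return b"\x00" + text.encode("utf-8") + b"\xff"

def decode_text_frame(frame: bytes) -> str:
    assert frame[0] == 0x00 and frame[-1] == 0xFF, "not a text frame"
    return frame[1:-1].decode("utf-8")

message = "XYZ 42.00"
frame = encode_text_frame(message)
overhead = len(frame) - len(message.encode("utf-8"))
print(overhead)                   # 2 bytes of framing per message
print(decode_text_frame(frame))   # payload round-trips intact
```

Those two bytes are the entire per-message overhead, against the ~871 bytes of HTTP headers measured later in the article.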
Note: although the Web Sockets protocol is ready to support a diverse set of clients, it cannot deliver raw binary data to JavaScript, because JavaScript does not support a byte type. Therefore, binary data is ignored if the client is JavaScript—but it can be delivered to other clients that support it.

The Showdown: Comet vs. HTML5 Web Sockets

So how dramatic is that reduction in unnecessary network traffic and latency? Let's compare a polling application and a WebSocket application side by side.

For the polling example, I created a simple web application in which a web page requests real-time stock data from a RabbitMQ message broker using a traditional publish/subscribe model. It does this by polling a Java Servlet that is hosted on a web server. The RabbitMQ message broker receives data from a fictitious stock price feed with continuously updating prices. The web page connects and subscribes to a specific stock channel (a topic on the message broker) and uses an XMLHttpRequest to poll for updates once per second. When updates are received, some calculations are performed and the stock data is shown in a table as shown in the following image.

Figure 2—A JavaScript stock ticker application

Note: The back-end stock feed actually produces a lot of stock price updates per second, so using polling at one-second intervals is actually more prudent than using a Comet long-polling solution, which would result in a series of continuous polls. Polling effectively throttles the incoming updates here.

It all looks great, but a look under the hood reveals there are some serious issues with this application. For example, in Mozilla Firefox with Firebug (a Firefox add-on that allows you to debug web pages and monitor the time it takes to load pages and execute scripts), you can see that GET requests hammer the server at one-second intervals. Turning on Live HTTP Headers (another Firefox add-on that shows live HTTP header traffic) reveals the shocking amount of header overhead that is associated with each request. The following two examples show the HTTP header data for just a single request and response.

Example 2—HTTP request header

GET /PollingStock//PollingStock HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://www.example.com/PollingStock/
Cookie: showInheritedConstant=false; showInheritedProtectedConstant=false; showInheritedProperty=false; showInheritedProtectedProperty=false; showInheritedMethod=false; showInheritedProtectedMethod=false; showInheritedEvent=false; showInheritedStyle=false; showInheritedEffect=false

Example 3—HTTP response header

HTTP/1.x 200 OK
X-Powered-By: Servlet/2.5
Server: Sun Java System Application Server 9.1_02
Content-Type: text/html;charset=UTF-8
Content-Length: 21
Date: Sat, 07 Nov 2009 00:32:46 GMT

Just for fun, I counted all the characters. The total HTTP request and response header information overhead contains 871 bytes and that does not even include any data! Of course, this is just an example and you can have less than 871 bytes of header data, but I have also seen cases where the header data exceeded 2000 bytes. In this example application, the data for a typical stock topic message is only about 20 characters long. As you can see, it is effectively drowned out by the excessive header information, which was not even required in the first place!

So, what happens when you deploy this application to a large number of users? Let's take a look at the network throughput for just the HTTP request and response header data associated with this polling application in three different use cases.

  • Use case A: 1,000 clients polling every second: Network throughput is (871 x 1,000) = 871,000 bytes = 6,968,000 bits per second (6.6 Mbps)

  • Use case B: 10,000 clients polling every second: Network throughput is (871 x 10,000) = 8,710,000 bytes = 69,680,000 bits per second (66 Mbps)

  • Use case C: 100,000 clients polling every second: Network throughput is (871 x 100,000) = 87,100,000 bytes = 696,800,000 bits per second (665 Mbps)
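
The arithmetic behind use cases A through C can be reproduced in a few lines (the article converts to Mbps using 2^20 bits per megabit, then rounds to 6.6, 66, and 665):

```python
HEADER_BYTES = 871  # measured request + response header overhead per poll

def polling_overhead_bps(clients: int) -> int:
    # Each client polls once per second; bytes/s converted to bits/s.
    return HEADER_BYTES * clients * 8

for clients in (1_000, 10_000, 100_000):
    bps = polling_overhead_bps(clients)
    print(f"{clients:>7} clients: {bps / (1 << 20):.1f} Mbps of pure header traffic")
```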

That's an enormous amount of unnecessary network throughput! If only we could just get the essential data over the wire. Well, guess what? You can with HTML5 Web Sockets! I rebuilt the application to use HTML5 Web Sockets, adding an event handler to the web page to asynchronously listen for stock update messages from the message broker (check out the many how-tos and tutorials on tech.kaazing.com/documentation/ for more information on how to build a WebSocket application). Each of these messages is a WebSocket frame that has just two bytes of overhead (instead of 871)! Take a look at how that affects the network throughput overhead in our three use cases.

  • Use case A: 1,000 clients receive 1 message per second: Network throughput is (2 x 1,000) = 2,000 bytes = 16,000 bits per second (0.015 Mbps)

  • Use case B: 10,000 clients receive 1 message per second: Network throughput is (2 x 10,000) = 20,000 bytes = 160,000 bits per second (0.153 Mbps)

  • Use case C: 100,000 clients receive 1 message per second: Network throughput is (2 x 100,000) = 200,000 bytes = 1,600,000 bits per second (1.526 Mbps)
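
The same calculation with the 2-byte frame overhead reproduces the WebSocket figures and makes the per-message reduction explicit (with the ~871-byte headers measured above it works out near 435:1; the 500:1 to 1000:1 range quoted later assumes larger header blocks):

```python
FRAME_OVERHEAD_BYTES = 2   # WebSocket text-frame overhead per message
HEADER_BYTES = 871         # HTTP header overhead per poll, measured earlier

def websocket_overhead_bps(clients: int) -> int:
    return FRAME_OVERHEAD_BYTES * clients * 8

for clients in (1_000, 10_000, 100_000):
    bps = websocket_overhead_bps(clients)
    print(f"{clients:>7} clients: {bps / (1 << 20):.3f} Mbps of framing overhead")

reduction = HEADER_BYTES / FRAME_OVERHEAD_BYTES
print(f"~{reduction:.0f}:1 less overhead per message")
```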

As you can see in the following figure, HTML5 Web Sockets provide a dramatic reduction of unnecessary network traffic compared to the polling solution.

Figure 3—Comparison of the unnecessary network throughput overhead between the polling and the WebSocket applications

And what about the reduction in latency? Take a look at the following figure. In the top half, you can see the latency of the half-duplex polling solution. If we assume, for this example, that it takes 50 milliseconds for a message to travel from the server to the browser, then the polling application introduces a lot of extra latency, because a new request has to be sent to the server when the response is complete. This new request takes another 50ms and during this time the server cannot send any messages to the browser, resulting in additional server memory consumption.

In the bottom half of the figure, you see the reduction in latency provided by the WebSocket solution. Once the connection is upgraded to WebSocket, messages can flow from the server to the browser the moment they arrive. It still takes 50 ms for messages to travel from the server to the browser, but the WebSocket connection remains open so there is no need to send another request to the server.

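Under the article's 50 ms one-way assumption, the per-message accounting reduces to simple arithmetic. A polled message pays for the request leg as well as the response leg, and a message arriving while no request is outstanding waits even longer, which is how the 150 ms figure in Hickson's quote arises; an open WebSocket pays only the one-way delivery time.

```python
ONE_WAY_MS = 50  # the article's assumed server-to-browser travel time

def polling_latency_ms() -> int:
    # A fresh request must travel to the server before the response can return.
    return ONE_WAY_MS + ONE_WAY_MS

def websocket_latency_ms() -> int:
    # The connection is already open; the message travels down once.
    return ONE_WAY_MS

print(polling_latency_ms(), websocket_latency_ms())  # 100 vs 50 ms per message
```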
Figure 4—Latency comparison between the polling and WebSocket applications

HTML5 Web Sockets and the Kaazing WebSocket Gateway

Today, only Google's Chrome browser supports HTML5 Web Sockets natively, but other browsers will soon follow. To work around that limitation, however, Kaazing WebSocket Gateway provides complete WebSocket emulation for all the older browsers (IE 5.5+, Firefox 1.5+, Safari 3.0+, and Opera 9.5+), so you can start using the HTML5 WebSocket APIs today.

WebSocket is great, but what you can do once you have a full-duplex socket connection available in your browser is even greater. To leverage the full power of HTML5 Web Sockets, Kaazing provides a ByteSocket library for binary communication and higher-level libraries for protocols like Stomp, AMQP, XMPP, IRC and more, built on top of WebSocket.

Figure 5—Kaazing WebSocket Gateway extends TCP-based messaging to the browser with ultra high performance

Summary

HTML5 Web Sockets provides an enormous step forward in the scalability of the real-time web. As you have seen in this article, HTML5 Web Sockets can provide a 500:1 or—depending on the size of the HTTP headers—even a 1000:1 reduction in unnecessary HTTP header traffic and 3:1 reduction in latency. That is not just an incremental improvement; that is a revolutionary jump—a quantum leap!

Kaazing WebSocket Gateway makes HTML5 WebSocket code work in all the browsers today, while providing additional protocol libraries that allow you to harness the full power of the full-duplex socket connection that HTML5 Web Sockets provides and communicate directly to back-end services. For more information about Kaazing WebSocket Gateway, visit kaazing.com and the Kaazing technology network at tech.kaazing.com.
