HttpCommComponent
Inside it there is a single HttpClient held as a static instance; in other words, only one instance exists per JVM and it is reused. The main code:
static HttpClient client;
static {
    MultiThreadedHttpConnectionManager mgr = new MultiThreadedHttpConnectionManager();
    mgr.getParams().setDefaultMaxConnectionsPerHost(20);
    mgr.getParams().setMaxTotalConnections(10000);
    mgr.getParams().setConnectionTimeout(SearchHandler.connectionTimeout);
    mgr.getParams().setSoTimeout(SearchHandler.soTimeout);
    // mgr.getParams().setStaleCheckingEnabled(false);
    client = new HttpClient(mgr);
}
Two of these parameters are hard-coded. They apply to the connection pool and govern how HTTP connections are managed; the library defaults they override are:
/** The default maximum number of connections allowed per host */
public static final int DEFAULT_MAX_HOST_CONNECTIONS = 2; // Per RFC 2616 sec 8.1.4
/** The default maximum number of connections allowed overall */
public static final int DEFAULT_MAX_TOTAL_CONNECTIONS = 20;
The maximum number of connections allowed to any single target host is set to 20:
    mgr.getParams().setDefaultMaxConnectionsPerHost(20);
and the overall connection total is set to 10000:
    mgr.getParams().setMaxTotalConnections(10000);
For the exact allocation logic, see the implementation of MultiThreadedHttpConnectionManager: as long as the overall total is not exceeded, any given target host is handed at most 20 connections.
Here lies a problem: if the service mainly talks to two hosts, the pool hands out at most 20 × 2 = 40 connections in total, so under high concurrency the remaining callers block. For a high-concurrency online service, a cap of 20 per host is rather stingy.
This is the path an HttpClient request takes through the pool; under high concurrency it should be blocking inside getConnectionWithTimeout(). Tracing the source code would confirm this.
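To see why callers queue up, the per-host cap can be modeled as a semaphore with maxConnectionsPerHost permits. This is a simplified sketch, not the library's actual code: getConnectionWithTimeout() effectively acquires a permit, and releaseConnection() returns one.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the per-host limit: 2 permits, as in the test
// program below. With all permits taken and none released, a further
// acquire blocks until it times out.
public class PoolModel {
    public static void main(String[] args) throws InterruptedException {
        Semaphore perHost = new Semaphore(2); // maxConnectionsPerHost = 2

        boolean first  = perHost.tryAcquire(100, TimeUnit.MILLISECONDS); // gets a connection
        boolean second = perHost.tryAcquire(100, TimeUnit.MILLISECONDS); // gets a connection
        boolean third  = perHost.tryAcquire(100, TimeUnit.MILLISECONDS); // blocks, then fails

        System.out.println(first + " " + second + " " + third); // true true false

        perHost.release(); // releaseConnection() hands the permit back
        System.out.println(perHost.tryAcquire()); // true: a waiter can now proceed
    }
}
```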
To reproduce the problem these parameters cause, here is a small test program:
package org.yzy.jetty;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.methods.GetMethod;

public class HttpClientTest {
    static HttpClient client;

    static {
        MultiThreadedHttpConnectionManager mgr = new MultiThreadedHttpConnectionManager();
        mgr.getParams().setDefaultMaxConnectionsPerHost(2);
        mgr.getParams().setMaxTotalConnections(10);
        mgr.getParams().setConnectionTimeout(2000);
        mgr.getParams().setSoTimeout(1000);
        client = new HttpClient(mgr);
    }

    public static void main(String[] args) {
        Thread[] t = new Thread[3];
        for (int i = 0; i < t.length; i++) {
            t[i] = new Thread(new Send());
        }
        for (int i = 0; i < t.length; i++) {
            t[i].start();
        }
    }

    public static class Send implements Runnable {
        @Override
        public void run() {
            try {
                HttpMethod method = new GetMethod("http://localhost:8080/solr");
                System.out.println(Thread.currentThread().getName() + "-"
                        + Thread.currentThread().getId() + ":send");
                int result = client.executeMethod(method);
                System.out.println(Thread.currentThread().getName() + "-"
                        + Thread.currentThread().getId() + ":back" + result);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
The output:
Thread-2-11:send
Thread-1-10:send
Thread-3-12:send
Thread-3-12:back200
Thread-2-11:back200
One of the three requests never completes: with maxConnectionsPerHost set to 2 and no releaseConnection() call, the third thread blocks forever waiting for a connection.
Now modify the Send class to release its connection when done:
public static class Send implements Runnable {
    @Override
    public void run() {
        HttpMethod method = new GetMethod("http://localhost:8080/solr");
        try {
            System.out.println(Thread.currentThread().getName() + "-"
                    + Thread.currentThread().getId() + ":send");
            int result = client.executeMethod(method);
            System.out.println(Thread.currentThread().getName() + "-"
                    + Thread.currentThread().getId() + ":back" + result);
            Thread.sleep(1000);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            method.releaseConnection();
            System.out.println("relase..");
        }
    }
}
Running again:
Thread-3-12:send
Thread-1-10:send
Thread-2-11:send
Thread-2-11:back200
Thread-3-12:back200
relase..
relase..
Thread-1-10:back200
relase..
As soon as a connection is released, a blocked thread can pick it up and complete its request.
One more question remains. A user request to Solr fans out into three sub-requests, against the main index, the small index, and the album index, and it is always the main index that throws the socket-timeout exception. How to explain that?
The three fanned-out requests run on separate threads. Searches on the small index and the album index finish quickly and call releaseConnection early, so at any given moment those two hosts have relatively many free connections. The main index responds slowly, so the connections it holds at any moment exceed the default cap of 20 per host, which is why requests to the main index block.
As for why the exception is a socket timeout: Solr runs inside Tomcat, and Tomcat has its own connection timeout, say 3000 ms. While a blocked request is still waiting, Tomcat gives up and drops the connection, so what the caller finally sees is a socket-timeout exception, since socket timeouts arise while waiting to read data. That is my analysis.
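On the Tomcat side, the relevant knob is the HTTP connector's connectionTimeout attribute. An illustrative server.xml fragment follows; the 3000 ms figure is the hypothetical value from the analysis above, not a recommended setting:

```xml
<!-- server.xml: if this timeout elapses while the caller is still blocked
     waiting for a pooled client connection, Tomcat drops the connection
     and the caller ends up seeing a socket timeout. -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="3000"
           redirectPort="8443" />
```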
The latest Solr versions have already addressed this: the connection pool is now configurable. For the configuration options, see the wiki: http://wiki.apache.org/solr/SolrConfigXml/
<requestHandler name="standard" class="solr.SearchHandler" default="true">
    <!-- other params go here -->
    <shardHandlerFactory class="HttpShardHandlerFactory">
        <int name="socketTimeOut">1000</int>
        <int name="connTimeOut">5000</int>
    </shardHandlerFactory>
</requestHandler>
The parameters that can be specified are as follows:
socketTimeout. Default: 0 (use OS default). The amount of time in ms that a socket is allowed to wait for data.
connTimeout. Default: 0 (use OS default). The amount of time in ms allowed for binding/connecting a socket.
maxConnectionsPerHost. Default: 20. The maximum number of connections made to each individual shard in a distributed search.
corePoolSize. Default: 0. The retained lower limit on the number of threads used to coordinate distributed search.
maximumPoolSize. Default: Integer.MAX_VALUE. The maximum number of threads used to coordinate distributed search.
maxThreadIdleTime. Default: 5 seconds. How long to wait before threads are scaled back in response to reduced load.
sizeOfQueue. Default: -1. If specified, the thread pool uses a backing queue instead of a direct handoff. High-throughput systems will want a direct handoff (-1); systems that favour latency will want a reasonably sized queue to absorb variations in request rate.
fairnessPolicy. Default: false. If enabled, distributed searches are handled in first-in-first-out fashion at some cost to throughput; if disabled, throughput is favoured over latency.
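The corePoolSize / maximumPoolSize / sizeOfQueue / fairnessPolicy settings map onto a standard java.util.concurrent.ThreadPoolExecutor. A rough sketch of the two queue choices, assuming Solr wires the pool approximately like this:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueChoice {
    public static void main(String[] args) {
        // sizeOfQueue = -1: direct handoff (SynchronousQueue), favours throughput;
        // corePoolSize = 0, maximumPoolSize = Integer.MAX_VALUE match the defaults above.
        ThreadPoolExecutor handoff = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 5, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());

        // sizeOfQueue = 100, fairnessPolicy = true: a bounded FIFO queue
        // absorbs bursts; fair mode serves waiters first-in-first-out.
        ThreadPoolExecutor queued = new ThreadPoolExecutor(
                4, 4, 5, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(100, /* fair = */ true));

        System.out.println(handoff.getQueue().remainingCapacity()); // 0 for a direct handoff
        System.out.println(queued.getQueue().remainingCapacity());  // 100
        handoff.shutdown();
        queued.shutdown();
    }
}
```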
http://blog.csdn.net/duck_genuine/article/details/7839479