HttpClient Study Notes
(1) The main classes that model HTTP concepts
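Class | Description |
---|---|
HttpClient | The client that creates and executes requests |
HttpGet | Represents a request method; similar classes are HttpHead, HttpPost, HttpPut, HttpDelete, HttpTrace, and HttpOptions |
HttpResponse | Represents the response to a request (status line, protocol version, and headers; each Header wraps one header, a header can contain HeaderElements, and both can be read through iterators) |
HttpEntity | Represents the entity that carries the transmitted content, i.e. the message body. It can appear in both requests and responses: among requests only POST and PUT carry one, while responses carry one except in a few special cases with no content. By origin there are three kinds of entity: streamed, read once from a stream; wrapping, built from another entity; self-contained, read from memory and repeatable. |
URIBuilder | Utility class for building a URL: mainly the scheme, host, and path, plus any query parameters |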
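
To make the role of URIBuilder concrete, here is a minimal sketch of the same idea written against the JDK's `java.net.URI` only, so it runs without the HttpClient jar. The class name `UriSketch`, the host `www.example.com`, and the query values are illustrative assumptions; in HttpClient code, URIBuilder itself would be the natural choice.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriSketch {
    // Assemble a URI from scheme, host, path, and an already formatted query string.
    static URI build(String scheme, String host, String path, String query) {
        try {
            // Port -1 leaves the port out of the URI; user info and fragment are omitted.
            return new URI(scheme, null, host, -1, path, query, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://www.example.com/search?q=httpclient&hl=en
        System.out.println(build("http", "www.example.com", "/search", "q=httpclient&hl=en"));
    }
}
```
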
(2) The main methods of HttpEntity
Method | Description |
---|---|
getContentType | Returns the content type |
getContentLength | Returns the content length |
getContent | Returns the content as an InputStream |
- HttpEntity entity=response.getEntity();
- System.out.println(entity.getContentType());
- System.out.println(entity.getContentLength());
- InputStream in=entity.getContent();// get the input stream directly; it can be read only once
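(3) The stream obtained directly from an HttpEntity is streamed
- It can be read only once. To read it several times, cache it: BufferedHttpEntity wraps the streamed entity (the wrapping kind) and buffers its content
- BufferedHttpEntity bufEntity=new BufferedHttpEntity(entity);// the constructor copies the content into a buffer, so it can be read repeatedly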
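
The buffering idea can be sketched with plain JDK streams: drain the one-shot stream once, keep the bytes, and hand out a fresh stream per read. This is only a model of the technique, not BufferedHttpEntity's actual source; the class name `BufferedBodySketch` and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class BufferedBodySketch {
    private final byte[] buffer;

    // Drain the one-shot stream once and keep its bytes in memory.
    BufferedBodySketch(InputStream oneShot) {
        this.buffer = drain(oneShot);
    }

    // Every call returns a fresh stream over the cached bytes.
    InputStream getContent() {
        return new ByteArrayInputStream(buffer);
    }

    // Read a stream to its end and return everything that was read.
    static byte[] drain(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String readAll(InputStream in) {
        return new String(drain(in), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        InputStream source = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        BufferedBodySketch body = new BufferedBodySketch(source);
        System.out.println(readAll(body.getContent())); // first read
        System.out.println(readAll(body.getContent())); // second read sees the same bytes
    }
}
```

The trade-off is memory: the whole body is held in the buffer, so this suits small responses better than large downloads.
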
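(4) An HttpEntity can also be attached to POST and PUT requests
- It is the content sent with the request. The content can be a file, or form parameters can be submitted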
- File file=new File("out.txt");
- FileEntity fileEntity=new FileEntity(file, ContentType.create("text/plain", "UTF-8"));// file content as the body
- List<NameValuePair> formparams = new ArrayList<NameValuePair>();
- formparams.add(new BasicNameValuePair("param1", "value1"));
- formparams.add(new BasicNameValuePair("param2", "value2"));
- UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(formparams, "UTF-8");// form parameters as the body
- HttpPost post=new HttpPost("http://www.baidu.com");
- post.setEntity(fileEntity);// sends the file; pass formEntity instead to submit the form
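
For the form case, it helps to see what an `application/x-www-form-urlencoded` body looks like on the wire. The sketch below builds one with the JDK's `URLEncoder`; it illustrates the encoding format only, and UrlEncodedFormEntity may differ in escaping details. The class name `FormBodySketch` and the value `a b&c` are assumptions made for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormBodySketch {
    // Join name=value pairs with '&', percent-encoding both names and values.
    static String encode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return body.toString();
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("param1", "value1");
        params.put("param2", "a b&c");
        // prints param1=value1&param2=a+b%26c
        System.out.println(encode(params));
    }
}
```
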
(5) The most convenient way to process a response is a ResponseHandler, which converts the entity into whatever content format is needed
- ResponseHandler<byte[]> handler = new ResponseHandler<byte[]>() {
- public byte[] handleResponse(
- HttpResponse response) throws ClientProtocolException, IOException {
- HttpEntity entity = response.getEntity();
- if (entity != null) {
- return EntityUtils.toByteArray(entity);
- } else {
- return null;
- }
- }
- };
- byte[] response = httpclient.execute(httpget, handler);
- ResponseHandler<String> handler1=new BasicResponseHandler();
- String response1= httpclient.execute(httpget,handler1);
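(6) HTTP parameters (HttpParams) can be set when making a request
As with HttpContext, parameters can be set on the HttpClient, where they apply client-wide, or on an individual HttpRequest, where they apply to that request only.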
Parameter | Description |
---|---|
CoreProtocolPNames.PROTOCOL_VERSION='http.protocol.version' | Protocol version |
CoreProtocolPNames.HTTP_ELEMENT_CHARSET='http.protocol.element-charset' | Charset for encoding protocol elements |
CoreProtocolPNames.HTTP_CONTENT_CHARSET='http.protocol.content-charset' | Charset for encoding content |
CoreProtocolPNames.USER_AGENT='http.useragent' | User-Agent string; useful when writing crawlers |
CoreProtocolPNames.STRICT_TRANSFER_ENCODING='http.protocol.strict-transfer-encoding' | (... |
CoreProtocolPNames.USE_EXPECT_CONTINUE='http.protocol.expect-continue' | ... |
CoreProtocolPNames.WAIT_FOR_CONTINUE='http.protocol.wait-for-continue' | ... |
- httpclient.getParams().setParameter(CoreProtocolPNames.PROTOCOL_VERSION,
- HttpVersion.HTTP_1_0); // Default to HTTP 1.0
- httpclient.getParams().setParameter(CoreProtocolPNames.HTTP_CONTENT_CHARSET,
- "UTF-8");
- HttpGet httpget = new HttpGet("http://www.google.com.hk/");
- httpget.getParams().setParameter(CoreProtocolPNames.PROTOCOL_VERSION,
- HttpVersion.HTTP_1_1); // Use HTTP 1.1 for this request only
- httpget.getParams().setParameter(CoreProtocolPNames.USE_EXPECT_CONTINUE,
- Boolean.FALSE);
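
The two scopes can be pictured as a layered lookup: a request-level value wins, and anything the request does not set falls back to the client-level value. The sketch below is a simplified model of that lookup order using two maps, not HttpClient's implementation; `LayeredParamsSketch` and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class LayeredParamsSketch {
    private final Map<String, Object> clientParams = new HashMap<>();
    private final Map<String, Object> requestParams = new HashMap<>();

    void setClientParam(String name, Object value) {
        clientParams.put(name, value);
    }

    void setRequestParam(String name, Object value) {
        requestParams.put(name, value);
    }

    // Request scope is consulted first; client scope is the fallback.
    Object resolve(String name) {
        if (requestParams.containsKey(name)) {
            return requestParams.get(name);
        }
        return clientParams.get(name);
    }

    public static void main(String[] args) {
        LayeredParamsSketch params = new LayeredParamsSketch();
        params.setClientParam("http.protocol.version", "HTTP/1.0");
        params.setClientParam("http.protocol.content-charset", "UTF-8");
        params.setRequestParam("http.protocol.version", "HTTP/1.1");
        System.out.println(params.resolve("http.protocol.version"));         // request-level value
        System.out.println(params.resolve("http.protocol.content-charset")); // client-level value
    }
}
```
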
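(7) HttpClient takes care of managing connections
The methods above do not touch connection settings, however. The following parameters can be set through HttpParams to configure connections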
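Parameter | Description |
---|---|
CoreConnectionPNames.SO_TIMEOUT='http.socket.timeout' | Maximum time to wait for data, i.e. the longest allowed gap between two consecutive reads of data |
CoreConnectionPNames.TCP_NODELAY='http.tcp.nodelay' | Boolean; controls whether Nagle's algorithm is used (true disables it). The algorithm minimizes the number of packets sent by coalescing small writes into larger packets, which saves bandwidth but adds latency |
CoreConnectionPNames.SOCKET_BUFFER_SIZE='http.socket.buffer-size' | Size of the buffers used for sending and receiving data |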
CoreConnectionPNames.SO_LINGER='http.socket.linger' | ... |
CoreConnectionPNames.CONNECTION_TIMEOUT='http.connection.timeout' | Timeout for establishing a connection |
CoreConnectionPNames.STALE_CONNECTION_CHECK='http.connection.stalecheck' | ... |
CoreConnectionPNames.MAX_LINE_LENGTH='http.connection.max-line-length' | Maximum length of a line |
CoreConnectionPNames.MAX_HEADER_COUNT='http.connection.max-header-count' | Maximum number of headers |
ConnConnectionPNames.MAX_STATUS_LINE_GARBAGE='http.connection.max-status-line-garbage' | ... |
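
Several of these parameters correspond to options on the underlying `java.net.Socket`. As a rough illustration, the sketch below sets the socket timeout and the no-delay flag on an unconnected JDK socket and reads them back. The class name `SocketOptionsSketch` and the value 5000 ms are assumptions; HttpClient applies these options internally, so code like this is normally not needed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

final class SocketOptionsSketch {
    // Set two socket options and report the values the socket now holds.
    static String describe(int soTimeoutMillis, boolean tcpNoDelay) {
        // An unconnected socket is enough to set options and read them back.
        try (Socket socket = new Socket()) {
            socket.setSoTimeout(soTimeoutMillis); // counterpart of http.socket.timeout
            socket.setTcpNoDelay(tcpNoDelay);     // counterpart of http.tcp.nodelay
            // A connect timeout is not a socket option; it would be passed as
            // socket.connect(address, connectTimeoutMillis) when connecting.
            return "soTimeout=" + socket.getSoTimeout()
                    + ", tcpNoDelay=" + socket.getTcpNoDelay();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(5000, true));
    }
}
```
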
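(8) In real applications it is better to obtain connections from a connection pool, which takes care of managing them.
- BasicClientConnectionManager man=new BasicClientConnectionManager();// the most basic connection manager; it maintains only one connection at a time
- System.out.println(httpclient.getConnectionManager().getClass());// prints: class org.apache.http.impl.conn.BasicClientConnectionManager
- // the code below uses PoolingClientConnectionManager, a pool that supports multithreaded use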
- if(httpConnManger==null)
- {
- SchemeRegistry schemeRegistry = new SchemeRegistry();
- schemeRegistry.register(
- new Scheme("http", 80, PlainSocketFactory.getSocketFactory()));
- httpConnManger=new PoolingClientConnectionManager(schemeRegistry);
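- httpConnManger.setMaxTotal(20);// at most 20 connections in the whole pool
- httpConnManger.setDefaultMaxPerRoute(10);// at most 10 per route; a per-route limit above the total could never be reached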
- }
- HttpParams params=new BasicHttpParams();
- params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, CONNECTION_TIME);
- HttpClient httpClient=new DefaultHttpClient(httpConnManger,params);
- HttpGet httpGet=new HttpGet(urlAddr);
- HttpResponse response;
- try {
- response = httpClient.execute(httpGet);
- } catch (ClientProtocolException e) {
- log.error(e.getMessage());
- return null;
- } catch (IOException e) {
- log.error(e.getMessage());
- return null;
- }
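
What a pool limit such as the maximum total means in practice can be modeled with a counting semaphore: each lease takes a permit, each release returns one, and a caller that finds no permit must wait or give up. This is a conceptual sketch only, not how PoolingClientConnectionManager is implemented; `BoundedPoolSketch` and its methods are hypothetical names.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class BoundedPoolSketch {
    private final Semaphore permits;

    BoundedPoolSketch(int maxTotal) {
        this.permits = new Semaphore(maxTotal);
    }

    // Try to lease a slot, waiting at most timeoutMillis; false means the pool is exhausted.
    boolean tryLease(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Return a slot so that another caller can lease it.
    void release() {
        permits.release();
    }

    int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        BoundedPoolSketch pool = new BoundedPoolSketch(2);
        System.out.println(pool.tryLease(10)); // true
        System.out.println(pool.tryLease(10)); // true
        System.out.println(pool.tryLease(10)); // false: both slots are leased
        pool.release();
        System.out.println(pool.tryLease(10)); // true again after a release
    }
}
```

The model also shows why a response should always be consumed or closed: a connection that is never released keeps its slot, and later requests eventually block.
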
(9) For requests that must go through a proxy, configure an HTTP proxy
- HttpHost proxy = new HttpHost("127.0.0.1", 8080, "http");
- DefaultHttpClient httpclient = new DefaultHttpClient();
- httpclient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);
- HttpHost target = new HttpHost("issues.apache.org", 443, "https");
- HttpGet req = new HttpGet("/");
- System.out.println("executing request to " + target + " via " + proxy);
- HttpResponse rsp = httpclient.execute(target, req);
- HttpEntity entity = rsp.getEntity();
(10) Requests that require login authentication
- httpclient.getCredentialsProvider().setCredentials(
- new AuthScope("localhost", 443),
- new UsernamePasswordCredentials("username", "password"));
- HttpGet httpget = new HttpGet("https://localhost/protected");
- System.out.println("executing request" + httpget.getRequestLine());
- HttpResponse response = httpclient.execute(httpget);
- HttpEntity entity = response.getEntity();
- System.out.println("----------------------------------------");
- System.out.println(response.getStatusLine());
- if (entity != null) {
- System.out.println("Response content length: " + entity.getContentLength());
- }
- EntityUtils.consume(entity);
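
If the server challenges with the Basic scheme, the credentials registered above end up in an `Authorization` header whose value is the Base64 encoding of `username:password`. The sketch below computes that value with the JDK only, to show what travels over the wire. It assumes Basic authentication; other schemes such as Digest work differently, and `BasicAuthSketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuthSketch {
    // Build the Authorization header value for the Basic scheme.
    static String authorizationHeader(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // prints Basic dXNlcm5hbWU6cGFzc3dvcmQ=
        System.out.println(authorizationHeader("username", "password"));
    }
}
```

Base64 is an encoding, not encryption, which is one reason the example above targets an https URL.
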