Using the urllib module: it opens a specified URL and then reads the content of the response:
In [19]: import urllib

In [20]: url_file = urllib.urlopen("http://baike.baidu.com/view/21087.htm?fr=aladdin")

In [21]: urllib_docs = url_file.read()  # read the whole response body before closing
    ...: url_file.close()

In [22]: len(urllib_docs)
Out[22]: 151836

In [23]: urllib_docs[151400:]
Out[23]: 'extarea{line-height:1.3em!important}</style><script type="text/javascript">SyntaxHighlighter.all();</script><script type="text/javascript">F.use("/static/common/ui/jquery/jquery.js",function($){$.ajax({url:\'/cms/global/cms_lemma_config.js?r=\'+Math.random(),dataType:"jsonp"});});</script><!--TRACE\xef\xbc\x9a10-36-24-34@2014-09-28 15:15:39--></body></html><!--24606544450858141194092822-->\n<script> var _trace_page_logid = 2460654445; </script>'
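The session above uses the Python 2 API, where urlopen lives directly in the urllib module. In Python 3 the same function is urllib.request.urlopen, and read() returns bytes rather than str. The following is a minimal sketch of the same open, read, close sequence under that assumption. The helper name fetch_bytes is hypothetical, and the data: URL is only a stand-in so the example runs without network access; an http:// URL can be passed the same way.

```python
from urllib.request import urlopen


def fetch_bytes(url, timeout=10.0):
    """Open url, read the entire response body, and close the connection.

    fetch_bytes is a hypothetical helper name used only for this sketch.
    """
    # The with-statement closes the response even if read() raises.
    with urlopen(url, timeout=timeout) as response:
        return response.read()


# Offline stand-in for a real page: a data: URL carries its own payload.
body = fetch_bytes("data:text/plain,hello%20urllib")
print(len(body))   # size of the body in bytes
print(body[-6:])   # tail of the body, like urllib_docs[151400:] above
```

Because the body is bytes in Python 3, call body.decode() with the page's encoding before treating it as text.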