Learning urllib and requests

  • urllib is a package in Python's standard library that provides a set of tools for working with URLs.

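  • The third-party requests package offers a friendlier interface for the same tasks. (It is built on urllib3 rather than directly on the standard-library urllib.)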

requests

urllib

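  • urllib is a high-level library for working with HTTP. It has three main parts:

    • request: sends the client's requests
    • response: handles the server's responses
    • parse: parses URLs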
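The parse part is the easiest to try without a network connection. The following is a minimal sketch of splitting a URL into its components; the helper name describe_url and the example URL are illustrative assumptions, not part of the original post.

```python
from urllib.parse import urlparse, parse_qs


def describe_url(url):
    """Split a URL into the pieces that urllib.parse exposes."""
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,          # e.g. "http"
        "host": parts.netloc,            # host name (and port, if any)
        "path": parts.path,              # path on the server
        "query": parse_qs(parts.query),  # query string as a dict of lists
    }


# Hypothetical URL, used only for illustration
info = describe_url("http://python.org/search?q=urllib&page=2")
print(info)
```

Note that parse_qs maps each key to a list of values, because a query string may repeat the same key.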
  • Below are some examples of fetching resources with urllib in Python 3.

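    • The simplest way to fetch the contents of a URL.

      import urllib.request
      # urlopen returns a response object; read() returns the body as bytes
      response = urllib.request.urlopen('http://python.org/')
      html = response.read()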
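    • Using a Request object and reading the server's response.

      import urllib.request
      # Wrap the URL in a Request so headers or data can be attached later
      req = urllib.request.Request('http://python.org/')
      response = urllib.request.urlopen(req)
      the_page = response.read()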
    • Sending data

      import urllib.parse
      import urllib.request
      url = ''  # the target URL is missing from the original post
      values = {
          'act': 'login',
          'login[email]': '',
          'login[password]': ''
      }
      # urlencode returns a str; in Python 3 the request body must be bytes
      data = urllib.parse.urlencode(values).encode('utf-8')
      req = urllib.request.Request(url, data)  # passing data makes this a POST
      req.add_header('Referer', 'http://www.python.org/')
      response = urllib.request.urlopen(req)
      the_page = response.read()
      print(the_page.decode("utf8"))
    • Sending data and headers

      import urllib.parse
      import urllib.request
      url = ''  # the target URL is missing from the original post
      user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
      values = {
          'act': 'login',
          'login[email]': '',
          'login[password]': ''
      }
      headers = {'User-Agent': user_agent}
      # urlencode returns a str; in Python 3 the request body must be bytes
      data = urllib.parse.urlencode(values).encode('utf-8')
      req = urllib.request.Request(url, data, headers)
      response = urllib.request.urlopen(req)
      the_page = response.read()
      print(the_page.decode("utf8"))
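The two "sending data" examples can be checked without contacting any server by building the Request and inspecting it. This is a minimal sketch under stated assumptions: the helper build_post_request, the URL http://example.com/login, the e-mail address, and the User-Agent string are all hypothetical placeholders, not values from the original post.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields, headers=None):
    """Build (but do not send) a form-encoded POST request."""
    # urlencode produces a str; Request needs the body as bytes
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers or {})


# Hypothetical values, used only for illustration
req = build_post_request(
    "http://example.com/login",
    {"act": "login", "login[email]": "user@example.com"},
    {"User-Agent": "demo-client/0.1"},
)

print(req.get_method())  # a Request that carries data defaults to POST
print(req.data)          # the percent-encoded form body, as bytes
```

Inspecting req.data before calling urlopen is a cheap way to confirm that characters such as the brackets in 'login[email]' were percent-encoded as expected.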
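    • Handling HTTP errors

      import urllib.error
      import urllib.request
      req = urllib.request.Request(' ')  # the URL is missing from the original post
      try:
          urllib.request.urlopen(req)
      except urllib.error.HTTPError as e:
          print(e.code)  # HTTP status code, e.g. 404
          print(e.read().decode("utf8"))  # body of the error page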
    • Exception handling 1

      from urllib.request import Request, urlopen
      from urllib.error import URLError, HTTPError
      req = Request("http://www..net /")  # URL is garbled in the original post
      try:
          response = urlopen(req)
      except HTTPError as e:
          # HTTPError is a subclass of URLError, so it must be caught first
          print("The server couldn't fulfill the request.")
          print('Error code: ', e.code)
      except URLError as e:
          print('We failed to reach a server.')
          print('Reason: ', e.reason)
      else:
          print("good!")
          print(response.read().decode("utf8"))
    • Exception handling 2

      from urllib.request import Request, urlopen
      from urllib.error import URLError
      req = Request("http://www.Python.org/")
      try:
          response = urlopen(req)
      except URLError as e:
          # HTTPError also has a reason attribute, so check for code first
          if hasattr(e, 'code'):
              print("The server couldn't fulfill the request.")
              print('Error code: ', e.code)
          elif hasattr(e, 'reason'):
              print('We failed to reach a server.')
              print('Reason: ', e.reason)
      else:
          print("good!")
          print(response.read().decode("utf8"))
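Both exception-handling patterns depend on the fact that HTTPError is a subclass of URLError. The sketch below makes that relationship explicit without touching the network by constructing the exceptions by hand; the helper describe_failure and the example URL and messages are illustrative assumptions.

```python
from urllib.error import HTTPError, URLError


def describe_failure(exc):
    """Turn a urllib exception into a short human-readable message."""
    # HTTPError is a subclass of URLError, so test for it first
    if isinstance(exc, HTTPError):
        return "server returned HTTP %d" % exc.code
    if isinstance(exc, URLError):
        return "could not reach server: %s" % exc.reason
    raise TypeError("not a urllib error: %r" % (exc,))


# Exceptions built by hand for illustration; urlopen would normally raise them
http_failure = HTTPError("http://example.com/missing", 404, "Not Found", None, None)
network_failure = URLError("connection refused")

print(describe_failure(http_failure))
print(describe_failure(network_failure))
```

Testing for the more specific class first serves the same purpose as ordering the except clauses in "Exception handling 1".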