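Python Test Tool Development Interview Guide 3: Web Scraping

Print a website's response headers with requests

Print the response headers returned by 'https://china-testing.github.io/'.

  • Reference answer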
In [1]: import requests

In [2]: url = 'https://china-testing.github.io/'

In [3]: response = requests.get(url)

In [4]: response.request.headers
Out[4]: {'User-Agent': 'python-requests/2.18.4', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}

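requests is an essential library for HTTP access. Commonly used attributes include response.status_code and response.text. Note that response.request.headers, used above, holds the headers that requests sent; the headers returned by the server are available as response.headers.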
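The attributes mentioned here can be tried together in a short script. The following is a minimal sketch, not the author's code: the helper names `format_headers` and `show_response`, the 10-second timeout, and the 200-character preview are all assumptions made for illustration.

```python
def format_headers(headers):
    """Return one 'Name: value' line per header, sorted by header name."""
    return [f"{name}: {value}" for name, value in sorted(headers.items())]


def show_response(url, timeout=10):
    """Fetch url, then print its status code, response headers, and body preview."""
    import requests  # third-party; imported here so format_headers has no dependency

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of continuing silently
    print(response.status_code)
    # response.headers holds what the server returned,
    # unlike response.request.headers, which holds what was sent.
    for line in format_headers(response.headers):
        print(line)
    print(response.text[:200])  # first 200 characters of the decoded body
    return response
```

Calling `show_response('https://china-testing.github.io/')` should print the server's headers; the exact values depend on the server at the time of the request.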

Further reading: python工具庫介紹-requests:人性化的HTTP (Introduction to Python tool libraries: requests, HTTP for Humans)

Scrape blog titles with Requests and BeautifulSoup

Scrape the blog post titles from the home page of https://china-testing.github.io/ (10 in total).

  • Reference answer
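# -*- coding: utf-8 -*-
# Discussion: DingTalk free group 21745728; QQ groups 144081101, 567351477
# CreateDate: 2018-10-16

import requests
from bs4 import BeautifulSoup


def get_upcoming_events(url):
    """Print the title of each blog post found on the page at url."""
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    # Each post on the home page is wrapped in an <article> element.
    events = soup.find_all('article')

    for event in events:
        event_details = {}
        # The post title is the link text inside the article's <h1>.
        event_details['name'] = event.find('h1').find('a').text
        print(event_details)


get_upcoming_events('https://china-testing.github.io/')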

Output:


$ python3 blogs.py
{'name': '接口自動化性能測試線上培訓大綱'}
{'name': '2018最佳人工智能圖像處理工具OpenCV書籍下載'}
{'name': 'IBM開發社區python精品文章彙總'}
{'name': 'python工具庫介紹-requests:人性化的HTTP'}
{'name': '中草藥的故事-金銀花(標準中藥)- 清熱解毒,疏散風熱'}
{'name': '中草藥的故事-合歡花(標準中藥)'}
{'name': '中草藥的故事-吳茱萸(標準中藥)'}
{'name': '[雪峯磁針石博客]python3快速入門教程9重要的標準庫-高級篇'}
{'name': '[雪峯磁針石博客]python3快速入門教程11命令行自動化工具與pexpect'}
{'name': '[雪峯磁針石博客]python3快速入門教程9重要的標準庫-基礎篇'}

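html.parser is the parser built into Python's standard library. It is slower than lxml on large pages, which is why lxml is used here. (When no parser is named, BeautifulSoup picks the best one that is installed.) The html5lib parser behaves most like a web browser.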
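The parser choice can also be made at run time, so the script still works where lxml is not installed. The sketch below is an illustration only: `pick_parser` and `extract_titles` are hypothetical helper names, and the `article h1 a` selector assumes the same page structure as the answer above.

```python
import importlib.util


def pick_parser(preferred=("lxml", "html5lib")):
    """Return the first installed parser in preferred, else the built-in html.parser."""
    for name in preferred:
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"


def extract_titles(html, parser=None):
    """Return the link text of every <article><h1><a> found in html."""
    from bs4 import BeautifulSoup  # third-party; deferred so pick_parser works without it

    soup = BeautifulSoup(html, parser or pick_parser())
    return [link.get_text(strip=True) for link in soup.select("article h1 a")]
```

Passing `response.text` from requests to `extract_titles` should return the titles as a list instead of printing them, which makes the result easier to check in a test.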

Latest code:

https://github.com/china-testing/python-api-tesing/blob/master/python-automation-cook/ch3/blogs.py

Fill in the form at 'https://httpbin.org/forms/post' with selenium

Use selenium to open 'https://httpbin.org/forms/post' and fill in the form.


  • Reference answer
# Discussion: DingTalk free group 21745728; QQ groups 144081101, 567351477
# CreateDate: 2018-10-16

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# Selenium 4 locator style; the find_element_by_* helpers were removed in Selenium 4.3.
browser = webdriver.Chrome()
browser.get('https://httpbin.org/forms/post')

# Fill in the customer name field.
custname = browser.find_element(By.NAME, "custname")
custname.clear()
custname.send_keys("python测试开发")

time.sleep(2)  # pause so each step is visible
# Select the "medium" radio button.
for size_element in browser.find_elements(By.NAME, "size"):
    if size_element.get_attribute('value') == 'medium':
        size_element.click()

time.sleep(2)
# Tick the bacon and cheese checkboxes.
for topping in browser.find_elements(By.NAME, 'topping'):
    if topping.get_attribute('value') in ['bacon', 'cheese']:
        topping.click()

time.sleep(2)
browser.find_element(By.TAG_NAME, 'form').submit()

Output:

{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "comments": "", 
    "custemail": "", 
    "custname": "python\u6d4b\u8bd5\u5f00\u53d1", 
    "custtel": "", 
    "delivery": "", 
    "size": "medium", 
    "topping": [
      "bacon", 
      "cheese"
    ]
  }, 
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8", 
    "Accept-Encoding": "gzip, deflate, br", 
    "Accept-Language": "zh-CN,zh;q=0.9", 
    "Cache-Control": "max-age=0", 
    "Connection": "close", 
    "Content-Length": "132", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "Origin": "https://httpbin.org", 
    "Referer": "https://httpbin.org/forms/post", 
    "Upgrade-Insecure-Requests": "1", 
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36"
  }, 
  "json": null, 
  "origin": "183.62.236.90", 
  "url": "https://httpbin.org/post"
}
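The echoed request shows what the browser actually sent: a body of type application/x-www-form-urlencoded in which the topping field appears once per ticked checkbox. As a rough cross-check, the same body can be rebuilt with the standard library. This is a sketch under assumptions: the field values are copied from the "form" section above, and `encode_form` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Field values copied from the "form" section of the echoed request.
form_fields = {
    "custname": "python测试开发",
    "custtel": "",
    "custemail": "",
    "size": "medium",
    "topping": ["bacon", "cheese"],  # checkboxes repeat the same field name
    "delivery": "",
    "comments": "",
}


def encode_form(fields):
    """Encode fields as application/x-www-form-urlencoded, repeating list values."""
    return urlencode(fields, doseq=True)


body = encode_form(form_fields)
print(body)
print(len(body))  # → 132
```

The length agrees with the Content-Length of 132 reported in the headers. Passing the same dictionary as `data=` to `requests.post` should produce an equivalent request without a browser, although that variant is not part of the original answer.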