Scraping data by simulating a browser with Selenium: keyword search, scroll-down demo, and Selenium wait mechanisms (Chrome)

Original

Dave_L

2020-06-28 04:49

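0. Installing Selenium and ChromeDriver

Install Selenium:
pip install selenium
Install ChromeDriver:

Download: http://chromedriver.storage.googleapis.com/index.html
- The ChromeDriver version must match your Chrome version (open chrome://version to check it).
Add the folder containing chromedriver.exe to your user PATH environment variable.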
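
A quick way to check the two installation steps is to look for chromedriver on PATH and compare its major version with Chrome's. The sketch below is only a rough check: the helper names `major_version` and `installed_chromedriver_version` are invented here, the Chrome version string is an example value, and it assumes that comparing major versions is good enough for a first sanity test.

```python
import re
import shutil
import subprocess


def major_version(version_text):
    """Return the major number of the first dotted four-part version found, or None."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    return int(match.group(1)) if match else None


def installed_chromedriver_version():
    """Return the output of `chromedriver --version`, or None if it is not on PATH."""
    path = shutil.which("chromedriver")
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    chrome_version = "83.0.4103.116"  # example value; copy yours from chrome://version
    driver_output = installed_chromedriver_version()
    if driver_output is None:
        print("chromedriver was not found on PATH")
    elif major_version(driver_output) == major_version(chrome_version):
        print("major versions match:", driver_output)
    else:
        print("version mismatch:", driver_output, "vs Chrome", chrome_version)
```

If the script reports that chromedriver was not found, the PATH step above has not taken effect yet (a new terminal session is often needed after editing environment variables).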

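1. Demo: driving the browser with Selenium

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# 0. Create the browser object
browser = webdriver.Chrome()

# 1. Open the page
browser.get('http://www.baidu.com')

print(browser.title)    # 百度一下，你就知道

# print(type(browser.find_element(By.CLASS_NAME, "s-top-left"))) # <class 'selenium.webdriver.remote.webelement.WebElement'>

# 2. Locate page elements, either with Selenium's own API or by fetching the source via browser.page_source and parsing it with a tool such as BeautifulSoup

# The find_element_by_* helpers were removed in Selenium 4.3; find_element(By.LINK_TEXT, ...) is the equivalent call.
# The text must match the link text on the page exactly.
browser.find_element(By.LINK_TEXT, "新聞").click() # simulate a mouse click on the link whose text is "新聞" (News)

print(browser.current_url)  # https://www.baidu.com/

print(browser.page_source)  # get the page source

time.sleep(5)
browser.quit()  # close the browser

Result: Chrome opens, the News link is clicked, and after a 5-second pause the browser closes.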
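
The demo mentions handing browser.page_source to a parser instead of using Selenium's locators. As a minimal sketch of that route, the example below collects every link from an HTML string. Note the substitution: it uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra packages. The `LinkCollector` class, the `extract_links` helper, and the sample markup are all invented for illustration; in a real run the string would come from browser.page_source.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a href=...> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(page_source):
    """Parse an HTML string and return its links as (text, href) tuples."""
    parser = LinkCollector()
    parser.feed(page_source)
    return parser.links


# Sample markup standing in for browser.page_source (invented for illustration).
sample = '<div><a href="http://news.baidu.com">新聞</a> <a href="/map">地圖</a></div>'
print(extract_links(sample))
```

Parsing the source offline like this is convenient when many elements are needed at once, while Selenium's own locators are the better fit when an element has to be clicked or typed into.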

2. Searching for a keyword in Douban's search box with Selenium

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

browser = webdriver.Chrome()
browser.get('http://douban.com')
time.sleep(1)

search_box = browser.find_element(By.NAME, 'q')	# search box
search_box.send_keys('Python')

bt1 = browser.find_element(By.CLASS_NAME, 'bn')	# submit button
bt1.click()

time.sleep(5)
browser.quit()

3. Scrolling a page down with Selenium

There are two main approaches:

Simulating keyboard input (for example, pressing PageDown)

from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.keys import Keys
import time

browser = webdriver.Chrome()
browser.get('http://news.baidu.com/')

for i in range(20):
    # Scroll down by simulating a key press; perform() runs every action queued in the ActionChains object
    # Chaining model: ActionChains(browser).send_keys(x).click(y).move_to_element(z).perform()
    # queues three actions and is equivalent to:
    # t = ActionChains(browser); t.send_keys(x); t.click(y); t.move_to_element(z); t.perform()
    ActionChains(browser).send_keys(Keys.PAGE_DOWN).perform()

    time.sleep(0.5)

browser.quit()

Executing JavaScript code

from selenium import webdriver
import time

browser = webdriver.Chrome()
browser.get('http://news.baidu.com/')

for i in range(10):
    # window.scrollTo(x, y) scrolls the content to the given coordinates
    browser.execute_script("window.scrollTo(0,document.body.scrollHeight)") # scroll to the bottom of the page
    time.sleep(0.5)

browser.quit()
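
A fixed number of scrolls can stop too early or keep going after the page has finished loading. One common variation, sketched here as an assumption rather than something the demo above does, is to keep scrolling until document.body.scrollHeight stops growing. The helper name `scroll_to_bottom` and its `pause` and `max_rounds` parameters are invented for this example; `browser` is any object with an execute_script() method, such as the webdriver.Chrome() instance above.

```python
import time


def scroll_to_bottom(browser, pause=0.5, max_rounds=30):
    """Scroll down until the page height stops growing.

    Returns the number of scrolls performed (at most max_rounds).
    """
    last_height = browser.execute_script("return document.body.scrollHeight")
    for round_number in range(1, max_rounds + 1):
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = browser.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return round_number  # nothing new was loaded, so this is the bottom
        last_height = new_height
    return max_rounds
```

Calling scroll_to_bottom(browser) in place of the for loop would stop as soon as a scroll adds no new content; the max_rounds cap keeps it from looping forever on pages that load without end.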

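4. Selenium wait mechanisms

When using Selenium, pay attention to waiting, so that the driven browser has some buffer time to find elements before a lookup fails.
There are two kinds: implicit waits and explicit waits.

An implicit wait is set directly on the browser driver object with implicitly_wait(x). It is simple to write but inflexible, because the same timeout applies to every element lookup.

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()

browser.implicitly_wait(10)     # x = 10 here: if find_element() below does not succeed immediately, keep polling for up to 10 seconds
browser.get('a url')
browser.find_element(By.ID, 'id_name')

An explicit wait combines WebDriverWait with an expected condition. It waits for one specific condition to become true, which makes it more flexible: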

from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('a url')

WebDriverWait(browser,10,0.5).until(EC.presence_of_element_located((By.LINK_TEXT, 'text'))).click()

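Here,
WebDriverWait(driver=browser, timeout=10, poll_frequency=0.5, ignored_exceptions=None):

driver: the browser driver
timeout: the maximum time to wait, in seconds
poll_frequency: the polling interval, that is, how often the condition is checked; the default is 0.5 seconds
ignored_exceptions: exception classes to ignore while polling; by default only NoSuchElementException is ignored. If the condition is still not met when timeout expires, until() raises TimeoutException.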
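
To make the four parameters concrete, here is a simplified model of the polling loop that an explicit wait performs. It is not Selenium's source code, only a sketch of the behaviour described above. The function name `wait_until` is invented, the built-in LookupError stands in for NoSuchElementException, and the built-in TimeoutError stands in for Selenium's TimeoutException, so that the example runs without a browser.

```python
import time


def wait_until(condition, timeout=10, poll_frequency=0.5,
               ignored_exceptions=(LookupError,)):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = condition()
            if value:
                return value  # condition met: hand back whatever it produced
        except ignored_exceptions:
            pass  # treated as "not ready yet"; keep polling
        if time.monotonic() > deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(poll_frequency)


# Usage: a stand-in "element lookup" that only succeeds on the third attempt.
attempts = []


def find_element_stub():
    attempts.append(1)
    if len(attempts) < 3:
        raise LookupError("element not present yet")
    return "element"


print(wait_until(find_element_stub, timeout=2, poll_frequency=0.01))
```

The model shows why ignored_exceptions matters: a lookup that fails early is simply retried on the next poll, and only running out of time turns the failure into an error.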
