Using headless google-chrome with selenium on CentOS 7

Part 1: Install chrome

1. Configure the yum repository

cd /etc/yum.repos.d/
vi google-chrome.repo

Write the following into the file:

[google-chrome]
name=google-chrome
baseurl=http://dl.google.com/linux/chrome/rpm/stable/$basearch
enabled=1
gpgcheck=1
gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub
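
A .repo file is INI-style, so it can be checked before running yum. The following is only a minimal sketch using Python 3's configparser; the read_repo helper is a hypothetical name introduced here, and the text is parsed from a string rather than from /etc/yum.repos.d/google-chrome.repo so that it runs without root.

```python
import configparser

# The stanza from the step above, kept as a string for the sketch.
REPO_TEXT = """\
[google-chrome]
name=google-chrome
baseurl=http://dl.google.com/linux/chrome/rpm/stable/$basearch
enabled=1
gpgcheck=1
gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub
"""


def read_repo(text):
    """Parse yum .repo text and return {section: {key: value}}."""
    # interpolation=None keeps values such as $basearch untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


repo = read_repo(REPO_TEXT)["google-chrome"]
print(repo["baseurl"])
print("enabled:", repo["enabled"], "gpgcheck:", repo["gpgcheck"])
```

A typo in a key name or a missing section header shows up here as a KeyError or a parsing error instead of a confusing yum failure.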

2. Install

yum -y install google-chrome-stable --nogpgcheck

3. Verify

[root@localhost yum.repos.d]# google-chrome -version
Google Chrome 64.0.3282.140
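
The version printed here is what decides which chromedriver to download. As a rough sketch, the string can be parsed and compared against a chromedriver version. parse_version and same_major are hypothetical helpers, and comparing only the major number is an assumption that holds for chromedriver releases that share Chrome's version numbering.

```python
import re


def parse_version(text):
    """Extract a dotted version such as 64.0.3282.140 as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def same_major(chrome_output, driver_version):
    """Rough compatibility check: compare only the major version numbers."""
    return parse_version(chrome_output)[0] == parse_version(driver_version)[0]


print(parse_version("Google Chrome 64.0.3282.140"))
print(same_major("Google Chrome 64.0.3282.140", "80.0.3987.106"))
```

With the two version strings that appear in this article the check fails (64 against 80), so use the version your own machine prints when choosing the download.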

Part 2: Set up the automation environment

1. Download chromedriver

http://npm.taobao.org/mirrors/chromedriver/
wget http://npm.taobao.org/mirrors/chromedriver/80.0.3987.106/chromedriver_linux64.zip

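Pick the chromedriver version that matches the installed google-chrome, then unzip the archive. The scripts below expect the binary at /opt/google/chromedriver/chromedriver.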
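
The download and unpack step could be scripted roughly as follows. This is a sketch under assumptions: driver_url and install_driver are hypothetical helpers, the URL pattern is inferred from the wget line above, and the archive is assumed to contain a single member named chromedriver.

```python
import os
import stat
import zipfile

# Mirror root taken from the wget command above.
MIRROR = "http://npm.taobao.org/mirrors/chromedriver"


def driver_url(version, platform="linux64"):
    """Build the download URL for a chromedriver version on the mirror."""
    return "%s/%s/chromedriver_%s.zip" % (MIRROR, version, platform)


def install_driver(zip_path, dest_dir):
    """Unpack chromedriver from zip_path into dest_dir and mark it executable."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extract("chromedriver", dest_dir)
    binary = os.path.join(dest_dir, "chromedriver")
    # zipfile does not restore permission bits, so set the execute bits here.
    mode = os.stat(binary).st_mode
    os.chmod(binary, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return binary


print(driver_url("80.0.3987.106"))
```

After fetching the zip with wget, install_driver("chromedriver_linux64.zip", "/opt/google/chromedriver") would place the binary where the example scripts look for it.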

Part 3: Install selenium (do this yourself)

pip install selenium

Part 4, Example 1:

Baidu search, browser-baidu.py

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

# Headless Chrome options for a server without a display
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')

# Selenium 3 signature; Selenium 4 takes service=Service(path) instead of executable_path
browser = webdriver.Chrome(executable_path='/opt/google/chromedriver/chromedriver', options=chrome_options)

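try:
    browser.get('https://www.baidu.com')
    # Type the query into the search box (id "kw") and submit it
    search_box = browser.find_element(By.ID, 'kw')
    search_box.send_keys('Python')
    search_box.send_keys(Keys.ENTER)
    # Wait up to 10 seconds for the results container to appear
    wait = WebDriverWait(browser, 10)
    wait.until(EC.presence_of_element_located((By.ID, 'content_left')))
    print(browser.current_url)
    print(browser.get_cookies())
    print(browser.page_source)
finally:
    # quit() ends the session and stops the chromedriver process
    browser.quit()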

Run it with: python browser-baidu.py
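
For reference, here is a short sketch of why each of the four Chrome flags is commonly passed, together with one way the driver could be built on Selenium 4, where the driver path goes through a Service object. HEADLESS_FLAGS and make_browser are names introduced for illustration, and the selenium import is kept inside the function so the flag table can be printed even where selenium is not installed.

```python
# Flags used in the example, with the reason each one is commonly passed.
HEADLESS_FLAGS = [
    ("--headless", "run without a display, as on a server with no X session"),
    ("--no-sandbox", "disable the sandbox; Chrome refuses to start as root otherwise"),
    ("--disable-gpu", "skip GPU acceleration, which a server usually lacks"),
    ("--disable-dev-shm-usage", "use /tmp instead of /dev/shm for shared memory"),
]


def make_browser(driver_path="/opt/google/chromedriver/chromedriver"):
    """Build a headless Chrome driver with the Selenium 4 Service API (assumed installed)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for flag, _reason in HEADLESS_FLAGS:
        options.add_argument(flag)
    return webdriver.Chrome(service=Service(driver_path), options=options)


if __name__ == "__main__":
    for flag, reason in HEADLESS_FLAGS:
        print("%-26s %s" % (flag, reason))
```

On Selenium 4, browser = make_browser() would replace the webdriver.Chrome(...) line of the example; the rest of the script stays the same.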


Part 4, Example 2:

JD product page, browser-jd.py

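import sys

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

# Python 2 only: make utf8 the default encoding so printing the page does not fail.
# Python 3 has no reload() builtin and already defaults to utf8, so skip it there.
if sys.version_info[0] == 2:
    reload(sys)
    sys.setdefaultencoding('utf8')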


# Headless Chrome options for a server without a display
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')

# Selenium 3 signature; Selenium 4 takes service=Service(path) instead of executable_path
browser = webdriver.Chrome(executable_path='/opt/google/chromedriver/chromedriver', options=chrome_options)

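try:
    browser.get('https://item.jd.com/100006188596.html')
    # Created for explicit waits; this script does not call wait.until() yet
    wait = WebDriverWait(browser, 30)
    print(browser.current_url)
    print(browser.page_source)
    # Alternative: read the DOM as currently rendered in the browser
    #print(browser.execute_script("return document.documentElement.outerHTML"))
finally:
    # quit() ends the session and stops the chromedriver process
    browser.quit()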

Run it with: python browser-jd.py. The output ends with:

</div><script src="//gias.jd.com/js/td.js"></script><script src="https://gia.jd.com/y.html?v=0.14113323635067587&amp;o=item.jd.com/100006188596.html"></script><div id="userdata_el" style="visibility: hidden; position: absolute;"></div></body></html>

The page source includes the dynamically loaded content, for example the price: 169.00
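
Because the price is filled in by JavaScript, it may not be in the page source at the moment get() returns. One possible approach is to pass WebDriverWait.until a condition that checks the page source. The sketch below assumes a hypothetical page_contains helper, and FakeDriver is only a stand-in so the condition can be tried without Chrome.

```python
def page_contains(text):
    """Return a condition for WebDriverWait.until: true once text is in the page source."""
    def condition(driver):
        return text in driver.page_source
    return condition


class FakeDriver(object):
    """Stand-in with a page_source attribute, used only to exercise the condition."""

    def __init__(self, page_source):
        self.page_source = page_source


print(page_contains("169.00")(FakeDriver("<span>169.00</span>")))
print(page_contains("169.00")(FakeDriver("<span></span>")))
```

In browser-jd.py this would be called as wait.until(page_contains('169.00')) before printing the page source; until() raises TimeoutException if the text has not appeared within the 30 seconds given to WebDriverWait. The price is hard-coded here only for illustration, since a real product price changes over time.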
