Selenium Notes (Can Be Used as a Brief Tutorial)

Abstract

See my Yuque page, and remember to follow me: https://www.yuque.com/jhongtao/zr9a1x/msb6ct

Selenium with Python, Chinese translation of the documentation: https://selenium-python-zh.readthedocs.io/en/latest/

Selenium tutorial on testclass.net: http://www.testclass.net/selenium_python/

Example documents on GitHub: https://github.com/xuyichenmo/selenium-document

And of course my own hands-on Selenium project: https://www.yuque.com/jhongtao/zr9a1x/fo15yt

Creating the browser object driver

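from selenium.webdriver import Firefox  # import the Firefox driver class from selenium
from selenium.webdriver.firefox.options import Options  # import the Firefox Options class

# geckodriver.exe is in E:\ProgramFiles\firefox\geckodriver on my machine; change executable_path to match your setup
executable_path = "E:/ProgramFiles/firefox/geckodriver/geckodriver.exe"
options = Options()  # create the Options object
options.add_argument('-headless')  # run the browser without a UI, which can speed up crawling
# Selenium 3 style call; newer Selenium 4 releases pass the driver path through a Service object instead
driver = Firefox(executable_path=executable_path, options=options)  # create the driver browser object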

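Common methods and attributes of the driver object

driver.get(url): load a web page

driver.get(url)  # url is the address of the page to crawl

driver.page_source: get the page source

print(driver.page_source)  # print the page source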

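Getting the text inside a tag

driver.find_element_by_id("id_name").text  # get the text inside the tag, locating the element by its id

driver.title: get the page title

print(driver.title)  # print the page title

driver.save_screenshot("img_name.png"): save the current page as an image

driver.save_screenshot("img_name.png")  # save a screenshot of the current page as a PNG; img_name.png is the image file name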

button.click(): simulate a mouse click

button = driver.find_element_by_id("id_name")  # get the button whose id is id_name
button.click()  # simulate a mouse click

input.send_keys("keyword"): type text into an input box

input = driver.find_element_by_id("id_name")  # get the page element whose id is id_name
keyword = "百度"  # the text to type
input.send_keys(keyword)  # type keyword into the input box

element.send_keys(Keys.XXX, 'letter'): simulate keyboard operations

# first import the Keys class
from selenium.webdriver.common.keys import Keys

# simulate Ctrl+A to select all the text in the input box
input = driver.find_element_by_id("id_name")  # get the page element whose id is id_name
input.send_keys(Keys.CONTROL, 'a')  # Ctrl+A

# simulate Ctrl+X to cut
input.send_keys(Keys.CONTROL, 'x')  # Ctrl+X

# simulate typing text and then pressing Enter
keyword = "百度"  # the text to type
input.send_keys(keyword)  # type keyword into the input box
input.send_keys(Keys.RETURN)  # Keys.RETURN simulates the Enter key

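input.clear(): clear the input box

input = driver.find_element_by_id("id_name")  # get the page element whose id is id_name
input.clear()  # clear the contents of the input box

driver.get_cookies(): get the cookies of the current page

print(driver.get_cookies())  # print the cookies of the current page

driver.current_url: get the URL of the current page

print(driver.current_url)  # print the URL of the current page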

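driver.close(): close the current page

# Close the current page. Once you have what you need from a page, close it to reduce memory use.
# If the browser has only one page open, this closes the browser too; still call driver.quit() afterwards so the driver session itself is shut down.
driver.close()

driver.quit(): close the browser

driver.quit()  # remember to close the browser once you are done with it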
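
Since a forgotten driver.quit() leaves browser processes running, the cleanup can be wrapped in a context manager. The following is only a minimal sketch and not part of Selenium: managed_driver and factory are hypothetical names, and factory is assumed to be any zero-argument callable that returns a driver object.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always call quit() on exit."""
    driver = factory()  # hypothetical factory, e.g. lambda: Firefox(options=options)
    try:
        yield driver
    finally:
        driver.quit()  # runs even if the body of the with block raised
```

With a real browser this might be used as: with managed_driver(lambda: Firefox(options=options)) as driver: driver.get(url).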

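Details of the driver object's methods and attributes

Locating elements

Two ways to locate an element

1. Directly by an attribute of the element, using the find_element_by_* methods

2. Through By: this requires from selenium.webdriver.common.by import By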

# locate an element by id
# HTML: <div id="id_name"></div>
div = driver.find_element_by_id("id_name")
# the By way (the find_element_by_* shortcuts were removed in Selenium 4.3; the By form works in Selenium 3 and 4)
from selenium.webdriver.common.by import By
div = driver.find_element(by=By.ID, value="id_name")  # form 1
div = driver.find_element(By.ID, "id_name")  # form 2

# locate an element by its name attribute
# HTML: <input name="name" type="text"/>
input = driver.find_element_by_name("name")

# locate an element by tag name
# HTML: <iframe src="#"></iframe>
iframe = driver.find_element_by_tag_name("iframe")

# locate elements by XPath
# HTML: <input type="text" name="example"/>
# HTML: <INPUT type="text" name="other"/>
inputs = driver.find_elements_by_xpath("//input")  # find_elements_* returns a list of all matches

# locate an element by its link text
# HTML: <a href="#">百度</a>
baidu = driver.find_element_by_link_text("百度")

# locate an element by part of its link text
# HTML: <a href="#">baidu google sogou</a>
google = driver.find_element_by_partial_link_text("google")

# locate an element by CSS selector
# this works like a selector in a CSS stylesheet
# HTML: <div id="food">
# HTML:     <span class="dairy">milk</span>
# HTML:     <span class="dairy aged">cheese</span>
# HTML: </div>
cheese = driver.find_element_by_css_selector("#food span.dairy.aged")
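
The two forms above describe the same lookups, so each find_element_by_* name corresponds to one By strategy. The sketch below makes that mapping explicit. It is an illustration under stated assumptions: LEGACY_TO_BY and find_legacy are hypothetical names, the strings are the values of the By constants (By.ID is "id", By.CSS_SELECTOR is "css selector", and so on), and driver is assumed to be any object with a find_element(by, value) method.

```python
# String values of the selenium.webdriver.common.by.By constants.
LEGACY_TO_BY = {
    "find_element_by_id": "id",
    "find_element_by_name": "name",
    "find_element_by_tag_name": "tag name",
    "find_element_by_xpath": "xpath",
    "find_element_by_link_text": "link text",
    "find_element_by_partial_link_text": "partial link text",
    "find_element_by_css_selector": "css selector",
}


def find_legacy(driver, legacy_name, value):
    """Call driver.find_element(by, value) for an old find_element_by_* name."""
    by = LEGACY_TO_BY[legacy_name]  # raises KeyError for an unknown method name
    return driver.find_element(by, value)
```

A helper like this can ease moving old scripts to the By form without rewriting every call at once.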

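Mouse operations with the ActionChains class

# important: import the ActionChains class before using it
from selenium.webdriver import ActionChains

# move the mouse to a given element
# 1. get the element
button = driver.find_element_by_id("id_name")  # get the button element whose id is id_name
# 2. move the mouse over the button
# arguments: ActionChains takes the driver object, move_to_element takes the element object
ActionChains(driver).move_to_element(button).perform()

# click at the element's position
# arguments: ActionChains takes the driver object; move_to_element and click both take the element to click
# click logic: get the element first, then move to it, and finally click on it
ActionChains(driver).move_to_element(button).click(button).perform()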

# click and hold at the element's position, similar to a click
ActionChains(driver).move_to_element(button).click_and_hold(button).perform()

# double-click at the element's position, similar to a click
ActionChains(driver).move_to_element(button).double_click(button).perform()

# right-click at the element's position, similar to a click
ActionChains(driver).move_to_element(button).context_click(button).perform()

# drag element A onto element B
e_a = driver.find_element_by_id("A")
e_b = driver.find_element_by_id("B")
ActionChains(driver).drag_and_drop(e_a, e_b).perform()

Filling in forms with the Select class

<select id = "status" >
	<option value = "0">北京</option>
	<option value = "1">上海</option>
	<option value = "2">深圳</option>
</select>

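# import the Select class
from selenium.webdriver.support.ui import Select

# working with a drop-down list
# 1. get the drop-down list element
select_element = driver.find_element_by_id("status")

# 2. create a Select object, passing the drop-down list element as the constructor argument
select = Select(select_element)

# choose an option in the drop-down list
select.select_by_index(0)  # select by index; indexes start at 0
select.select_by_value("1")  # select by the value attribute
select.select_by_visible_text("深圳")  # select by the visible text

# clear the selection (only valid on a multi-select list; a single-select list raises NotImplementedError)
select.deselect_all()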

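Handling pop-ups with switch_to_alert()

alert = driver.switch_to_alert()  # get the alert on the page; newer Selenium versions use the driver.switch_to.alert property instead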

Switching between browser windows

# Method 1:
driver.switch_to_window("window_name")  # window_name is the name of the window

# Method 2:
# use the window_handles property to get a handle for every open window
for handle in driver.window_handles:
    driver.switch_to_window(handle)
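
Looping over every handle, as in method 2, ends on whichever window happens to be last. A common variation is to switch only to a window other than the current one. The following is a minimal sketch of that idea: switch_to_other_window is a hypothetical helper name, and it assumes the driver exposes current_window_handle, window_handles and the newer driver.switch_to.window(handle) spelling of switch_to_window.

```python
def switch_to_other_window(driver):
    """Switch to the first window that is not the current one.

    Returns the handle that was switched to, or None if no other window is open.
    """
    current = driver.current_window_handle
    for handle in driver.window_handles:
        if handle != current:
            # driver.switch_to.window is the newer spelling of switch_to_window
            driver.switch_to.window(handle)
            return handle
    return None
```

Returning the handle lets the caller remember where it came from and switch back later.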

Going forward and back

# forward() goes one page forward
driver.forward()

# back() goes one page back
driver.back()

Working with page cookies

# get the page cookies
for cookie in driver.get_cookies():
    print(cookie['name'])

# delete cookies
# 1. delete a cookie by name
driver.delete_cookie("cookie_name")  # cookie_name is the name of the cookie

# 2. delete all cookies on the page
driver.delete_all_cookies()
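
driver.get_cookies() returns a list of dictionaries, each with at least a name and a value key. When crawling, it is often handy to turn that list into a single Cookie header string for another HTTP client. The sketch below shows one way to do it; cookies_to_header is a hypothetical helper, and the sample data is made up to mimic the shape of the real return value.

```python
def cookies_to_header(cookies):
    """Join cookie dicts into a single 'name=value; name=value' string."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


# Sample data shaped like driver.get_cookies() output (values are made up).
sample = [
    {"name": "session", "value": "abc123", "domain": "example.com"},
    {"name": "lang", "value": "zh", "domain": "example.com"},
]
print(cookies_to_header(sample))  # session=abc123; lang=zh
```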

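Waiting for pages

Implicit waits

driver.implicitly_wait(seconds) sets a driver-wide timeout: every element lookup keeps retrying for up to that many seconds before raising NoSuchElementException.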

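Explicit waits with the WebDriverWait class

An explicit wait names a condition and a maximum wait time; if the condition is still not met when that time runs out (for example, the element has not been found), a TimeoutException is raised.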

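WebDriverWait(driver, timeout, poll_frequency=0.5, ignored_exceptions=None)
# Parameters
# driver: the WebDriver browser driver instance
# timeout: the maximum wait time, in seconds
# poll_frequency: the interval between checks (the step), 0.5 s by default
# ignored_exceptions: exception classes to ignore while polling; by default only NoSuchElementException is ignored
# A WebDriverWait object is normally used together with until() or until_not():
#   the condition passed to these two methods usually comes from expected_conditions, which must be imported:
#       from selenium.webdriver.support import expected_conditions as EC
#   expected_conditions supplies the argument for until() or until_not()
#   until(EC.VALUE): calls the condition with the driver as its argument until the return value is no longer False
#   until_not(EC.VALUE): calls the condition with the driver as its argument until the return value is False
#   the available conditions are listed in the expected_conditions section below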
    
# Example
from selenium.webdriver import Firefox
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.by import By
# WebDriverWait handles the polling loop
from selenium.webdriver.support.ui import WebDriverWait
# expected_conditions provides the conditions that end the wait
from selenium.webdriver.support import expected_conditions as EC

executable_path = "E:/ProgramFiles/firefox/geckodriver/geckodriver.exe"
url = "https://www.baidu.com/"
options = Options()  # create the Options object
options.add_argument('-headless')  # run the browser without a UI, which can speed up crawling
# geckodriver.exe is in E:\ProgramFiles\firefox\geckodriver on my machine; change executable_path to match your setup
driver = Firefox(executable_path=executable_path, options=options)  # create the driver browser object
driver.get(url)
try:
    # look for the input box with id 'kw'; return only once it is found, and raise an exception if it is not found within 10 s
    input = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'kw')))
finally:
    driver.quit()
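
The polling behaviour described by the parameters above can be sketched in plain Python. This is a simplified illustration and not Selenium's actual source: wait_until is a hypothetical function, it raises the built-in TimeoutError as a stand-in for Selenium's TimeoutException, and condition is assumed to be any callable that takes the driver.

```python
import time


def wait_until(condition, driver, timeout, poll_frequency=0.5, ignored_exceptions=()):
    """Call condition(driver) repeatedly until it returns a truthy value.

    Raises the built-in TimeoutError (a stand-in for Selenium's
    TimeoutException) when timeout seconds pass without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = condition(driver)
            if value:
                return value  # e.g. the located element
        except ignored_exceptions:
            pass  # an ignored exception just means "not ready yet"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_frequency)
```

The sketch shows why until() hands back whatever the condition returned (an element, a list, or True) and why ignored exceptions only delay the wait instead of ending it.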

expected_conditions condition list

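class selenium.webdriver.support.expected_conditions.alert_is_present

Whether an alert is present.

class selenium.webdriver.support.expected_conditions.element_located_selection_state_to_be(locator, is_selected)

Locates an element and checks whether its selection state matches the expected one. locator is a locator, i.e. a (by, path) tuple; is_selected is a boolean.

class selenium.webdriver.support.expected_conditions.element_located_to_be_selected(locator)

Whether the located element is selected.

class selenium.webdriver.support.expected_conditions.element_selection_state_to_be(element, is_selected)

Checks whether the given element's selection state matches the expected one. element is a web element; is_selected is a boolean.

class selenium.webdriver.support.expected_conditions.element_to_be_clickable(locator)

Whether the element is visible and enabled, so that it can be clicked. locator is a locator.

class selenium.webdriver.support.expected_conditions.element_to_be_selected(element)

Checks whether the element is selected. element is a web element.

class selenium.webdriver.support.expected_conditions.frame_to_be_available_and_switch_to_it(locator)

Checks whether the frame can be switched into; if it can, switches into it and returns True, otherwise returns False. locator is a locator.

class selenium.webdriver.support.expected_conditions.invisibility_of_element_located(locator)

Checks whether an element is either invisible or not present in the DOM. locator is a locator.

class selenium.webdriver.support.expected_conditions.new_window_is_opened(current_handles)

Whether a new window has been opened. current_handles is the list of current window handles.

class selenium.webdriver.support.expected_conditions.number_of_windows_to_be(num_windows)

Whether the number of open windows equals the expected number.

class selenium.webdriver.support.expected_conditions.presence_of_all_elements_located(locator)

Checks whether at least one matching element is present on the page. locator is a locator.

class selenium.webdriver.support.expected_conditions.presence_of_element_located(locator)

Checks whether the element is present in the DOM; this does not mean it is visible. locator is a locator.

class selenium.webdriver.support.expected_conditions.staleness_of(element)

Waits until the element is no longer attached to the DOM, i.e. has been removed from it. Returns False while the element is still in the DOM, True otherwise. element is a web element.

class selenium.webdriver.support.expected_conditions.text_to_be_present_in_element(locator, text_)

Checks whether the element's text contains the given text. locator is a locator.

class selenium.webdriver.support.expected_conditions.text_to_be_present_in_element_value(locator, text_)

Checks whether the element's value attribute contains the given text. locator is a locator.

class selenium.webdriver.support.expected_conditions.title_contains(title)

Checks whether the current page title contains the expected string. Returns True if it does, otherwise False.

class selenium.webdriver.support.expected_conditions.title_is(title)

Checks whether the current page title exactly equals the expected string. Returns True if it does, otherwise False.

class selenium.webdriver.support.expected_conditions.visibility_of(element)

Checks whether the element is visible in the DOM. Visible means it is displayed and also has a width and height greater than 0. element is a web element.

class selenium.webdriver.support.expected_conditions.visibility_of_any_elements_located(locator)

Checks whether at least one visible matching element exists on the page. locator is a locator.

class selenium.webdriver.support.expected_conditions.visibility_of_element_located(locator)

Checks whether the element is visible in the DOM. Visible means it is displayed and also has a width and height greater than 0. locator is a locator.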
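
Whatever form a condition takes in a given Selenium version, it ends up as a callable that receives the driver, which is what until() and until_not() invoke. That means a custom condition can be written by hand. The sketch below is an illustration only: title_contains_sketch is a simplified stand-in for the real title_contains, any_of_sketch is a hypothetical combinator, and driver is assumed to be any object with a title attribute.

```python
def title_contains_sketch(fragment):
    """Return a condition that is True once driver.title contains fragment."""
    def _condition(driver):
        return fragment in driver.title
    return _condition


def any_of_sketch(*conditions):
    """Return a condition that yields the first truthy result of several."""
    def _condition(driver):
        for condition in conditions:
            result = condition(driver)
            if result:
                return result
        return False
    return _condition
```

A condition built this way could be passed to until() in the same position as an EC.* condition, for example WebDriverWait(driver, 10).until(title_contains_sketch("百度一下")).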

expected_conditions examples

#coding=utf-8
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

base_url = "http://www.baidu.com"
driver = webdriver.Firefox()
driver.implicitly_wait(5)
# when an implicit and an explicit wait are both set, the longer of the two generally determines the timeout; the Selenium docs advise against mixing them
locator = (By.ID, 'kw')
driver.get(base_url)

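# check the title exactly; returns a boolean
WebDriverWait(driver, 10).until(EC.title_is("百度一下，你就知道"))

# check that the title contains a string; returns a boolean
WebDriverWait(driver, 10).until(EC.title_contains("百度一下"))

# check whether an element has been added to the DOM tree (it is not necessarily visible); returns the WebElement once located
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'kw')))

# check whether an element has been added to the DOM and is visible; visible means it is displayed and its width and height are both greater than 0
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'su')))

# check whether an element is visible; returns the element if it is
WebDriverWait(driver, 10).until(EC.visibility_of(driver.find_element(by=By.ID, value='kw')))

# check whether at least one matching element is present in the DOM tree; returns a list once located
WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.mnav')))

# check whether at least one matching element is visible on the page; returns a list once located
WebDriverWait(driver, 10).until(EC.visibility_of_any_elements_located((By.CSS_SELECTOR, '.mnav')))

# check whether the given element's text contains the expected string; returns a boolean
WebDriverWait(driver, 10).until(EC.text_to_be_present_in_element((By.XPATH, "//*[@id='u1']/a[8]"), '设置'))

# check whether the given element's value attribute contains the expected string; returns a boolean
WebDriverWait(driver, 10).until(EC.text_to_be_present_in_element_value((By.CSS_SELECTOR, '#su'), '百度一下'))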

# check whether the frame can be switched into; if so, switch into it and return True, otherwise return False
# note: this page has no frame to switch into, so the call is commented out
#WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it(locator))

# check whether an element is absent from the DOM or invisible; returns False if it is visible, and the element if it is invisible
# note: #swfEveryCookieWrap is a hidden element on this page
WebDriverWait(driver, 10).until(EC.invisibility_of_element_located((By.CSS_SELECTOR, '#swfEveryCookieWrap')))

# check whether an element is visible and enabled, which means it can be clicked
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//*[@id='u1']/a[8]"))).click()
driver.find_element_by_xpath("//*[@id='wrapper']/div[6]/a[1]").click()
#WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//*[@id='wrapper']/div[6]/a[1]"))).click()

# wait for an element to be removed from the DOM tree
# I did not find a suitable example here, so the call is commented out
#WebDriverWait(driver, 10).until(EC.staleness_of(driver.find_element(By.ID, 'su')))

# check whether an element is selected; typically used with drop-down lists
WebDriverWait(driver, 10).until(EC.element_to_be_selected(driver.find_element(By.XPATH, "//*[@id='nr']/option[1]")))

# check whether an element's selection state matches the expected one (element form)
WebDriverWait(driver, 10).until(EC.element_selection_state_to_be(driver.find_element(By.XPATH, "//*[@id='nr']/option[1]"), True))

# check whether an element's selection state matches the expected one (locator form)
WebDriverWait(driver, 10).until(EC.element_located_selection_state_to_be((By.XPATH, "//*[@id='nr']/option[1]"), True))
driver.find_element_by_xpath(".//*[@id='gxszButton']/a[1]").click()

# check whether an alert is present on the page; if so, switch to it and return the alert object
instance = WebDriverWait(driver, 10).until(EC.alert_is_present())
print(instance.text)
instance.accept()

driver.quit()  # quit() closes the browser and ends the driver session
