Selenium Quick Reference (Python)
Published: 2019-06-29


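1. Installation and configuration

pip install selenium

Selenium is used here mainly to load dynamically rendered pages for scraping, so PhantomJS usually comes along with it.

To set up PhantomJS on macOS, check which directories are on your PATH and symlink the binary into one of them:

echo $PATH

ln -s <path to phantomjs> <any directory on PATH>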
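To confirm that the symlink took effect, you can ask Python whether the binary is reachable through PATH. This is a minimal sketch using only the standard library. `find_driver` is a hypothetical helper name, and the executable names `phantomjs` and `chromedriver` are assumptions about how the binaries are named on your machine.

```python
import shutil


def find_driver(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    # Executable names are assumptions; adjust them to match your install.
    for name in ("phantomjs", "chromedriver"):
        path = find_driver(name)
        print(name, "->", path if path else "not found on PATH")
```

If a driver is reported as not found, the symlink target is probably not in one of the directories printed by echo $PATH.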

ChromeDriver is configured the same way. Download address:

https://sites.google.com/a/chromium.org/chrom

from selenium import webdriver

2. Code example

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait  # available since 2.4.0
from selenium.webdriver.support import expected_conditions as EC  # available since 2.26.0

# Create a new instance of the Firefox driver
driver = webdriver.Firefox()

# Go to the Google home page
driver.get("http://www.google.com")

# The page is ajaxy, so the title is originally this:
print(driver.title)

# Find the element whose name attribute is q (the Google search box)
inputElement = driver.find_element(By.NAME, "q")

# Type in the search
inputElement.send_keys("cheese!")

# Submit the form (although Google automatically searches now without submitting)
inputElement.submit()

try:
    # We have to wait for the page to refresh; the last thing that seems to be updated is the title
    WebDriverWait(driver, 10).until(EC.title_contains("cheese!"))

    # You should see "cheese! - Google Search"
    print(driver.title)
finally:
    driver.quit()
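The key line in the sample is WebDriverWait(driver, 10).until(...): it calls the condition over and over until it returns something truthy, and gives up with a timeout error after 10 seconds. The sketch below illustrates that polling idea in plain Python so it can be run without a browser. It is not Selenium's implementation; `wait_until` and its parameters are hypothetical names, and the 0.5 second default only mirrors Selenium's default poll frequency.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


if __name__ == "__main__":
    start = time.monotonic()
    # The condition becomes truthy roughly 0.3 s after start.
    print(wait_until(lambda: time.monotonic() - start > 0.3,
                     timeout=2, poll_interval=0.1))
```

Polling like this is usually preferable to a fixed time.sleep(), because the script continues as soon as the page is ready instead of always waiting the full interval.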

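3. API quick reference

3.1 Locating elements

Each lookup below is shown twice: first with the legacy find_element_by_* helper, then with the By-based form. The legacy helpers were removed in Selenium 4.3, so prefer the By form on current releases.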

3.1.1 Find by id

element = driver.find_element_by_id("coolestWidgetEvah")

or

from selenium.webdriver.common.by import By
element = driver.find_element(by=By.ID, value="coolestWidgetEvah")

3.1.2 Find by class name

cheeses = driver.find_elements_by_class_name("cheese")

or

from selenium.webdriver.common.by import By
cheeses = driver.find_elements(By.CLASS_NAME, "cheese")

3.1.3 Find by tag name

target_div = driver.find_element_by_tag_name("div")

or

from selenium.webdriver.common.by import By
target_div = driver.find_element(By.TAG_NAME, "div")

3.1.4 Find by name attribute

btn = driver.find_element_by_name("input_btn")

or

from selenium.webdriver.common.by import By
btn = driver.find_element(By.NAME, "input_btn")

3.1.5 Find by link text

# "下一页" is the visible text of the link ("next page")
next_page = driver.find_element_by_link_text("下一页")

or

from selenium.webdriver.common.by import By
next_page = driver.find_element(By.LINK_TEXT, "下一页")

3.1.6 Find by partial link text

# Matches a link whose full text is, for example, "去下一页" ("go to next page")
next_page = driver.find_element_by_partial_link_text("下一页")

or

from selenium.webdriver.common.by import By
next_page = driver.find_element(By.PARTIAL_LINK_TEXT, "下一页")
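The difference between the two link lookups is easy to mix up: link text must equal the whole visible text of the link, while partial link text only has to appear somewhere inside it. The sketch below is a simplified model of that rule in plain Python, not the browser driver's actual matching code; `matches_link_text` is a hypothetical helper.

```python
def matches_link_text(visible_text, query, partial=False):
    """Simplified model: exact match by default, substring match when partial."""
    text = visible_text.strip()
    return query in text if partial else text == query


if __name__ == "__main__":
    links = ["去下一页", "上一页"]
    # Exact link text: neither link is exactly "下一页".
    print([t for t in links if matches_link_text(t, "下一页")])
    # Partial link text: "去下一页" contains "下一页".
    print([t for t in links if matches_link_text(t, "下一页", partial=True)])
```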

3.1.7 Find by CSS selector

cheese = driver.find_element_by_css_selector("#food span.dairy.aged")

or

from selenium.webdriver.common.by import By
cheese = driver.find_element(By.CSS_SELECTOR, "#food span.dairy.aged")

3.1.8 Find by XPath

inputs = driver.find_elements_by_xpath("//input")

or

from selenium.webdriver.common.by import By
inputs = driver.find_elements(By.XPATH, "//input")
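The lookups in 3.1.1 through 3.1.8 all follow one pattern: the suffix of the legacy helper name corresponds to a By constant. The sketch below writes that correspondence out as a table, which may help when porting old scripts. `LEGACY_SUFFIX_TO_BY` and `to_locator` are hypothetical names; the string values mirror the constants on selenium.webdriver.common.by.By and are spelled out here so the sketch runs without Selenium installed.

```python
# String values mirror the constants on selenium.webdriver.common.by.By,
# written out here so the sketch runs without Selenium installed.
LEGACY_SUFFIX_TO_BY = {
    "id": "id",
    "class_name": "class name",
    "tag_name": "tag name",
    "name": "name",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "css_selector": "css selector",
    "xpath": "xpath",
}


def to_locator(legacy_method, value):
    """Translate ("find_element_by_id", "x") into ("find_element", "id", "x")."""
    for prefix in ("find_elements_by_", "find_element_by_"):
        if legacy_method.startswith(prefix):
            suffix = legacy_method[len(prefix):]
            method = prefix[:-len("_by_")]
            return method, LEGACY_SUFFIX_TO_BY[suffix], value
    raise ValueError("not a legacy find_element(s)_by_* name: " + legacy_method)


if __name__ == "__main__":
    print(to_locator("find_element_by_id", "coolestWidgetEvah"))
    print(to_locator("find_elements_by_class_name", "cheese"))
```

With a live driver, the returned tuple could be applied as getattr(driver, method)(by, value).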

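3.1.9 Find with JavaScript

from selenium.webdriver.common.by import By

labels = driver.find_elements(By.TAG_NAME, "label")
# For every label, collect the input whose id matches the label's "for" attribute
inputs = driver.execute_script(
    "var labels = arguments[0], inputs = [];"
    "for (var i = 0; i < labels.length; i++) {"
    "  inputs.push(document.getElementById(labels[i].getAttribute('for')));"
    "}"
    "return inputs;",
    labels)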

3.2 Getting an element's text

from selenium.webdriver.common.by import By

element = driver.find_element(By.ID, "element_id")
element.text

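3.3 Changing the user agent

profile = webdriver.FirefoxProfile()
profile.set_preference("general.useragent.override", "some UA string")
driver = webdriver.Firefox(profile)

or, on Selenium 4, where the preference is set on the options object instead of a profile:

options = webdriver.FirefoxOptions()
options.set_preference("general.useragent.override", "some UA string")
driver = webdriver.Firefox(options=options)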

3.4 Cookies

# Go to the correct domain
driver.get("http://www.example.com")

# Now set the cookie. This one is valid for the entire domain;
# the cookie name here is 'key' and its value is 'value'
driver.add_cookie({'name': 'key', 'value': 'value', 'path': '/'})
# Additional keys that can be passed in are:
# 'domain' -> String,
# 'secure' -> Boolean,
# 'expiry' -> Integer, seconds since the Unix epoch at which it should expire.

# And now output all the available cookies for the current URL
for cookie in driver.get_cookies():
    print("%s -> %s" % (cookie['name'], cookie['value']))

# You can delete cookies in 2 ways
# By name
driver.delete_cookie("CookieName")
# Or all of them
driver.delete_all_cookies()
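The optional 'expiry' key is the one most likely to be filled in wrongly, so here is a small sketch that builds the cookie dictionary with an expiry a given number of days ahead. It assumes the expiry is an integer number of seconds since the Unix epoch, as the W3C WebDriver specification describes it. `make_cookie` and its parameters are hypothetical names, and the sketch only builds the dictionary; it does not talk to a browser.

```python
import time


def make_cookie(name, value, days_valid=None, path="/", now=None):
    """Build a dict in the shape passed to driver.add_cookie()."""
    cookie = {"name": name, "value": value, "path": path}
    if days_valid is not None:
        base = time.time() if now is None else now
        # Assumption: expiry is an integer count of seconds since the Unix epoch.
        cookie["expiry"] = int(base) + int(days_valid * 24 * 60 * 60)
    return cookie


if __name__ == "__main__":
    print(make_cookie("key", "value", days_valid=7))
```

The result could then be passed along as driver.add_cookie(make_cookie("key", "value", days_valid=7)) once the driver is on the right domain.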

Finally, here is an example of my own. It finds the search box, types a keyword, clicks the search button, then opens every search result and prints its page source.

# coding=utf-8
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

# Create a new instance of the Chrome driver
driver = webdriver.Chrome()

# Go to the home page
driver.get("http://www.zjcredit.gov.cn")

# Remember the handle of the current (main) window
nowhandle = driver.current_window_handle
print(driver.title)

# Find the element whose name attribute is qymc (the search box)
inputElement = driver.find_element(By.NAME, "qymc")
print(inputElement)

# Type in the search keyword
inputElement.send_keys(u"同花顺")

# Unlike the Google example, this search box is not a standard form and
# cannot be submitted, so click the search button instead
# inputElement.submit()
driver.find_element(By.NAME, "imageField").click()

try:
    # Scroll to the bottom first; otherwise the last link ends up
    # underneath another, unrelated link
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Find all result links and click each one
    for item in driver.find_elements(By.XPATH, '//*[@id="pagetest2"]/div/table/tbody/tr/td/a'):
        item.click()
        time.sleep(10)

    # Get the handles of all open windows
    allhandles = driver.window_handles

    # Look for the newly opened windows among them
    for handle in allhandles:
        if handle != nowhandle:
            # These two steps run inside the popup window, proving we really entered it
            driver.switch_to.window(handle)
            print(driver.page_source)
        # Switch back to the main window
        driver.switch_to.window(nowhandle)
finally:
    driver.quit()
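The window handling at the end of the example comes down to one step: from the list of all handles, keep those that are not the main window. The sketch below isolates that step in plain Python so it can be tried without a browser. `other_handles` is a hypothetical helper, and the handle strings are made up for illustration.

```python
def other_handles(all_handles, current_handle):
    """Return the handles that are not current_handle, keeping their order."""
    return [h for h in all_handles if h != current_handle]


if __name__ == "__main__":
    # Made-up handle strings standing in for driver.window_handles.
    handles = ["CDwindow-main", "CDwindow-result-1", "CDwindow-result-2"]
    for handle in other_handles(handles, "CDwindow-main"):
        print("would switch to", handle)
```

In a real script the list would come from driver.window_handles and the main handle from driver.current_window_handle, captured before any links are clicked.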

Further reading, well written:

http://www.cnblogs.com/tobecrazy/p/4570494.html

Reposted from: https://www.cnblogs.com/vissac/p/5500803.html
