Web Scraping Basics: HTML Element Selectors (XPath)

XPath Element Selectors

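Examples:
//div[contains(@class, "Rating rating-")] : selects every div whose class attribute contains the substring "Rating rating-"
//div[@id="properties"]//td/span[contains(text(), "面料")] : selects the span elements under the given path whose text contains "面料" (fabric)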

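Using XPath

Absolute path (not recommended, since it breaks whenever the page layout changes): /html/body/div/form/input
Relative path: //input
Index (1-based): //input[4] : every input that is the fourth input child of its parent
Attribute value: //input[@id='fuck']
Several attribute values combined: //input[@type='submit'][@name='fuck']
Boolean condition: //input[@type='submit' and @name='fuck']
Prefix match: //input[starts-with(@id,'fuck')]
Suffix match: //input[ends-with(@id,'fuck')] : ends-with() is an XPath 2.0 function; browsers (and therefore Selenium) implement XPath 1.0, where the equivalent is //input[substring(@id, string-length(@id) - string-length('fuck') + 1) = 'fuck']
Substring match: //input[contains(@id,'fuck')]
Text match: //li[text()="1-3-3-1"]
Attribute present: //input[@type] : all input elements that have a type attribute
Any attribute with a given value: //input[@*='fuck']
Child element: a/b
Descendant element: a//b
Parent element: //li[@id="test"]/../li[1] : the first li child of the parent element
Parent element: //li[@id="test"]/parent::div : the parent element, if it is a div
All elements: //* (a bare /* selects only the root element)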
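
Several of the patterns above can be tried without a browser. The following is a minimal sketch using Python's standard-library xml.etree.ElementTree, which implements only a subset of XPath 1.0: paths must start with `.`, text matching is spelled `[.='...']` rather than `text()="..."`, and functions such as contains() or starts-with() are not available. The markup and the attribute values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup used only for this sketch.
PAGE = """
<html>
  <body>
    <form id="login">
      <input type="text" name="user"/>
      <input type="password" name="pwd"/>
      <input type="submit" name="go"/>
    </form>
    <ul>
      <li id="first">1-3-3-1</li>
      <li id="test">other</li>
    </ul>
  </body>
</html>
"""
root = ET.fromstring(PAGE)

# Relative path: every input element in the document.
all_inputs = root.findall(".//input")

# Index: the input that is the second input child of its parent (1-based).
second_input = root.find(".//input[2]")

# Several attribute predicates combined.
submit = root.find(".//input[@type='submit'][@name='go']")

# Attribute present: inputs that carry a type attribute at all.
typed_inputs = root.findall(".//input[@type]")

# Text match (ElementTree spelling of //li[text()="1-3-3-1"]).
li_by_text = root.find(".//li[.='1-3-3-1']")

# Parent step: go up from li#test, then take the parent's first li child.
first_sibling = root.find(".//li[@id='test']/../li[1]")

print(len(all_inputs))            # 3
print(second_input.get("name"))   # pwd
print(submit.get("name"))         # go
print(len(typed_inputs))          # 3
print(li_by_text.get("id"))       # first
print(first_sibling.get("id"))    # first
```

For the full XPath 1.0 function set (contains(), starts-with(), and so on), the same expressions would have to be evaluated by a browser through Selenium or by a library such as lxml.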

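A real-world example: 生意参谋 (Business Advisor) - Category - Product 360 - Product Details. Each tab on this page is wrapped in its own span element.
The Sales Analysis (销售分析) tab is inside the second span.

Locating the tab by position:
//*[@id="content-container"]/div[2]/div/div/div[1]/div/div/div/div/div/div[2]/div[1]/div[1]/span/span[2]/span

Locating the tab by its text, which keeps working if the tab order changes. string() is the concatenated text of an element and all its descendants, so it can be tested on the wrapping span; text() only sees an element's own text nodes, so it has to be tested on the innermost span:
//*[@id="content-container"]/div[2]/div/div/div[1]/div/div/div/div/div/div[2]/div[1]/div[1]/span/span[contains(string(),"销售分析")]
//*[@id="content-container"]/div[2]/div/div/div[1]/div/div/div/div/div/div[2]/div[1]/div[1]/span/span/span[contains(text(),"销售分析")]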
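
The difference between string() and text() in the last two expressions can be approximated in plain Python. This is only a sketch: the nested span markup and the second tab name (流量分析) are assumptions made up for illustration, `string_value` mimics XPath string() by joining all descendant text, and `own_text` mimics text() by looking only at the text that directly opens the element.

```python
import xml.etree.ElementTree as ET

# Hypothetical tab bar: an outer span, one wrapper span per tab, one inner span per label.
TABS = ET.fromstring(
    "<span>"
    "<span><span>流量分析</span></span>"
    "<span><span>销售分析</span></span>"
    "</span>"
)

def string_value(el):
    """Rough equivalent of XPath string(): the element's text plus all descendant text."""
    return "".join(el.itertext())

def own_text(el):
    """Rough equivalent of text(): only the text directly inside the element."""
    return el.text or ""

needle = "销售分析"

# Matches the outer span, the wrapper span and the inner span.
by_string = [el for el in TABS.iter("span") if needle in string_value(el)]

# Matches only the innermost span that actually holds the label.
by_text = [el for el in TABS.iter("span") if needle in own_text(el)]

print(len(by_string), len(by_text))  # 3 1
```

This is why the string() variant is applied one level higher in the path than the text() variant.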

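CSS Element Selectors

cssSelector is another commonly used locator strategy. CSS locators are often said to be faster than XPath locators, and they can locate the target elements very precisely.

Caveats

Standard CSS has no :contains('xx') pseudo-class, so an element cannot be selected by its text with a CSS selector alone. Use XPath contains() for that, or filter the matched elements by text in code. See:

https://stackoverflow.com/questions/14595149/alternative-of-contains-in-cssselector-selenium-webdriver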
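
One possible workaround for the missing :contains is to select broadly with CSS and then filter on the text in Python. The helper below is a hypothetical sketch, not part of Selenium: `filter_by_text` is a made-up name, and it relies only on each element exposing a `.text` attribute, as Selenium's WebElement does. Stand-in objects are used so the sketch runs without a browser.

```python
from types import SimpleNamespace

def filter_by_text(elements, needle):
    """Keep the elements whose text contains needle (hypothetical helper)."""
    return [el for el in elements if needle in (el.text or "")]

# Stand-ins for WebElement objects; only the .text attribute matters here.
fake_spans = [
    SimpleNamespace(text="流量分析"),
    SimpleNamespace(text="销售分析"),
    SimpleNamespace(text=None),
]

matches = filter_by_text(fake_spans, "销售分析")
print([el.text for el in matches])  # ['销售分析']
```

With a real driver the call would look like `filter_by_text(driver.find_elements(By.CSS_SELECTOR, "span"), "销售分析")`, at the cost of fetching every candidate element first.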

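Usage

# selects by id
. selects by class
> selects direct children (exactly one level down)
space: div p selects all <p> elements inside <div> elements

A space therefore also walks down the tree, but it matches descendants at any depth, which makes it the equivalent of // in XPath.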

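    #input selects the element whose id is input

    .Volvo selects the elements whose class is Volvo

    div#radio>input selects all input elements that are direct children of the div with id radio

    div#radio input selects all input elements that are descendants of the div with id radio

    div#radio>input:nth-of-type(4) selects the fourth input child of the div with id radio

    div#radio>:nth-child(1) selects the first child of the div with id radio

    div#radio>input:nth-of-type(4)+label selects the label immediately following the fourth input child of the div with id radio

    div#radio>input:nth-of-type(4)~label selects all sibling label elements that follow the fourth input child of the div with id radio

    input.Volvo[name='identity'] selects the input elements with class Volvo and name identity

    input[name='identity'][type='radio']:nth-of-type(1) selects the input elements with name identity and type radio that are also the first input among their siblings

    input[name^='ident'] selects all input elements whose name attribute starts with ident

    input[name$='entity'] selects all input elements whose name attribute ends with entity

    input[name*='enti'] selects all input elements whose name attribute contains enti

    div#radio>*:not(input) selects all children of the div with id radio that are not input elements

    input:not([type='radio']) selects all input elements whose type is not radio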
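
The child (>) versus descendant (space) distinction maps directly onto / versus // in XPath. Python's standard library has no CSS selector engine, so the sketch below shows the two XPath equivalents with xml.etree.ElementTree instead; the markup is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two direct input children and one nested inside a <p>.
PAGE = """
<body>
  <div id="radio">
    <input name="a"/>
    <p><input name="nested"/></p>
    <input name="b"/>
  </div>
</body>
"""
root = ET.fromstring(PAGE)

# Equivalent of the CSS selector  div#radio>input  (direct children only).
children = root.findall(".//div[@id='radio']/input")

# Equivalent of the CSS selector  div#radio input  (descendants at any depth).
descendants = root.findall(".//div[@id='radio']//input")

child_names = [el.get("name") for el in children]
descendant_names = [el.get("name") for el in descendants]

print(child_names)       # ['a', 'b']
print(descendant_names)  # ['a', 'nested', 'b']
```

The nested input is reached only by the descendant form, which is the behaviour to keep in mind when a selector unexpectedly matches more elements than intended.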

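Frames

Switching to an iframe

Using a WebElement

# Assumes `driver` is an already-started Selenium WebDriver
from selenium.webdriver.common.by import By

# Locate the iframe element
# iframe = driver.find_element(By.TAG_NAME, "iframe")
iframe = driver.find_element(By.CSS_SELECTOR, "#modal > iframe")

# Switch to the selected iframe
driver.switch_to.frame(iframe)

# Click the button inside the iframe
driver.find_element(By.TAG_NAME, 'button').click()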

Using an index

# Switch to the second frame (frame indexes are zero-based)
driver.switch_to.frame(1)

Leaving an iframe

# Switch back to the default content
driver.switch_to.default_content()
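
Forgetting to switch back is an easy mistake, because every later lookup then runs inside the iframe. One way to guard against it is to wrap the two calls above in a context manager. This is a sketch, not a Selenium feature: `inside_frame` is a hypothetical helper name, and the fake driver classes exist only so the example runs without a browser. It uses nothing beyond `driver.switch_to.frame(...)` and `driver.switch_to.default_content()`.

```python
from contextlib import contextmanager

@contextmanager
def inside_frame(driver, frame_ref):
    """Switch into a frame, then always return to the default content (hypothetical helper)."""
    driver.switch_to.frame(frame_ref)
    try:
        yield driver
    finally:
        # Runs even if the code inside the with-block raises.
        driver.switch_to.default_content()

class FakeSwitchTo:
    """Stand-in for driver.switch_to that records the calls it receives."""
    def __init__(self):
        self.calls = []

    def frame(self, ref):
        self.calls.append(("frame", ref))

    def default_content(self):
        self.calls.append(("default_content",))

class FakeDriver:
    """Stand-in for a WebDriver; a real one would come from selenium.webdriver."""
    def __init__(self):
        self.switch_to = FakeSwitchTo()

driver = FakeDriver()
with inside_frame(driver, 1):
    pass  # with a real driver: driver.find_element(By.TAG_NAME, "button").click()

print(driver.switch_to.calls)  # [('frame', 1), ('default_content',)]
```

With a real WebDriver, `frame_ref` can be either the WebElement or the index shown above.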

XPath functions https://www.w3school.com.cn/xpath/xpath_functions.asp
Using css:contains for Selenium element location https://blog.csdn.net/jiangsquall/article/details/9352727
XPath selectors https://zhuanlan.zhihu.com/p/384457307
id, starts-with, contains, text() and last() in XPath locators explained https://www.cnblogs.com/unknows/p/7684331.html
Selenium official documentation: browser manipulation https://www.selenium.dev/documentation/zh-cn/webdriver/browser_manipulation/
CSS selectors https://www.runoob.com/cssref/css-selectors.html
How to locate page elements with cssSelector in Selenium https://www.cnblogs.com/liwu/p/5016716.html
