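While using bs4, I happened to notice that the sibling accessor next_sibling behaves differently with and without parentheses, so I am writing it down here.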
HTML:
<div class="col-md-4"> </div><div class="col-md-4"><strong>Total</strong></div><div class="col-md-4"><strong>Failed</strong></div>
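Python code:
from bs4 import BeautifulSoup

# soup is assumed to be a BeautifulSoup object parsed from the page that contains the HTML above
for tag in soup.select('div .col-md-4'):
    if tag.get_text() == 'Total':
        result = tag.next_sibling.get_text()
        # Calling a Tag is shorthand for find_all(), so this is tag.next_sibling.find_all()
        print("next_sibling(): ", tag.next_sibling())
        print(type(tag.next_sibling()))
        print("next_sibling()[0].parent", tag.next_sibling()[0].parent)
        print("next_sibling", tag.next_sibling)
        print(type(tag.next_sibling))
        print(tag.next_siblings)
        print(result)
The output, in order:
"""type() reports a ResultSet; in the source it is a list subclass that stores Tag objects. Calling a Tag is shorthand for find_all(), so next_sibling() returns every tag inside the sibling."""
next_sibling(): [<strong>Failed</strong>]
<class 'bs4.element.ResultSet'>
"""The parent of its first element is the same tag that next_sibling returns without parentheses."""
next_sibling()[0].parent <div class="col-md-4"><strong>Failed</strong></div>
"""next_sibling without parentheses is the usual usage; it returns a Tag object."""
next_sibling <div class="col-md-4"><strong>Failed</strong></div>
<class 'bs4.element.Tag'>
"""next_siblings returns a generator."""
<generator object next_siblings at 0x7f682f404938>
Failed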
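
To make the difference easier to reproduce, here is a minimal self-contained sketch. It assumes the three divs are the whole document, so it selects them with 'div.col-md-4' rather than the descendant selector 'div .col-md-4' used above, which needs an outer div around them. The variable names (total_div, sibling, called, explicit) are only for illustration.

```python
from bs4 import BeautifulSoup

html = (
    '<div class="col-md-4"> </div>'
    '<div class="col-md-4"><strong>Total</strong></div>'
    '<div class="col-md-4"><strong>Failed</strong></div>'
)
soup = BeautifulSoup(html, "html.parser")

# Pick the div whose text is exactly "Total".
total_div = next(d for d in soup.select("div.col-md-4") if d.get_text() == "Total")

sibling = total_div.next_sibling    # attribute access: the next node (a Tag here)
called = total_div.next_sibling()   # calling that Tag is shorthand for find_all()
explicit = sibling.find_all()       # the same search, written out explicitly

print(type(sibling).__name__)       # Tag
print(type(called).__name__)        # ResultSet
print(list(called) == list(explicit))
print(called[0].parent is sibling)

# next_siblings is a generator, so iterate it (or wrap it in a list) to see the nodes.
print([s.get_text() for s in total_div.next_siblings])
```

So the parentheses do not change which sibling is found; they run a find_all() on that sibling and return the tags nested inside it.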