I have the following situation:
<a href="https://g1.globo.com">Globo</a>
<h3 class="b">
<a href="https://www.google.com">Google</a>
</h3>
Using BeautifulSoup, how do I get only the href and the text of the 'a' tag inside the 'h3'?
Just search for the h3 tag first, and then search for the a element inside it:
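from bs4 import BeautifulSoup

data = """<a href="https://g1.globo.com">Globo</a>
<h3 class="b">
<a href="https://www.google.com">Google</a>
</h3>"""

# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(data, 'html.parser')

# Find the <h3 class="b"> first, then look for the <a> inside it only
h3 = soup.find('h3', class_='b')
a = h3.find('a')

print(a['href'])
print(a.text)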
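If there may be several links inside the h3, or the h3 may be missing altogether, a CSS selector is one alternative worth considering. The following is only a sketch under the same sample HTML as above; the `links` name is just for illustration. `select()` returns an empty list when nothing matches, so there is no `None` to guard against as there is with chained `find()` calls.

```python
from bs4 import BeautifulSoup

data = """<a href="https://g1.globo.com">Globo</a>
<h3 class="b">
<a href="https://www.google.com">Google</a>
</h3>"""

soup = BeautifulSoup(data, "html.parser")

# "h3.b a[href]" matches every <a> with an href that sits inside <h3 class="b">;
# the <a> outside the h3 is not selected.
links = [(a["href"], a.get_text(strip=True)) for a in soup.select("h3.b a[href]")]

for href, text in links:
    print(href, text)
```

Use `soup.select_one("h3.b a[href]")` instead if you only ever want the first match; it returns `None` when there is none.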