I would like an example of how to get the headlines for the Olympics at link using BeautifulSoup.
The key is to inspect the HTML returned by the GET request and identify the elements you want. In this case we want every <span> that has the class cd__headline-text, which I assume is what you mean by 'headlines'. You can do this:
from bs4 import BeautifulSoup as bs4
import requests as r

req = r.get('http://edition.cnn.com/sport/olympics')
soup = bs4(req.text, 'html.parser')  # req.text is the returned HTML
# Search the HTML for the spans described above; the result is a list of every matching element
manchetes_html = soup.find_all('span', {'class': 'cd__headline-text'})
manchetes = ''  # string that will hold the headlines ("manchetes")
for manchete in manchetes_html:
    manchetes += '{}\n'.format(manchete.text)
print(manchetes)
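If you want to check the extraction step without hitting the network, something like the sketch below may help. It wraps the same find_all lookup in a small function and runs it on a made-up HTML snippet. The function name extract_headlines and the sample markup are my own assumptions for illustration, not the real structure of the CNN page, which may change or may load its headlines differently.

```python
from bs4 import BeautifulSoup


def extract_headlines(html):
    """Return the text of every <span class="cd__headline-text"> in html.

    extract_headlines is a hypothetical helper name used for this sketch.
    """
    soup = BeautifulSoup(html, 'html.parser')
    spans = soup.find_all('span', class_='cd__headline-text')
    # get_text(strip=True) drops leading and trailing whitespace
    return [span.get_text(strip=True) for span in spans]


# Made-up sample markup (an assumption, not copied from the real page)
sample_html = '''
<div>
  <h3><a href="/a"><span class="cd__headline-text">First headline</span></a></h3>
  <h3><a href="/b"><span class="cd__headline-text"> Second headline </span></a></h3>
  <span class="other">Not a headline</span>
</div>
'''

headlines = extract_headlines(sample_html)
print('\n'.join(headlines))
```

Returning a list instead of building one big string makes it easier to count, filter, or save the headlines later; to use it on the live page you would pass req.text from the request above.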