I'm working with The Guardian API, using two methods, search_content() and data_to_csv(), contained in the TheGuardian class. The first method searches the Guardian database according to the parameters provided, while the second consolidates the data collected in the search into a CSV file.
My question is as follows: the search_content() method returns the json_content variable, which should be a dictionary containing the response packet from the search. However, I am not able to access the dictionary in the data_to_csv() method, as can be seen below:
>>> from script_guardian import TheGuardian
>>> tg = TheGuardian('2016-01-01', '2018-01-01')
>>> json_content = tg.search_content('education', 'relevance', 'education')
>>> json_content
<bound method Content.get_content_response of <theguardian.theguardian_content.Content object at 0x7f7bb9764c88>>
>>> type(json_content)
<class 'method'>
That is, search_content() returns an object of type <class 'method'> instead of a dict.
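The symptom can be reproduced without the Guardian library. The following is a minimal sketch, assuming a hypothetical stand-in class `FakeContent` whose `get_content_response` method returns a hard-coded dictionary; it only illustrates the difference between referencing a method and calling it, and is not the library's actual implementation.

```python
class FakeContent:
    """Hypothetical stand-in for theguardian_content.Content (assumption)."""

    def get_content_response(self):
        # Pretend response packet; the real one would come from the API.
        return {'response': {'results': []}}


content = FakeContent()

# Without parentheses the name refers to the bound method itself.
not_called = content.get_content_response
# With parentheses the method runs and its return value is stored.
called = content.get_content_response()

print(type(not_called))  # <class 'method'>
print(type(called))      # <class 'dict'>
```

The first form matches the `<bound method ...>` output shown in the session above.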
I believe this problem is due to the way I organized my methods. If I call the data_to_csv() method myself once the object has been instantiated, I can update the CSV file, like this:
tg.data_to_csv(tg.search_content())
I would like to know how to organize my code so that the methods run when the object is created, passing only the parameters 'data_inicial' and 'data_final'. That is,
tg = TheGuardian('yyyy-mm-dd','yyyy-mm-dd')
I think this can be set up in __init__, but I do not know how.
Questions:
- How can I execute methods automatically when creating the object?
- How can I get json_content in the data_to_csv() method as a dict, instead of the method itself?
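One possible arrangement is sketched below with hypothetical names (`Pipeline`, `data_to_rows`) and a hard-coded response in place of the real API call; only `data_inicial`, `data_final`, `search_content` and `json_content` come from my code. The idea is that whatever `__init__` calls runs on instantiation, and each method returns its result so the next one can receive it as an argument.

```python
class Pipeline:
    """Hypothetical miniature of TheGuardian; names are illustrative only."""

    def __init__(self, data_inicial, data_final):
        self.data_inicial = data_inicial
        self.data_final = data_final
        # Anything called here runs as soon as the object is created.
        self.json_content = self.search_content()
        self.rows = self.data_to_rows(self.json_content)

    def search_content(self):
        # Stand-in for the API call (assumption); note the explicit return.
        return {'response': {'results': [
            {'id': 'education/1', 'webTitle': 'Example'},
        ]}}

    def data_to_rows(self, json_content):
        # Works on the argument it receives, not on an attribute set elsewhere.
        results = json_content['response']['results']
        return [[r['id'], r['webTitle']] for r in results]


p = Pipeline('2016-01-01', '2018-01-01')
print(p.rows)  # [['education/1', 'Example']]
```

With this layout, creating the object with only the two dates is enough to trigger both steps.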
My code:
from theguardian import theguardian_content
import csv


class TheGuardian(object):
    '''Methods for searching and converting data from The Guardian database'''

    def __init__(self, data_inicial, data_final):
        '''
        Instance initialization

        Args:
            data_inicial(str): start date in ISO 8601 format
            data_final(str): end date in ISO 8601 format
        '''
        self.data_inicial = data_inicial
        self.data_final = data_final
        self.data_to_csv(self.search_content())

    def search_content(self, content='education', order_by='relevance',
                       section='education', api_key='test', page_size=10):
        '''
        Searches The Guardian database

        Args:
            content(str): the news returned will relate to this subject
            page_size(int): number of news items returned per page
            order_by(str): ordering of the news, one of 'newest',
                'relevance' or 'oldest'.
            api_key(str): API key to use
            section(str): section the returned news belongs to

        Returns:
            json_content(dict): response packet of the search performed
        '''
        self.content = content
        self.page_size = page_size
        self.order_by = order_by
        self.api_key = api_key
        self.section = section

        # Search parameters
        params = {
            'data_inicial': self.data_inicial,
            'data_final': self.data_final,
            'order-by': self.order_by,
            'page-size': self.page_size,
            'q': self.content,
            'api': self.api_key,
            'section': self.section
        }
        content = theguardian_content.Content(**params)
        self.json_content = content.get_content_response

    def data_to_csv(self, json_content):
        '''
        Converts the search response packet into a CSV file

        Note:
            The file guardian_data.csv is rewritten after each API query

        Args:
            json_content(dict): content returned for the search parameters
                provided earlier

        Returns:
            guardian_data(csv): consolidation of the data queried from the API
        '''
        with open('guardian_data.csv', 'w') as csv_file:
            writer = csv.writer(csv_file, delimiter=',')
            # Write the CSV file header
            writer.writerow(["webUrl", "webPublicationDate", "webTitle",
                             "sectionName", "apiUrl", "id", "isHosted",
                             "sectionId", "type", "pillarId", "pillarName"])
            for result in self.json_content['response']['results']:
                writer.writerow([
                    result["webUrl"],
                    result["webPublicationDate"],
                    result["webTitle"],
                    result["sectionName"],
                    result["apiUrl"],
                    result["id"],
                    result["isHosted"],
                    result["sectionId"],
                    result["type"],
                    result["pillarId"],
                    result["pillarName"]
                ])
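For comparison, here is a minimal sketch of the CSV step that works on the dictionary passed in as an argument rather than on an attribute. The function name `results_to_csv`, the `FIELDS` constant and the sample data are hypothetical; the column names are the ones from my header row. It uses `csv.DictWriter` from the standard library, which leaves a column empty if a result lacks that key instead of raising `KeyError`.

```python
import csv
import io

# Column names taken from the header row in the code above.
FIELDS = ["webUrl", "webPublicationDate", "webTitle", "sectionName",
          "apiUrl", "id", "isHosted", "sectionId", "type", "pillarId",
          "pillarName"]


def results_to_csv(json_content, csv_file):
    """Write the results of a response packet to an open file-like object."""
    # restval fills in missing keys; extrasaction skips keys not in FIELDS.
    writer = csv.DictWriter(csv_file, fieldnames=FIELDS,
                            restval='', extrasaction='ignore')
    writer.writeheader()
    for result in json_content['response']['results']:
        writer.writerow(result)


# Made-up sample packet shaped like the response used above (assumption).
sample = {'response': {'results': [
    {'id': 'education/1', 'webTitle': 'Example', 'type': 'article'},
]}}

buffer = io.StringIO()
results_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk instead of a buffer, the same function could be called inside `with open('guardian_data.csv', 'w', newline='') as csv_file:`; the `newline=''` argument is what the `csv` module documentation recommends for files passed to a writer.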