Data extraction with Python and automatic sending of emails with the information obtained


Friends,

The data extraction part is working and the emailing is also partly done.

I would like the same information I print on the screen, with the same formatting (line breaks, etc.), to be sent in the email. That is, I want to save exactly what was printed above and send it as the body of the email, not as an attachment.

The idea was to write the information to the file lista.txt and then copy it into the body of the email. Taking what was printed on the screen and using it as the message body is the part that does not work. Could you help?

Another question: how could I split the program below into two files, for example one that extracts the information from the site and another that sends the email?

import os
import sys  # needed for the sys.exc_info() calls in main()
import smtplib
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
###########################################################

import requests, time
from bs4 import BeautifulSoup as bs
from datetime import datetime


url = "http://www.purebhakti.com/resources/vaisnava-calendar-mainmenu-71.html"

url_post = 'http://www.purebhakti.com/component/panjika'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
payload = {'action': 2, 'timezone': 23, 'location': 'Rio de Janeiro, Brazil        043W15 22S54     -3.00'}

req = requests.post(url_post, headers=headers, data=payload)
soup = bs(req.text, 'html.parser')
eles = soup.select('tr td')
dates = (' '.join(d.select('b')[0].text.strip().split()) for d in eles if d.has_attr('class'))
#events = (' '.join(d.text.split()) for d in eles if not d.has_attr('class'))
events = ((d.text) for d in eles if not d.has_attr('class'))
calendar = dict(zip(dates, events))

#data_hoje = time.strftime("%d %b %Y", time.gmtime() ) # today's date
data_desejada=time.strftime("%d %b %Y", time.gmtime(time.time() + (3600 * 24 * 2))) # 2 days from now
print ("Prezados devotos, ")
print()
print("No dia %s, teremos o(s) seguinte(s) evento(s) no Calendario Vaisnava: " %(data_desejada))
print()
if(data_desejada in calendar):
    print(calendar[data_desejada],end = "" )
else:
    print('nenhum evento para hoje')
print()
print("Para mais detalhes acessem: %s " %(url))
print()
print("Jay Radhe!")



# this part does not work
# I would like to save the same text that was printed on the screen above and then send it
# by email as the message body, not as an attachment

##arq = open('/home/gopala/Desktop/lista.txt', 'w')
##texto = """
##Prezados devotos,
##
##No dia %s, teremos o(s) seguinte(s) evento(s) no Calendario Vaisnava:  %(data_desejada))
##
##"""
##arq.write(texto)
##
##arq.close()
##



#### email sending part
COMMASPACE = ', '

def main():
    sender = '[email protected]'
    gmail_password = 'senhalegal'
    recipients = ['[email protected]']

    # Create the enclosing (outer) message
    outer = MIMEMultipart()
    outer['Subject'] = 'data no calendario Vaisnava'
    outer['To'] = COMMASPACE.join(recipients)
    outer['From'] = sender
    outer.preamble = 'You will not see this in a MIME-aware mail reader.\n'

    # List of attachments
    attachments = ['/home/gopala/Desktop/16839680_10212563027937627_634163502_n.jpg','/home/gopala/Desktop/lista.txt']

    # Add the attachments to the message
    for file in attachments:
        try:
            with open(file, 'rb') as fp:
                msg = MIMEBase('application', "octet-stream")
                msg.set_payload(fp.read())
            encoders.encode_base64(msg)
            msg.add_header('Content-Disposition', 'attachment', filename=os.path.basename(file))
            outer.attach(msg)
        except:
            print("Unable to open one of the attachments. Error: ", sys.exc_info()[0])
            raise

    composed = outer.as_string()

    # Send the email
    try:
        with smtplib.SMTP('smtp.gmail.com', 587) as s:
            s.ehlo()
            s.starttls()
            s.ehlo()
            s.login(sender, gmail_password)
            s.sendmail(sender, recipients, composed)
            s.close()
        print("Email sent!")
    except:
        print("Unable to send the email. Error: ", sys.exc_info()[0])
        raise

if __name__ == '__main__':
    main()
    
asked by anonymous 22.02.2017 / 21:50

1 answer


Answer 1: String formatting

You could build a single string, using format placeholders for the variables and writing the line breaks explicitly. For example:

string_envio = "Prezados devotos,\n\n"

if data_desejada in calendar:
    # .format() is applied to the literal that actually contains the placeholder
    string_envio += "No dia {}, teremos o(s) seguinte(s) evento(s) no Calendario Vaisnava:\n\n".format(data_desejada)
    string_envio += calendar[data_desejada]
else:
    string_envio += "nenhum evento para hoje"

string_envio += "\n\n"
string_envio += "Para mais detalhes acessem: {}\n\n".format(url)
string_envio += "Jay Radhe!"

With this, the string already contains all the line breaks, with the variable values filled in where the placeholders are.

You then just pass it as the message body, with no need to create an attachment.
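As a minimal sketch of that last step, assuming the MIMEMultipart message from the question: attaching a MIMEText part puts the string in the visible body instead of in an attachment. The function name montar_mensagem and the example.com addresses are placeholders I made up for illustration.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def montar_mensagem(sender, recipients, subject, string_envio):
    """Build a multipart message whose first part is the plain-text body."""
    outer = MIMEMultipart()
    outer["Subject"] = subject
    outer["To"] = ", ".join(recipients)
    outer["From"] = sender
    # A text/plain part is shown as the body by mail readers;
    # file attachments, if any, can still be attached after it.
    outer.attach(MIMEText(string_envio, "plain", "utf-8"))
    return outer


# Placeholder values, not the real addresses or the full text
string_envio = "Prezados devotos,\n\nJay Radhe!"
outer = montar_mensagem(
    "sender@example.com",
    ["dest@example.com"],
    "data no calendario Vaisnava",
    string_envio,
)
composed = outer.as_string()  # what the question passes to s.sendmail(...)
```

The rest of the sending code in the question (starttls, login, sendmail) can stay as it is.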

Answer 2: Modularization

Modules: search and email sending

  

Here you could use either functions or classes; I'll opt for functions.

Email.py file:

conexao = conectar_email(login, senha)  # returns a logged-in object ready to send emails
conexao.enviar(para=email_destino, titulo=titulo_da_mensagem, corpo=mensagem_formatada_anteriormente)

This keeps all the connection complexity inside the functions.
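One possible shape for this module, as a sketch only: conectar_email and enviar come from the calls above, while the Conexao class, the montar and fechar helpers, and the host and porta defaults are my own assumptions. Note that naming the file email.py in lowercase would shadow the standard library email package, so the sketch assumes a file called mailer.py.

```python
import smtplib
from email.message import EmailMessage


class Conexao:
    """Wraps an already logged-in SMTP session (hypothetical helper)."""

    def __init__(self, servidor, login):
        self.servidor = servidor
        self.login = login

    def montar(self, para, titulo, corpo):
        """Build the message without sending it."""
        msg = EmailMessage()
        msg["From"] = self.login
        msg["To"] = para
        msg["Subject"] = titulo
        msg.set_content(corpo)  # plain-text body, line breaks preserved
        return msg

    def enviar(self, para, titulo, corpo):
        """Build the message and send it over the open session."""
        self.servidor.send_message(self.montar(para, titulo, corpo))

    def fechar(self):
        """End the SMTP session."""
        self.servidor.quit()


def conectar_email(login, senha, host="smtp.gmail.com", porta=587):
    """Open a TLS session, log in, and return a Conexao ready to send."""
    servidor = smtplib.SMTP(host, porta)
    servidor.starttls()
    servidor.login(login, senha)
    return Conexao(servidor, login)
```

Splitting montar from enviar lets you inspect or test the message without opening a network connection.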

File search.py

def request(url, options): ...

Here options is a general dictionary with the headers to send, and the function returns the HTML of the requested page.

def find(html, element_procurado): ...

Here you would pass all the HTML read by request along with the element you are looking for, and it would return the matching part of the HTML.

def parse(element): ...

This is where the actual scraping would be done, but only on the wanted element, since it has already gone through the previous steps. It would return whatever data structure you want, for example a list with the data, or a dictionary with the day as the key and the events as the values. That part is up to you.

Using this:

requisicao = request(sua_url,headers)
elemento = find(requisicao,'tr td')
conteudo = parse(elemento) # returns the data structure
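The three functions could be sketched as follows, reusing the parsing logic from the question. This is an illustration under assumptions: the 'headers' and 'data' keys of options, the POST method and the timeout are my choices, and the sample HTML is made up so the example runs without touching the site.

```python
import requests
from bs4 import BeautifulSoup


def request(url, options):
    """POST to url and return the response HTML as text.

    Assumption: options may carry 'headers' and 'data' keys.
    """
    response = requests.post(
        url,
        headers=options.get("headers"),
        data=options.get("data"),
        timeout=30,
    )
    response.raise_for_status()
    return response.text


def find(html, element_procurado):
    """Return every element matching the CSS selector element_procurado."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.select(element_procurado)


def parse(elements):
    """Pair the n-th date cell (has a class attribute) with the n-th
    event cell (no class attribute), as the question's code does."""
    dates = (" ".join(td.select("b")[0].text.split())
             for td in elements if td.has_attr("class"))
    events = (td.text for td in elements if not td.has_attr("class"))
    return dict(zip(dates, events))


# Offline usage with made-up HTML in place of a real response
sample = ('<table><tr><td class="d"><b>24  Feb 2017</b></td>'
          '<td>Ekadasi</td></tr></table>')
conteudo = parse(find(sample, "tr td"))
print(conteudo)
```

With the real site, the first call would be something like request(url_post, {'headers': headers, 'data': payload}), using the variables from the question.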

Finally, insert this content into the string, each piece in its place, and pass it to the send function in the corpo (body) parameter.

This removes (at least for now) the need for the attachment libraries.

    
28.02.2017 / 05:21