Robo_Feed: feed-checking loop crashes in Python

I am new to Python and was trying to write a bot that fetches the latest post from some news portals in my region. After some searching I concluded that it would be easiest to do with the Python feedparser module. Here is how the code is organized:

  • I wrote a class that holds the input data (site name and URL) and builds a list of them.
  • I defined a routine, feed_parser(), that reads that data and prints the latest posts to the screen.
  • I put a while loop inside a try block to call the functions. The script runs fine for the first 2 iterations and then crashes.
  • As I said, I am new and inexperienced, but eager to learn to program properly in Python.

    code

    import feedparser
    import time
    import base64
    timeaut = 15
    
    class Robo_Feed(object):
        def __init__(self):
            self.data = []
            self.sites= {}
            self.allheadlines = []
        def append_url(self, name, site):
            self.sites[name] = (site)
        def encrip(self, published):
            self.data.append(base64.b64encode(published))
    
    def feed_parser():
        def parseRSS(rss_url):
            return feedparser.parse(rss_url)
        def getHeadlines(rss_url):
            robo.headlines = []
            feed = parseRSS(rss_url)
            if (base64.b64encode(feed.entries[0].published)) in str(robo.data):
                return robo.headlines
            else:
                POSTAGEN = '*'+feed['feed']['title']+'*' +"\n" + feed.entries[0].title + "\n" + feed.entries[0].link + "\n\r"
                print (POSTAGEN)
                robo.encrip(feed.entries[0].published)
                return robo.headlines
        for key, url in robo.sites.items():
            robo.allheadlines.extend(getHeadlines(url))
    
    robo = Robo_Feed()
    robo.append_url('Oglobo', 'http://oglobo.globo.com/rss.xml?secao=ece_frontpage')
    robo.append_url('ZeroHora', 'http://zh.clicrbs.com.br/rs/ultimas-noticias-rss/')
    
    try:
        while True:
            feed_parser()
            time.sleep(time)
    except KeyboardInterrupt:
        print "Interrupted"
    

    Error

      File "/home/ubuntu/PycharmProjects/BotTelegran/testes.py", line 38, in <module>
        feed_parser()
      File "/home/ubuntu/PycharmProjects/BotTelegran/testes.py", line 30, in feed_parser
        robo.allheadlines.extend(getHeadlines(url))
      File "/home/ubuntu/PycharmProjects/BotTelegran/testes.py", line 22, in getHeadlines
        if (base64.b64encode(feed.entries[0].published)) in str(robo.data):
    IndexError: list index out of range
    
        
    asked by anonymous 16.01.2017 / 03:55

    1 answer

    I managed to get the code running. The problem was in the Base64 encoding step: the encoder expects a byte string rather than text, so the value has to be encoded (to ASCII here) before it is passed to Base64.

    Another problem was the call time.sleep(time), which passed the time module itself instead of the timeaut variable that holds the interval.
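As a rough illustration of the encoding step described above, the sketch below encodes a timestamp string to bytes before Base64-encoding it. The helper name `published_key`, the `seen` set (standing in for `robo.data`), and the sample timestamp are made up for this example and are not part of the original script.

```python
import base64


def published_key(published):
    """Encode a feed timestamp string as Base64 bytes.

    The encoder takes bytes, so the text is encoded first. ASCII matches the
    answer; a non-ASCII character would raise UnicodeEncodeError here.
    """
    return base64.standard_b64encode(published.encode("ascii"))


seen = set()  # hypothetical stand-in for robo.data

stamp = "Mon, 16 Jan 2017 03:55:00 -0200"  # made-up example value
key = published_key(stamp)

if key not in seen:
    seen.add(key)  # first time this timestamp is observed

# Decoding reverses the step and gives the original text back.
assert base64.standard_b64decode(key).decode("ascii") == stamp
```

If a feed ever delivers non-ASCII text in that field, encoding with "utf-8" instead of "ascii" would be one way to avoid the UnicodeEncodeError.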

    Here is the code that works.

    import feedparser
    import time
    import base64
    timeaut = 15
    
    class Robo_Feed(object):
        def __init__(self):
            self.data = []
            self.sites= {}
            self.allheadlines = []
    
        def append_url(self, name, site):
            self.sites[name] = (site)
    
        def encrip(self, published):
            self.data.append(base64.standard_b64encode(published.encode('ascii')))
    
    def feed_parser():
        def parseRSS(rss_url):
            return feedparser.parse(rss_url)
    
        def getHeadlines(rss_url):
            robo.headlines = []
            feed = parseRSS(rss_url)
    
            if (base64.standard_b64encode(feed.entries[0].published.encode('ascii')) in robo.data):
                return robo.headlines
            else:
                POSTAGEN = '*'+feed['feed']['title']+'*' +"\n" + feed.entries[0].title + "\n" + feed.entries[0].link + "\n\r"
                print (POSTAGEN)
                robo.encrip(feed.entries[0].published)
                return robo.headlines
    
        for key, url in robo.sites.items():
            robo.allheadlines.extend(getHeadlines(url))
    
    robo = Robo_Feed()
    robo.append_url('Oglobo', 'http://oglobo.globo.com/rss.xml?secao=ece_frontpage')
    robo.append_url('ZeroHora', 'http://zh.clicrbs.com.br/rs/ultimas-noticias-rss/')
    
    try:
        while True:
            feed_parser()
            time.sleep(timeaut)
    except KeyboardInterrupt:
        print("Interrupted")
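The traceback in the question ends with IndexError at feed.entries[0], which is what Python raises when the entries list is empty, for example when a fetch returns nothing in one round. The code above still indexes entries[0] directly, so a guard along these lines might help. This is only a sketch: `latest_entry` is a hypothetical helper, and the SimpleNamespace objects are stand-ins for what feedparser.parse would return, so no network access is needed to try it.

```python
from types import SimpleNamespace


def latest_entry(feed):
    """Return the first entry of a parsed feed, or None when there is none."""
    entries = getattr(feed, "entries", None) or []
    if not entries:
        return None  # empty or failed fetch: nothing to index
    return entries[0]


# Stand-ins for parsed feeds (hypothetical data, no network access needed).
empty_feed = SimpleNamespace(entries=[])
full_feed = SimpleNamespace(entries=[{"title": "First post"}, {"title": "Older post"}])

for feed in (empty_feed, full_feed):
    entry = latest_entry(feed)
    if entry is None:
        print("no entries this round, skipping")
        continue
    print(entry["title"])
```

In getHeadlines, the same check could come right after parseRSS(rss_url), returning robo.headlines early when there is no entry to read.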
    
        
    29.01.2017 / 18:52