Expression to remove URL links from Twitter tweets

2

Does anyone know a regular expression, in Python, for removing the links from text stored in a .CSV file?

Example text:

joao was in the market link I want this text to appear

From this, I want the output to be:

joao was in the market I want this text to appear

I have this code. It removes the link, but it also removes the text that comes after it:

import re

URLless_string = re.sub(r'\w+:\/{2}[\d\w-]+(\.[\d\w-]+)*(?:(?:\/[^\s/]*))*', '', str(linha))
print(str(URLless_string))
    
asked by anonymous 26.06.2017 / 16:17

1 answer

2

Do this:

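import re

# Read the tweet text (on Python 2, use raw_input instead of input)
linha = input("Enter the tweet: ")
# Remove "http" followed by any run of non-whitespace characters
URLless_string = re.sub(r"http\S+", "", linha)
print(URLless_string)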
  • http matches those literal characters
  • \S+ matches one or more non-whitespace characters (up to the end of the URL)
  • the match is replaced with an empty string
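
As a minimal sketch of the three points above, the substitution can be wrapped in a small helper. The function name strip_urls and the example.com URL are assumptions made for illustration, not part of the original answer. The whitespace cleanup is also an addition: removing a URL from the middle of a sentence otherwise leaves two adjacent spaces.

```python
import re

# Same pattern as above: "http" followed by non-whitespace characters.
URL_PATTERN = re.compile(r"http\S+")

def strip_urls(text):
    """Remove URLs and collapse the whitespace left behind (hypothetical helper)."""
    without_urls = URL_PATTERN.sub("", text)
    # Removing a URL mid-sentence leaves a double space; normalize it.
    return " ".join(without_urls.split())

# example.com is a placeholder URL, assumed for this example.
tweet = "joao was in the market https://example.com/abc I want this text to appear"
print(strip_urls(tweet))
```

Note that the pattern only catches links that start with http (which includes https). Links written without a scheme, such as those starting with www., would need a different pattern.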

Example online here.
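
Since the question mentions a .CSV file, the same substitution could be applied to every cell of every row. This is only a sketch: the in-memory sample data, its column layout, and the helper name remove_urls_from_rows are assumptions, and a real file would be opened with open(...) rather than io.StringIO.

```python
import csv
import io
import re

URL_PATTERN = re.compile(r"http\S+")

def remove_urls_from_rows(rows):
    """Yield each CSV row with URLs removed from every cell (hypothetical helper)."""
    for row in rows:
        # Strip the URL, then collapse any doubled spaces it leaves behind.
        yield [" ".join(URL_PATTERN.sub("", cell).split()) for cell in row]

# In-memory stand-in for a CSV file; the contents are made up for illustration.
sample = io.StringIO(
    "id,text\n"
    "1,joao was in the market https://example.com/abc I want this text to appear\n"
)
cleaned = list(remove_urls_from_rows(csv.reader(sample)))
for row in cleaned:
    print(row)
```

Parsing with the csv module first and cleaning each cell afterwards keeps \S+ from swallowing a delimiter and the following field when a URL sits directly before a comma on the raw line.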

    
26.06.2017 / 16:31