How can I read a file in Python and find the words that match a regular expression, e.g. dates (dd / mm / yyyy)?
You can do this by implementing a generator that uses the re library to search each line of the file for all words that match the pattern defined by the expression. An outline of this function would be:
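import re

def get_pattern_from_file(filename, expression):
    # Compile once so the same pattern object is reused for every line
    pattern = re.compile(expression)
    with open(filename) as stream:
        for line in stream:
            # Yield the list of all matches found on this line (may be empty)
            yield pattern.findall(line)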
You can then iterate over the value returned by the function and get, for each line, a list of all the words that match the expression. You can also flatten the result into a single list containing every match by using itertools.chain.
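As a rough illustration of both uses, here is a minimal sketch. The date pattern, the sample file contents, and the demo helper are assumptions made for this example, not part of the original answer. The pattern only checks the dd/mm/yyyy shape written without spaces; it does not validate that the date exists. Flattening is done with itertools.chain.from_iterable, the variant of itertools.chain that accepts an iterable of lists.

```python
import itertools
import os
import re
import tempfile


def get_pattern_from_file(filename, expression):
    # Same generator as in the answer: one list of matches per line.
    pattern = re.compile(expression)
    with open(filename) as stream:
        for line in stream:
            yield pattern.findall(line)


# Assumed pattern: two digits, two digits, four digits, separated by "/".
# It checks the shape only, so something like 99/99/9999 would also match.
DATE_PATTERN = r"\b\d{2}/\d{2}/\d{4}\b"


def demo():
    # Hypothetical sample file, created only for this example.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("Started on 01/02/2020 and ended on 15/03/2020.\n")
        tmp.write("No dates here.\n")
        tmp.write("Reviewed on 31/12/2021.\n")
        path = tmp.name
    try:
        # One list per line; lines without a match give an empty list.
        per_line = list(get_pattern_from_file(path, DATE_PATTERN))
        # A single flat list with every match in the file.
        flat = list(
            itertools.chain.from_iterable(
                get_pattern_from_file(path, DATE_PATTERN)
            )
        )
    finally:
        os.remove(path)
    return per_line, flat


if __name__ == "__main__":
    per_line, flat = demo()
    print(per_line)  # [['01/02/2020', '15/03/2020'], [], ['31/12/2021']]
    print(flat)      # ['01/02/2020', '15/03/2020', '31/12/2021']
```

Because the function is a generator, the file is read lazily, so the flattened form can also be consumed in a for loop without building the whole list in memory.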