This answer is somewhat outdated and lacks some details. I have corrected the most glaring problems and will rewrite it tonight.
You can filter your list in 3 possible ways:
Using the filter function
Using list comprehensions
Using generators
The filter function
To get straight to the answer: you can take advantage of the tools the language gives you. Since you want to filter the values in a list, nothing is better suited than filter.
Python:
>>> lista = [2, 3, 1, 5, 1, 7, 8, 8, 9, 15, 1, 1]
>>> lista = list(filter(lambda x: x != 1, lista))
>>> print(lista)
[2, 3, 5, 7, 8, 8, 9, 15]
filter returns an iterable containing the items of the second parameter for which function returns a true value.
iterable filter(function, iterable)
That is, when we do:
filter(myFilter, lista)
an iterator is returned over every value x of lista for which myFilter(x) returns true. Since you want to remove the values equal to 1, the only condition you need is x != 1. A lambda function was used because the condition is so simple, but nothing prevents you from doing:
>>> lista = [2, 3, 1, 5, 1, 7, 8, 8, 9, 15, 1, 1]
>>> def is_not_one(value):
...     return value != 1
...
>>> lista = list(filter(is_not_one, lista))
>>> print(lista)
[2, 3, 5, 7, 8, 8, 9, 15]
The result is exactly the same.
When to use?
The filter function is ideal when the filter logic is not trivial (that is, when it does not fit in a single line of code).
Basic test:
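As an illustration of filter logic that does not fit comfortably in a lambda, here is a minimal sketch. The predicate is_valid_reading and its rules (integers only, within a range, not equal to 1) are hypothetical and were invented for this example.

```python
def is_valid_reading(value):
    """Hypothetical predicate: keep integers from 0 to 100 that are not 1."""
    # Reject anything that is not a plain integer (bool is a subclass of int)
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    # Reject values outside the accepted range
    if value < 0 or value > 100:
        return False
    return value != 1


readings = [2, "3", 1, 5, None, 250, 8, -4, 15, 1]
valid_readings = list(filter(is_valid_reading, readings))
print(valid_readings)  # [2, 5, 8, 15]
```

Passing a named function keeps the call to filter short even as the rules grow.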
import random, time, sys
SIZE = 1000000
# Generate random integers from 0 to 8
lista = [random.randrange(0, 9) for _ in range(SIZE)]
START_TIME = time.time()
# Filter the list with the 'filter' function (lazy: this only creates the filter object)
list_filter = filter(lambda i: i != 1, lista)
FILTER_TIME = time.time()
print("Execution time:", FILTER_TIME - START_TIME)
print("Memory usage:", sys.getsizeof(list_filter))
Output:
Execution time: 5.9604644775390625e-06
Memory usage: 56
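The numbers above are small because filter is lazy: the timed line only creates the filter object, and sys.getsizeof reports the size of that object, not of the filtered data. The sketch below also times the consumption of the iterator, assuming a full list of results is what you need. It uses time.perf_counter instead of time.time, a substitution made here because it is better suited to short intervals.

```python
import random
import sys
import time

SIZE = 1000000
lista = [random.randrange(0, 9) for _ in range(SIZE)]

start = time.perf_counter()
lazy_result = filter(lambda i: i != 1, lista)  # no element has been tested yet
created = time.perf_counter()
full_result = list(lazy_result)                # the filtering happens here
consumed = time.perf_counter()

print("Creating the filter object:", created - start)
print("Consuming it into a list:", consumed - created)
print("Size of the resulting list:", sys.getsizeof(full_result))
```

The exact figures depend on the machine, so no output is shown.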
List Comprehensions
As answered by user Camilo Santos:
>>> lista = [2, 3, 1, 5, 1, 7, 8, 8, 9, 15, 1, 1]
>>> lista = [l for l in lista if l != 1]
>>> print(lista)
[2, 3, 5, 7, 8, 8, 9, 15]
When to use?
A list comprehension is ideal when the number of elements in the list is small and the filter logic is trivial (it can be written in one line). The new list is stored entirely in memory, so it can consume a lot of the machine's resources if it is very long.
Basic test:
import random, time, sys
SIZE = 1000000
# Generate random integers from 0 to 8
lista = [random.randrange(0, 9) for _ in range(SIZE)]
# Filter the list with a list comprehension
START_TIME = time.time()
list_comprehension = [i for i in lista if i != 1]
LIST_TIME = time.time()
print("Execution time:", LIST_TIME - START_TIME)
print("Memory usage:", sys.getsizeof(list_comprehension))
Output:
Execution time: 0.05048513412475586
Memory usage: 7731040
The whole filtered list is built immediately, which costs both memory and run time, so use it with caution on large lists.
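If the comprehension syntax is convenient but the full list is not needed at once, a generator expression is a lazy alternative with nearly the same syntax (parentheses instead of brackets). A small sketch comparing the two; the variable names are illustrative only.

```python
import sys

lista = list(range(100000))

as_list = [i for i in lista if i != 1]       # builds the whole list now
as_generator = (i for i in lista if i != 1)  # builds nothing until iterated

print(sys.getsizeof(as_list) > sys.getsizeof(as_generator))  # True
print(sum(as_generator) == sum(as_list))                      # True
```

Note that the generator expression can be iterated only once, while the list can be reused.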
Generator
Using a Python generator:
>>> def filter_by_generator(lista, value):
...     for i in lista:
...         if i != value: yield i
...
>>> lista = filter_by_generator(lista, 1)
>>> print(list(lista))
[2, 3, 5, 7, 8, 8, 9, 15]
When to use?
Generators should be used when the number of elements in the list is very large, no matter how complex the filter logic is (if the logic is simple, you can use filter and shorten the code).
Basic test:
import random, time, sys
SIZE = 1000000
# Generate random integers from 0 to 8
lista = [random.randrange(0, 9) for _ in range(SIZE)]
START_TIME = time.time()
# Filter the list with a generator (lazy: nothing is filtered until it is iterated)
def filter_by_generator(lista, value):
    for i in lista:
        if i != value: yield i
lista_generator = filter_by_generator(lista, 1)
GENERATOR_TIME = time.time()
print("Execution time:", GENERATOR_TIME - START_TIME)
print("Memory usage:", sys.getsizeof(lista_generator))
Output:
Execution time: 4.291534423828125e-06
Memory usage: 88
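Both the time to create the generator and the memory it occupies are far lower than for the list comprehension, and comparable to filter. This is because a generator does not compute the entire list at once; it produces each item on demand as it is iterated, so the cost of filtering is paid during iteration rather than at creation.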
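To see the on-demand behaviour, the sketch below adds a print call inside the generator from this answer and takes only the first three results with itertools.islice. The print call is there purely for illustration.

```python
from itertools import islice


def filter_by_generator(lista, value):
    for i in lista:
        print("testing", i)  # illustration only: shows when each item is examined
        if i != value:
            yield i


lista = [2, 3, 1, 5, 1, 7, 8, 8, 9, 15, 1, 1]
first_three = list(islice(filter_by_generator(lista, 1), 3))
print(first_three)  # [2, 3, 5]
```

Only the first four elements (2, 3, 1 and 5) are examined; the rest of the list is never touched.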