Is there a more efficient way to dynamically create an array from another array by filtering its contents first?

13

I have an array of values that may contain several numpy.nan entries:

import numpy as np
a = np.array([1, 2, np.nan, 4])

I want to iterate over its items to create a new array without the np.nan values.

The way I know to create arrays dynamically is to create an array of zeros (np.zeros()) and fill it with the content of interest afterwards.

The way I do it, I have to iterate over the array a twice: once to count how many np.nan values there are, so that I can subtract that number from the size of array b, and a second time to populate b:

# Count how many NaNs there are
count = 0
for e in a:
    if np.isnan(e):
        count += 1

# Create the empty array with the right size
size = a.shape[0]
b = np.zeros((size - count,))

# Populate the array with the relevant content
ind = 0
for e in a:
    if not np.isnan(e):
        b[ind] = e
        ind += 1

I imagine this could also be done by converting a to a list (since it is one-dimensional), filtering that list into a list b, and converting b back to an array.
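A minimal sketch of that list-based idea, assuming a is one-dimensional; the name b_list is only illustrative:

```python
import numpy as np

a = np.array([1, 2, np.nan, 4])

# Convert to a plain Python list, drop the NaN entries,
# then convert the filtered list back to an array.
b_list = [e for e in a.tolist() if not np.isnan(e)]
b = np.array(b_list)
```

This walks the data only once in Python code, but it still copies the values into a list and back.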

But is there a more efficient way to do this using only arrays?

    
asked by anonymous 15.12.2013 / 17:48

3 answers

14

You can filter the values by using an expression in the index:

import numpy as np
a = np.array([1, 2, np.nan, 4])

# Filter out NaN
filtrado = a[~np.isnan(a)]

The expression np.isnan(a) returns a boolean array indicating, for each position of a, whether or not the value is NaN. The ~ operator negates this array. Boolean indexing then selects only the elements for which ~np.isnan(a) is True.
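To make the intermediate steps visible, here is the same filter split into separate statements. This is only a sketch; the names mask and keep are illustrative and not part of the answer above:

```python
import numpy as np

a = np.array([1, 2, np.nan, 4])

mask = np.isnan(a)   # True where the element is NaN: [False, False, True, False]
keep = ~mask         # element-wise negation:         [True, True, False, True]
filtrado = a[keep]   # boolean indexing keeps the True positions: [1., 2., 4.]
```

The result is a new array, so a itself is left unchanged.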

    
15.12.2013 / 17:56
1

I think the standard solution to your problem is the filter function, whose syntax is:

filter(função_booleana, valor_interavel)

For each value in valor_interavel, filter calls função_booleana with that value and drops the value from the result if função_booleana returns false. Since you want to keep the values that are not NaN, combine it with a negated isnan like this:

filter(lambda x: not np.isnan(x), seu_array)

Best of all, the solution stays compact and clear. Note that you do not need to import any module to use filter, since it is a Python built-in.
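A hedged sketch of how this might look end to end, assuming Python 3 (where filter returns a lazy iterator rather than a list). The names sem_nan and resultado are illustrative:

```python
import numpy as np

seu_array = np.array([1, 2, np.nan, 4])

# filter() keeps the items for which the predicate returns true,
# so the predicate has to be "is not NaN".
sem_nan = filter(lambda x: not np.isnan(x), seu_array)

# Materialize the iterator as a NumPy array of floats.
resultado = np.fromiter(sem_nan, dtype=float)
```

Keep in mind that this calls the predicate once per element in Python, so on large arrays it will likely be slower than a vectorized NumPy expression.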

    
28.01.2014 / 19:57
0

You can use Python's set, which removes repeated items from a list.

>>> a = [1, 2, 3]
>>> b = a + [4, 5, 6, 3, 2, 1]
>>> print(b)
[1, 2, 3, 4, 5, 6, 3, 2, 1]
>>> print(set(b))
{1, 2, 3, 4, 5, 6}

I think this solves your problem!
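As a quick check of how set behaves on the array from the question, here is a small sketch. The names valores and tem_nan are illustrative:

```python
import math
import numpy as np

a = np.array([1, 2, np.nan, 4])

# set() drops repeated values; NaN is treated like any other element.
valores = set(a.tolist())

# Check whether a NaN is still present after deduplication.
tem_nan = any(math.isnan(v) for v in valores)
```

Here tem_nan comes out True: set removes duplicates but leaves the NaN in place, so it would have to be combined with a separate NaN filter.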

    
19.01.2014 / 02:27