Adding points in a range of known data

0

I'm doing data analysis in Python, training algorithms such as SVC and K-means. The training data have a fixed spacing between samples, since they are captured by an oscilloscope at a fixed sampling period. I also have data obtained from simulation which, for performance reasons, have variable spacing between samples and fewer points. This makes it difficult to use the two data sources in the same analysis. Is there a way to do this preprocessing of my simulation data with numpy or pandas?

An example of what should be done:

Simulation Array = [1.0, 2.0, 3.0]

Array Processed = [1.0, 1.5, 2.0, 2.5, 3.0]

    
asked by anonymous 19.02.2018 / 20:05

2 answers

1

I fully agree with the comment about changing the data that you will give as input. But I also know that we do not always have the data in the form we need. It would be more advisable to downsample one of the datasets, which avoids the errors introduced by the extra points that interpolation creates.

With that warning given, what you want is interpolation!

To do this, use numpy's interp function:

import numpy as np
x = np.linspace(0, 2*np.pi, 10)
y = np.sin(x)
xvals = np.linspace(0, 2*np.pi, 50)
yinterp = np.interp(xvals, x, y)

This example is the same as the one in the manual, which also explains the boundary conditions; these may be relevant depending on how you apply it.
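To make the boundary behaviour concrete, here is a small sketch with made-up sample values. By default np.interp does not extrapolate: query points outside the range of x receive the first or last y value. The optional left and right arguments override that, for example to flag out-of-range points with NaN.

```python
import numpy as np

# Made-up sample points, for illustration only
x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0, 30.0])

# Query points: one below the range, one inside, one above
xq = np.array([0.0, 1.5, 4.0])

# Default: out-of-range queries are clamped to y[0] and y[-1]
clamped = np.interp(xq, x, y)

# Alternative: return NaN outside the range so those points are easy to spot
flagged = np.interp(xq, x, y, left=np.nan, right=np.nan)

print(clamped)  # values: 10.0, 15.0, 30.0
print(flagged)  # values: nan, 15.0, nan
```

Flagging with NaN is probably the safer choice before training, since clamped values look like real samples.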

np.interp also works for downsampling.
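Applied to the question, a minimal sketch could look like the following. The first part reproduces the example from the question by interpolating over the sample index. The second part resamples unevenly spaced data onto a fixed time step; the names t_sim, v_sim and dt, and all of their values, are placeholders I made up, so replace them with your simulation output and the oscilloscope's sampling period.

```python
import numpy as np

# Part 1: the example from the question, interpolating over the sample index
simulation = np.array([1.0, 2.0, 3.0])
old_index = np.arange(len(simulation))             # 0, 1, 2
new_index = np.linspace(0, len(simulation) - 1, 5) # 0, 0.5, 1, 1.5, 2
processed = np.interp(new_index, old_index, simulation)
print(processed)  # values: 1.0, 1.5, 2.0, 2.5, 3.0

# Part 2: uneven time steps resampled onto a fixed step
# (hypothetical data; t_sim must be increasing for np.interp)
t_sim = np.array([0.0, 0.1, 0.4, 0.5, 1.0])
v_sim = np.array([0.0, 1.0, 4.0, 5.0, 10.0])
dt = 0.25  # assumed sampling period, placeholder value

# Build the fixed-step time axis covering the simulated interval
n_samples = int(round((t_sim[-1] - t_sim[0]) / dt)) + 1
t_uniform = t_sim[0] + dt * np.arange(n_samples)

# Linear interpolation of the simulated values onto the fixed grid
v_uniform = np.interp(t_uniform, t_sim, v_sim)
print(t_uniform)
print(v_uniform)
```

The same call handles downsampling: if t_uniform is coarser than the original time axis, np.interp simply evaluates the signal at fewer points. Note that it does no filtering, so fast variations between the new sample points are lost.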

    
21.02.2018 / 15:35
1

Following your tip, I read the NumPy and SciPy documentation. NumPy itself has a function called "interp" (as in the documentation above), but my preference is for the "scipy" package, which offers several kinds of interpolation, as in the following example:

First I import numpy and scipy:

import numpy as np
from scipy import interpolate

I now create the data:

data_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
data_y = [10.0, 20.0, 30.0, 40.0, 50.0, 70.0, 90.0, 100.0, 200.0, 300.0]

I now build the interpolator:

interp = interpolate.interp1d(data_x, data_y)

Keep in mind that the interp variable now contains an object that is able to do interpolation.

Finally, I interpolate the new data:

new_x = np.arange(1, 10, 0.1)
new_y = interp(new_x)

Now the variable new_y contains a numpy array, like this in the example case:

array([ 10.,  11.,  12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,
        20.,  21.,  22.,  23.,  24.,  25.,  26.,  27.,  28.,  29.,
        30.,  31.,  32.,  33.,  34.,  35.,  36.,  37.,  38.,  39.,
        40.,  41.,  42.,  43.,  44.,  45.,  46.,  47.,  48.,  49.,
        50.,  52.,  54.,  56.,  58.,  60.,  62.,  64.,  66.,  68.,
        70.,  72.,  74.,  76.,  78.,  80.,  82.,  84.,  86.,  88.,
        90.,  91.,  92.,  93.,  94.,  95.,  96.,  97.,  98.,  99.,
       100., 110., 120., 130., 140., 150., 160., 170., 180., 190.,
       200., 210., 220., 230., 240., 250., 260., 270., 280., 290.])
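As a sketch of the "several forms" mentioned above, interp1d takes a kind argument that selects the interpolation scheme. The snippet below reuses the same data and compares three kinds. The choice of "linear", "nearest" and "cubic" is mine, for illustration, and I use np.linspace for the new points so that the end point 10 is included exactly.

```python
import numpy as np
from scipy import interpolate

data_x = np.arange(1, 11)
data_y = np.array([10.0, 20.0, 30.0, 40.0, 50.0,
                   70.0, 90.0, 100.0, 200.0, 300.0])

# 91 evenly spaced points from 1 to 10, all inside the data range
new_x = np.linspace(1.0, 10.0, 91)

# Build one interpolator per kind and evaluate it on the new points
results = {}
for kind in ("linear", "nearest", "cubic"):
    f = interpolate.interp1d(data_x, data_y, kind=kind)
    results[kind] = f(new_x)

# Compare the three schemes halfway between x=9 and x=10
for kind, values in results.items():
    print(kind, values[85])
```

With the default settings, interp1d raises a ValueError for points outside the range of data_x, unlike np.interp, which clamps them.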

In addition, if you need to interpolate values in 1D, 2D, 3D, take a look at the docs:

Function in numpy: link

Function in scipy: link

All interpolations: link
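For the 2D case, one option in scipy.interpolate for data on a regular grid is RegularGridInterpolator. This is only a sketch: the grid and the test surface z = x + 2*y are made up, and your data may need a different tool if it is not on a regular grid.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Made-up regular grid and a simple test surface z = x + 2*y
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0])
xx, yy = np.meshgrid(x, y, indexing="ij")  # shape (len(x), len(y))
z = xx + 2.0 * yy

# Linear interpolation is the default method
surface = RegularGridInterpolator((x, y), z)

# Each row is one (x, y) query point inside the grid
points = np.array([[0.5, 0.5], [1.5, 0.25]])
z_new = surface(points)
print(z_new)  # values: 1.5, 2.0
```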

Thank you very much for the help.

    
22.02.2018 / 19:28