Are there any R functions similar to Excel's PROCV (VLOOKUP)?

7

In my case I have two data.frames:

> head(Trecho)
         Xt       Yt Zt
1 -75.56468 1.642710  0
2 -74.56469 1.639634  0
3 -73.56469 1.636557  0
4 -72.56470 1.633480  0
5 -71.56470 1.630403  0
6 -70.56471 1.627326  0

> head(TrechoSim)
        Xs        Ys Zs
1 -71.7856 -0.509196  0
2 -71.7856 -0.509196  0
3 -71.7856 -0.509196  0
4 -71.7856 -0.509196  0
5 -71.7856 -0.509196  0
6 -71.7856 -0.509196  0

The data frame Trecho has approximately 5,000 rows and TrechoSim has about 20,000 rows. As with Excel's PROCV, for each Xt I need to fetch the row of TrechoSim whose Xs is closest to it (in Excel I pass TRUE as the last argument, and it returns the first closest value). There is no tolerance on how close the match has to be. I need every row of Trecho paired with its closest row in TrechoSim. I've tried difference_inner_join, but it returns NA values on some rows.

Thank you,

    
asked by anonymous 18.05.2017 / 14:46

1 answer

4

I do not have the original datasets, or Excel installed to test the PROCV function, but I think the code below solves the problem.

The procura function calculates the difference, in absolute value, between a number and a vector and finds which position of the vector is closest to that number.
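As a small worked illustration of that calculation (the numbers are made up for the example and are not taken from the question's data):

```r
# Hypothetical values, chosen only to illustrate the calculation
x <- 2.4
y <- c(1, 2, 3, 5)

d   <- abs(x - y)     # distances: 1.4 0.4 0.6 2.6
pos <- which.min(d)   # position of the smallest distance: 2
y[pos]                # closest value in y: 2
```

When two positions are equally close, which.min returns the first of them.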

The code is not optimized, but I imagine it should run reasonably fast on current computers. I ran the same code with the simulated samples enlarged to 5,000 and 20,000 rows, and it took less than 2 seconds to make all comparisons.

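# Simulated data standing in for the original data frames
Trecho    <- data.frame(Xt=rnorm(5),  Yt=rnorm(5),  Zt=0)
TrechoSim <- data.frame(Xs=rnorm(20), Ys=rnorm(20), Zs=0)

# Position of the element of y that is closest to the number x
procura <- function(x, y){
  return(which.min(abs(x-y)))
}

# One index per row of Trecho, allocated up front
index <- integer(nrow(Trecho))

for (j in seq_along(Trecho$Xt)){
  index[j] <- procura(Trecho$Xt[j], TrechoSim$Xs)
}

Trecho
# Row j here is the closest match for row j of Trecho
TrechoSim[index, ]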
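
The same search can probably be written without the explicit loop, and the two data frames put side by side. The following is only a sketch on simulated data: the names resultado, closest_sorted and index2 are hypothetical, and the sorted variant based on findInterval is a different technique from the one I timed above, shown here as one possible option for larger inputs.

```r
set.seed(1)  # only so the simulated example is reproducible
Trecho    <- data.frame(Xt = rnorm(5),  Yt = rnorm(5),  Zt = 0)
TrechoSim <- data.frame(Xs = rnorm(20), Ys = rnorm(20), Zs = 0)

# Same search as the loop, written with vapply
index <- vapply(Trecho$Xt,
                function(x) which.min(abs(x - TrechoSim$Xs)),
                integer(1))

# One row per row of Trecho, with its closest TrechoSim row alongside
resultado <- cbind(Trecho, TrechoSim[index, ])
rownames(resultado) <- NULL

# Hypothetical helper: sort y once, then locate each x with findInterval
closest_sorted <- function(x, y) {
  ord <- order(y)
  ys  <- y[ord]
  # last sorted value <= x (clamped to 1 when x is below every y)
  lo  <- pmax(findInterval(x, ys), 1L)
  # its right-hand neighbour (clamped to the last position)
  hi  <- pmin(lo + 1L, length(ys))
  # keep whichever of the two neighbours is nearer to x
  pick <- ifelse(abs(x - ys[hi]) < abs(x - ys[lo]), hi, lo)
  ord[pick]  # map back to positions in the unsorted y
}

index2 <- closest_sorted(Trecho$Xt, TrechoSim$Xs)
```

Both versions should find matches at the same distance. When Xs contains repeated or equally distant values, the sorted variant may return a different position than which.min, which always returns the first one.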
    
18.05.2017 / 22:59