R code optimization

3

I wonder if anyone has any suggestions so I can optimize the code below.

The idea I took from this site . You have a permutation of n cards, for example [2, 4, 1, 3] (where 2 is the top card). On each round, the person must revert to the m first cards, where m is the top card. The rounds are repeated until the final permutation is [1, 2, 3, ..., n].

Exemplifying:

[2, 4, 1, 3]
[4, 2, 1, 3]
[3, 1, 2, 4]
[2, 1, 3, 4]
[1, 2, 3, 4]

I used the perm function of the Deducer library to get all possible permutations of a given vector.

#retorna a quantidade de mudanças necessárias para obter o vetor de 1 até n
topswops <- function(x){ 
  i <- 0
  while(x[1] != 1){
    primeira <- x[1]
    if(primeira == length(x)){
       x <- rev(x)
    } else{
      x <- c(x[primeira:1], x[(primeira+1):length(x)])
    }
    i <- i + 1
  }
  return(i)
}

library(Deducer)

inicio <- Sys.time()
resultado <- NULL
for(i in 1:10){
  A <- perm(1:i)
  A <- data.frame(A)
  A$flips <- apply(A, 1, topswops)
  resultado <- rbind(resultado, c(i, max(A$flips)))
}
fim <- Sys.time()
fim - inicio

The processing time was 3.74 minutes. I suspect that the part of the permutations A <- perm(1:i) is the one that takes the most time, but I could not think of any other way.

    
asked by anonymous 14.09.2018 / 18:12

1 answer

2

From what I've analyzed for your code, your suspicion is correct: the perm function is what is slowing down the code. I came to this conclusion by testing the runtime of the topswops function:

library(rbenchmark)
benchmark(topswops(sample(1:i, 10)), replications = 100000)
                       test replications elapsed relative user.self sys.self user.child sys.child
1 topswops(sample(1:i, 10))       100000   2.388        1     2.253    0.107          0         0

Repeat% with 100,000 times in random vectors of size 10, with elements from 1 to 10, took 2.388 seconds. That is, the bottleneck is not here.

That said, I looked for other ways to generate all possible permutations for a sequence of numbers. In addition to topswops , Deducer and combinat packets can also generate all permutations of a sequence of numbers. The results of my test were as follows:

library(Deducer)
inicio <- Sys.time()
resultado.Deducer <- NULL
for(i in 1:10){
  A <- perm(1:i)
  A <- data.frame(A)
  A$flips <- apply(A, 1, topswops)
  resultado.Deducer <- rbind(resultado.Deducer, c(i, max(A$flips)))
}
fim <- Sys.time()
fim - inicio
Time difference of 4.131159 mins

library(combinat)
inicio <- Sys.time()
resultado.combinat <- NULL
for(i in 1:10){
  A <- as.data.frame(matrix(unlist(permn(1:i)), ncol=i, byrow=TRUE))
  A$flips <- apply(A, 1, topswops)
  resultado.combinat <- rbind(resultado.combinat, c(i, max(A$flips)))
}
fim <- Sys.time()
fim - inicio
Time difference of 2.371664 mins

library(gtools)
inicio <- Sys.time()
resultado.gtools <- NULL
for(i in 1:10){
  A <- permutations(i, i, 1:i)
  A$flips <- apply(A, 1, topswops)
  resultado.gtools <- rbind(resultado.gtools, c(i, max(A$flips)))
}
Warning messages:
1: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
2: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
3: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
4: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
5: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
6: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
7: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
8: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
9: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
10: In A$flips <- apply(A, 1, topswops) : Coercing LHS to a list
fim <- Sys.time()
fim - inicio
Time difference of 1.849558 mins

Therefore, one way to optimize code is by precisely improving the permutations. In my test, the function that worked best was gtools , taking 1.849558 minutes to run. This equals 44% of the time used by the original function.

You may be able to further improve this performance by optimizing the generation of permutations. Since there are millions of replications, I believe that any improvement in the generation of permutations is something that counts for the end.

If the search is for the shortest execution time and not necessarily the best algorithm, it is also possible to shorten the execution time by parallelizing the code. But then I leave this exercise to the reader:)

    
14.09.2018 / 21:55