dplyr and gsub: how to replace part of one column with values from another

4

I have the following data frame:

xis <- data.frame(x1=c("**alo.123", "**alo.132", "**alo.199"), x2=c("sp", "mg", "rj"), x3=c(NA))

I would like to fill the column x3 using gsub as follows:

xis$x3[1] <- gsub("alo", xis$x2[1], xis$x1[1])
xis$x3[2] <- gsub("alo", xis$x2[2], xis$x1[2])
xis$x3[3] <- gsub("alo", xis$x2[3], xis$x1[3])

I would rather not use a for loop, and I know this can be done with mapply, like so:

xis$x3 <- mapply(gsub,"alo", xis$x2, xis$x1)

Is there a way to do this with dplyr? Something like:

xis <- mutate(xis, x3 = gsub("alo", x2, x1))
    
asked by anonymous 09.10.2014 / 16:20

2 answers

2

You cannot use gsub directly like this because it is vectorized only over x, not over the pattern or replacement arguments. When replacement is a vector, only its first element is used (with a warning), so every row would be replaced with sp.
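A minimal base R sketch of that behavior, using the same values as the question (the vectors x1 and x2 here are standalone copies, not columns of xis). The warning is suppressed only to keep the output clean:

```r
x1 <- c("**alo.123", "**alo.132", "**alo.199")
x2 <- c("sp", "mg", "rj")

# gsub warns that 'replacement' has length > 1 and then uses only x2[1],
# so "sp" is substituted in every element of x1.
res <- suppressWarnings(gsub("alo", x2, x1))
res
# [1] "**sp.123" "**sp.132" "**sp.199"
```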

What mapply does is vectorize the function over all of its arguments, and you can use mapply inside mutate without problems:

xis <- mutate(xis, x3 = mapply(gsub, "alo", x2, x1))
xis
         x1 x2       x3
1 **alo.123 sp **sp.123
2 **alo.132 mg **mg.132
3 **alo.199 rj **rj.199

In Daniel's answer, the str_replace function from stringr is basically doing the same thing: it vectorizes sub with mapply. Likewise, str_replace_all vectorizes gsub with mapply.
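The practical difference between the two is the difference between sub and gsub: sub replaces only the first match in each string, while gsub replaces all of them. A small base R sketch, with made-up strings (x and repl are illustrative, not from the question) that contain the pattern twice:

```r
x    <- c("alo-alo.1", "alo-alo.2")
repl <- c("sp", "mg")

# Element-wise replacement of the first match only (sub)
first_only  <- mapply(sub,  "alo", repl, x, USE.NAMES = FALSE)
# Element-wise replacement of every match (gsub)
all_matches <- mapply(gsub, "alo", repl, x, USE.NAMES = FALSE)

first_only
# [1] "sp-alo.1" "mg-alo.2"
all_matches
# [1] "sp-sp.1" "mg-mg.2"
```

In the question's data each string contains alo only once, so both variants give the same result there.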

If you want, you can create your own vectorized function with mapply or Vectorize (a wrapper around mapply) before using it inside mutate. For example:

gsub2 <- Vectorize(gsub) # vectorize gsub
xis <- mutate(xis, x3 = gsub2("alo", x2, x1))
xis
         x1 x2       x3
1 **alo.123 sp **sp.123
2 **alo.132 mg **mg.132
3 **alo.199 rj **rj.199
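The mapply route can be sketched the same way. The helper name gsub_each below is hypothetical, and the example uses only base R so that it runs without dplyr. Passing USE.NAMES = FALSE is a choice made here so that the result is a plain, unnamed character vector:

```r
# Hypothetical helper: apply gsub element-wise over pattern, replacement and x.
gsub_each <- function(pattern, replacement, x, ...) {
  # mapply recycles the shorter arguments and calls gsub once per element;
  # extra arguments such as fixed = TRUE are forwarded through MoreArgs.
  mapply(gsub, pattern, replacement, x,
         MoreArgs = list(...), USE.NAMES = FALSE)
}

xis <- data.frame(
  x1 = c("**alo.123", "**alo.132", "**alo.199"),
  x2 = c("sp", "mg", "rj"),
  stringsAsFactors = FALSE
)

xis$x3 <- gsub_each("alo", xis$x2, xis$x1)
xis$x3
# [1] "**sp.123" "**mg.132" "**rj.199"
```

The same helper should also work inside mutate, as in mutate(xis, x3 = gsub_each("alo", x2, x1)).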
    
09.10.2014 / 19:34

3

You can use str_replace from the stringr package like this:

library(dplyr)   # for mutate
library(stringr) # for str_replace
xis <- data.frame(x1=c("**alo.123", "**alo.132", "**alo.199"), x2=c("sp", "mg", "rj"), x3=c(NA))
xis  <- mutate(xis, x3 = str_replace(string = x1, pattern = "alo", replacement = x2))
xis
         x1 x2       x3
1 **alo.123 sp **sp.123
2 **alo.132 mg **mg.132
3 **alo.199 rj **rj.199
    
09.10.2014 / 19:21