How do I apply the str_detect (stringr) function to more than one variable at a time?


I want to filter my data frame based on two variables: via and city. The filter should match substrings of the values in these two variables. For example, I want to select the rows for those who took the first route (1via) and who live in Santa Monica (Santa Monica).

The substrings would be 1v and ca, for the variables via and city, respectively.

I tried to do this:

library(dplyr)
library(stringr)
library(magrittr)

df1<-data%>%
    filter(stringr::str_detect(via,'1v')%>%
               filter(stringr::str_detect(city,'ca')))

But it did not work.

In fact, I tried several combinations, but none of them gave the expected result.

Here is the dput output to help with an answer:

data=structure(list(bin = c(0, 0, 0, 0, 1, 1, 0, 0, 1, 1), group1 = c(1, 
2, 2, 1, 2, 1, 2, 1, 2, 1), missing = c(NA, 4, 5, NA, 7, 6, NA, 
NA, 4, 5), score1 = c(3, 2, 4, 4, 7, 6, 4, 3, 6, 7), valor = c(100, 
200, 321, 34, 3424, 2344, 4232, 43, 22, 22), gender = c("M", 
"M", "M", "M", "M", "F", "F", "F", "F", "F"), via = structure(c(2L, 
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), .Label = c("1via", "2via"
), class = "factor"), income = c(1605.52545357496, 1957.10460608825, 
3463.77286640927, 2241.49697413668, 2575.95523679629, 3004.28174249828, 
3458.30937661231, 1786.68619645759, 2065.093211364, 1561.55416276306
), city = c("San Francisco", "Santa Monica", "Santa Monica", 
"Santa Monica", "Santa Monica", "Hollywood", "Hollywood", "Hollywood", 
"Hollywood", "Hollywood"), desbloq = structure(c(10553, 9537, 
10553, 10553, 9212, 10658, 10957, 11822, 11822, 10188), class = "Date"), 
trans = structure(c(10556, 9541, 10555, 10554, 9218, 10660, 
10958, 11823, 11826, 10190), class = "Date")), .Names = c("bin", 
"group1", "missing", "score1", "valor", "gender", "via", "income", 
"city", "desbloq", "trans"), row.names = c(NA, -10L), class = "data.frame")
    
asked by anonymous 16.10.2018 / 04:23

1 answer


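There is an error in the code: the opening parenthesis of the first filter is only closed at the end of the second filter, so the second filter call ends up nested inside the first one. The asterisks in the code below mark the mismatched pair of parentheses: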

library(dplyr)
library(stringr)
library(magrittr)

df1<-data%>%
    filter*(*stringr::str_detect(via,'1v')%>%
               filter(stringr::str_detect(city,'ca'))*)*

The correct version is:

df1<-data%>%
    filter(stringr::str_detect(via,'1v'))%>%
    filter(stringr::str_detect(city,'ca'))
df1
  bin group1 missing score1 valor gender  via   income         city    desbloq      trans
1   0      2       4      2   200      M 1via 1957.105 Santa Monica 1996-02-11 1996-02-15
2   0      1      NA      4    34      M 1via 2241.497 Santa Monica 1998-11-23 1998-11-24

In addition, it is redundant to load a package with library(stringr) and then call its function as stringr::str_detect. Once the package has been loaded, the code is cleaner if the function is called directly by name:

df1<-data%>%
    filter(str_detect(via,'1v'))%>%
    filter(str_detect(city,'ca'))
df1
  bin group1 missing score1 valor gender  via   income         city    desbloq      trans
1   0      2       4      2   200      M 1via 1957.105 Santa Monica 1996-02-11 1996-02-15
2   0      1      NA      4    34      M 1via 2241.497 Santa Monica 1998-11-23 1998-11-24
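
The reverse also holds: if stringr is installed but not attached with library(), the stringr:: prefix is what makes the function reachable. A minimal sketch, again with a made-up stand-in data frame (toy is hypothetical, not the question's data):

```r
# Only dplyr is attached here; stringr is assumed to be installed
library(dplyr)

# Hypothetical stand-in data with the two relevant columns
toy <- data.frame(
  via  = c("1via", "2via", "1via"),
  city = c("Santa Monica", "Santa Monica", "Hollywood"),
  stringsAsFactors = FALSE
)

# The stringr:: prefix calls str_detect without attaching the package
res_ns <- toy %>%
  filter(stringr::str_detect(via, "1v")) %>%
  filter(stringr::str_detect(city, "ca"))

res_ns
```

So the choice is between library(stringr) plus bare function names, or no library() call plus the stringr:: prefix. Using both at once works but adds nothing.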
    
16.10.2018 / 12:13