How to remove a word from a string without changing larger words that contain it

6

I would like to remove a word from a string in R. I was doing the following:

> s <- "ele esta bem mas tambem esta triste"
> stringr::str_replace_all(s, "tambem", "")
[1] "ele esta bem mas  esta triste"

So far, so good. The problem comes when I only want to remove the word "bem" ("well") from the text.

> stringr::str_replace_all(s, "bem", "")
[1] "ele esta  mas tam esta triste"

In this case the word "tambem" ("too") gets cut down to "tam", which I do not want.

I thought about matching the word with a space on each side:

> stringr::str_replace_all(s, " bem ", " ")
[1] "ele esta mas tambem esta triste"

But then, if I searched for the word "ele" ("he"), it would not be removed, because it sits at the start of the string with no space before it. Is there a way to remove whole words without having to handle every such case by hand?

    
asked by anonymous 24.02.2016 / 14:30

2 answers

7

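I do not know R, but I know a little regex. In this specific case you can use the word-boundary anchor ( \b ) so that the pattern matches only the whole word bem. Note that inside an R string literal the backslash has to be doubled, otherwise "\b" is read as a backspace character:

stringr::str_replace_all(s, "\\bbem\\b", " ")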
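As a minimal sketch of the word-boundary idea, here is the same removal written with base R's gsub() and perl = TRUE instead of stringr (that substitution is mine, chosen only so the snippet has no package dependency). It also collapses the leftover double space, which the original question did not ask for but is usually wanted.

```r
s <- "ele esta bem mas tambem esta triste"

# "\\b" in an R string literal is the regex \b (word boundary).
# "tambem" is left alone: there is no boundary between "m" and "b".
out <- gsub("\\bbem\\b", "", s, perl = TRUE)

# Removing the word leaves two adjacent spaces; collapse them and trim the ends.
out <- trimws(gsub("\\s+", " ", out, perl = TRUE))
print(out)

# The same pattern also works for a word at the very start of the string,
# which the " word " (space-delimited) attempt could not handle.
out_ele <- trimws(gsub("\\s+", " ", gsub("\\bele\\b", "", s, perl = TRUE), perl = TRUE))
print(out_ele)
```

The example string contains only unaccented ASCII letters; accented words may need extra care depending on the regex engine and locale.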

Related:

What is a boundary (\b) in a regular expression?

    
24.02.2016 / 14:37
0

Using the expression suggested by @Molx, \b\s?bem\s?\b , but with the function gsub(). In an R string literal each backslash has to be doubled:

 gsub("\\b\\s?bem\\s?\\b", "", s)
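One thing to watch for: a pattern that also consumes the optional whitespace on both sides of the word may glue the neighbouring words together when the replacement is an empty string. A possible alternative, sketched below, is to remove only the word itself and normalise the whitespace afterwards. The helper name remove_words is hypothetical (it is not from either answer), and the sketch assumes the words are plain ASCII with no regex metacharacters.

```r
# Hypothetical helper: remove whole words from a string, then tidy the spaces.
remove_words <- function(text, words) {
  # Build \b(?:w1|w2|...)\b so only complete words match.
  # Assumes the words contain no regex metacharacters.
  pattern <- paste0("\\b(?:", paste(words, collapse = "|"), ")\\b")
  cleaned <- gsub(pattern, "", text, perl = TRUE)
  # Collapse runs of whitespace left by the removals and trim both ends.
  trimws(gsub("\\s+", " ", cleaned, perl = TRUE))
}

s <- "ele esta bem mas tambem esta triste"
res_bem <- remove_words(s, "bem")
res_two <- remove_words(s, c("ele", "bem"))
print(res_bem)
print(res_two)
```

Passing a character vector lets several words be removed in one call, which addresses the "without thinking of all the cases" part of the question.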
    
20.03.2016 / 21:53