Text mining with R (stringr)

1

I have a character vector of length 15 and I want to remove the first 70 characters and the last 200 characters of each element.

I tried the following code to remove the beginning of each string, but it did not work:

 texto2009a <- texto2009 %>% map(str_sub(., 1, 72) <- " ")
    
asked by anonymous 25.08.2018 / 22:22

2 answers

4

Building on @Giovani's answer, I wrote a small function to bridge the difference between what str_sub does and what the question asks.

From help("str_sub"), section Details:

Details

Substrings are inclusive - they include the characters at both start and end positions. str_sub(string, 1, -1) will return the complete substring, from the first character to the last.
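To make the inclusive bounds concrete, here is a minimal sketch on a short made-up string (the variable s is only an illustration, not the asker's data):

```r
library(stringr)

s <- "abcdef"

# Both bounds are inclusive: positions 2, 3 and 4 are returned.
str_sub(s, 2, 4)
# [1] "bcd"

# Negative positions count from the end; -1 is the last character.
str_sub(s, 1, -1)
# [1] "abcdef"

# Ending at -2 therefore drops exactly one trailing character, not two.
str_sub(s, 1, -2)
# [1] "abcde"
```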

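The question, however, asks (paraphrased by me):

remove the first m characters and the last n characters

It is therefore necessary to start at position m + 1 and end at position -(n + 1). In the function below the second count is passed as a negative number (for example -200), so the end position is ultimos - 1.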

library(stringr)

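# Drop the first `primeiros` characters and the last `abs(ultimos)` characters.
# `ultimos` is a negative offset from the end, as in str_sub().
str_sub_als <- function(s, primeiros = 70, ultimos = -200){
    str_sub(s, primeiros + 1, ultimos - 1)
}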

x <- c("1234567890", "abcdefghijklmnopqrstuvwxyz")

str_sub(x, 3, -4)
#[1] "34567"                 "cdefghijklmnopqrstuvw"

str_sub_als(x, 3, -4)
#[1] "456"                 "defghijklmnopqrstuv"
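As a rough check against the sizes in the question, the sketch below applies the same helper to 15 strings and then tries the replacement form of str_sub that the asker was reaching for. The vector texto is a hypothetical stand-in for the real data, and the assumption is that every string has more than 270 characters.

```r
library(stringr)

# Same helper as above: `ultimos` is a negative offset from the end.
str_sub_als <- function(s, primeiros = 70, ultimos = -200) {
  str_sub(s, primeiros + 1, ultimos - 1)
}

# Hypothetical data: 15 strings of 300 characters (70 "a", 30 "b", 200 "c").
texto <- rep(paste0(strrep("a", 70), strrep("b", 30), strrep("c", 200)), 15)

# str_sub() is vectorised over the strings, so no map() is needed.
resultado <- str_sub_als(texto)
unique(nchar(resultado))
# [1] 30

# Alternative: the replacement form, overwriting each end with "".
y <- texto
str_sub(y, 1, 70) <- ""      # drop the first 70 characters
str_sub(y, -200, -1) <- ""   # drop the last 200 characters
all(y == resultado)
# [1] TRUE
```

Both routes leave only the 30 middle characters of each element. Note that the replacement form modifies y in place, which is why it is run on a copy.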
    
25.08.2018 / 23:49
4

A made-up example that you can reproduce:

x<-c('Bem-vinda ao Stack Overflow em Português')

library(stringr)

str_sub(x, 2, -10) # start and end are inclusive positions: this drops the first 1 and the last 9 characters
#[1] "em-vinda ao Stack Overflow em "

Here 2 and -10 are the inclusive start and end positions, so each is one unit beyond the number of characters you want to remove: to drop the first m and the last n characters, use m + 1 and -(n + 1).
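A minimal sketch that takes the two counts directly as positive numbers and does the one-unit adjustment internally. The name drop_ends and its arguments m and n are hypothetical, not part of stringr, and the input uses plain ASCII strings for simplicity.

```r
library(stringr)

# Hypothetical helper: drop the first m and the last n characters of each string.
drop_ends <- function(s, m, n) {
  # str_sub() bounds are inclusive, so keep positions m + 1 through -(n + 1).
  str_sub(s, m + 1, -(n + 1))
}

x <- c("1234567890", "abcdefghijklmnopqrstuvwxyz")

drop_ends(x, 3, 4)
# [1] "456"                 "defghijklmnopqrstuv"

# With m = 0 and n = 0 the strings come back unchanged.
all(drop_ends(x, 0, 0) == x)
# [1] TRUE
```

For the question's case the call would be drop_ends(texto2009, 70, 200), assuming texto2009 is a character vector.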

    
25.08.2018 / 23:11