How do I turn a range written as text into a numeric sequence? (R)

I'm having trouble handling a TSE dataset. The code below imports it:

library(tidyverse)
locais_vot_SP <- read_delim("https://raw.githubusercontent.com/camilagonc/votacao_secao/master/locais_vot_SP.csv",
                        locale = locale(encoding = "ISO-8859-1"),
                        delim = ",",
                        col_names = FALSE) %>% 
              filter(X4 == "VINHEDO")

names(locais_vot_SP) <- c("num_zona", 
                      "nome_local",
                      "endereco",
                      "nome_municipio",
                      "secoes",
                      "secoes_esp")

As you can see, the secoes variable is not tidy: each cell aggregates several values.

secoes
196ª; 207ª; 221ª; 231ª;
197ª; 211ª; 230ª; 249ª;

With the following code, I started fixing the problem:

# Remove the ordinal marker "ª", the ";" in secoes_esp and the "Da " prefix,
# then split secoes into one row per ";"-separated entry.
locais_vot_SP <- locais_vot_SP %>%
  mutate(secoes = gsub("ª", "", secoes),
         secoes_esp = gsub("ª", "", secoes_esp),
         secoes_esp = gsub(";", "", secoes_esp),
         secoes = gsub("Da ", "", secoes)) %>%
  # separate_rows() already does the split, so no extra strsplit() step is needed
  separate_rows(secoes, sep = ";") %>%
  mutate(secoes = trimws(secoes))

So I got the following:

secoes
32 à 38
100
121

What remains to be solved are the cells of the form x à y. How do I get the following result?

secoes
32
33
34
35
36
37
38
...
    
asked by anonymous 25.04.2018 / 19:26

1 answer

A string of the form x non_number y, where x and y are integers, can be transformed into the sequence x:y as follows.

x <- "32 à 38"
y <- unlist(strsplit(x, "[^[:digit:]]+"))
y <- as.integer(y)
Reduce(':', y)
#[1] 32 33 34 35 36 37 38

This can easily be put into a function.

camila <- function(x){
    y <- unlist(strsplit(x, "[^[:digit:]]+"))
    y <- as.integer(y)
    Reduce(':', y)
}

camila("32 à 38")
#[1] 32 33 34 35 36 37 38

(Of course you should choose another name for the function.)
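
As a rough sketch of how the same idea might be applied to a whole secoes column, the base R code below expands every entry and repeats the other columns accordingly. The helper name expand_range and the toy data frame are assumptions made for illustration, not part of the original data or answer. It extracts the digits with regmatches() instead of strsplit(), so that an entry with a leading space (such as " 100") does not produce an empty first piece.

```r
# Hypothetical helper (not from the original answer): expand one entry,
# e.g. "32 à 38" or " 100", into an integer vector.
expand_range <- function(x) {
  # Extract every run of digits; surrounding spaces and words are ignored.
  nums <- as.integer(regmatches(x, gregexpr("[0-9]+", x))[[1]])
  if (length(nums) == 0) return(NA_integer_)  # no number found in the entry
  nums[1]:nums[length(nums)]                  # a single number yields itself
}

# Toy data: the secoes values come from the question,
# the nome_local labels are placeholders.
df <- data.frame(nome_local = c("A", "B", "C"),
                 secoes = c("32 à 38", " 100", "121"),
                 stringsAsFactors = FALSE)

expanded <- lapply(df$secoes, expand_range)   # one integer vector per row

# Repeat each row's label once per expanded value.
result <- data.frame(nome_local = rep(df$nome_local, lengths(expanded)),
                     secoes = unlist(expanded))
```

In the tidyverse pipeline from the question, the same helper could presumably be used with mutate(secoes = purrr::map(secoes, expand_range)) followed by tidyr::unnest(secoes), which keeps all the other columns automatically.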

    
25.04.2018 / 19:58