How to select data in R?

3

I have a data frame with 1441 rows. I need to split the rows into groups of 30 and compute the average of each group. Is there a command that does all of this automatically, something like "every 30 rows, create a new column and calculate the average"? I am currently separating the data by hand, which takes a long time. This is what I am doing:

primeiro = pott[1:30, c('GPP')]
segundo = pott[31:60, c('GPP')]

And so on, up to row 1441. This approach is not very practical!

Thanks in advance for any help.

    
asked by anonymous 18.09.2017 / 04:51

2 answers

5

Indeed, doing this by hand is not a good idea, and neither is having that many objects in the global environment (globalenv). It is better to keep these sub data.frames in a list, created for example with split.

set.seed(4577)  # because I will use 'rnorm' to create the data.frame

n <- 1441
pott <- data.frame(GPP = rnorm(n))

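# Grouping vector: 1 for rows 1-30, 2 for rows 31-60, and so on.
# It is truncated to n, so the last group holds the single leftover row.
fact <- rep(1:(1 + n %/% 30), each = 30)[seq_len(n)]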

lista_pott <- split(pott, fact)

Now, to do the calculations, we use the *apply family of functions.

medias <- sapply(lista_pott, function(x) mean(x$GPP))
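
As a minimal sketch of how the group means could be collected into a single table, the code below rebuilds the same sample data and stores the group number, the number of rows, and the mean side by side. The names group_id and summary_df are illustrative only and do not come from the question; vapply is used in place of sapply simply because it checks the type of each result.

```r
set.seed(4577)
n <- 1441
pott <- data.frame(GPP = rnorm(n))

# Illustrative grouping vector: 1 for rows 1-30, 2 for rows 31-60, and so on
group_id <- (seq_len(n) - 1) %/% 30 + 1

lista_pott <- split(pott, group_id)

# One mean and one row count per group
medias <- vapply(lista_pott, function(x) mean(x$GPP), numeric(1))
sizes  <- vapply(lista_pott, nrow, integer(1))

# Hypothetical summary table: one row per group of (up to) 30 observations
summary_df <- data.frame(
  group = as.integer(names(medias)),
  n     = sizes,
  mean  = medias
)

tail(summary_df, 3)
```

The n column makes the last group visible: with 1441 rows it contains a single observation, so its "mean" is just that one value.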
    
18.09.2017 / 10:53
4

Using Rui's sample data, another option is:

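tapply(pott$GPP, gl(ceiling(nrow(pott)/30), 30, length = nrow(pott)), mean)

Explanation: gl(ceiling(nrow(pott)/30), 30, length = nrow(pott)) creates a factor with one level per block of 30 rows. tapply then does the work of split and sapply in a single call, applying mean to the pott$GPP vector within each level. Because 1441 is not a multiple of 30, the number of levels is rounded up with ceiling and length = nrow(pott) trims the factor to the number of rows; without this, the factor is shorter than pott$GPP and tapply stops with an error. The last group then holds the single leftover row.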
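
To check that the tapply route and the split/sapply route give the same numbers, a small comparison can be run on the sample data. This is only a sketch: the names groups, by_tapply and by_split are illustrative, and the grouping factor is built with ceiling and the length argument of gl on the assumption that the leftover row should form its own final group.

```r
set.seed(4577)
n <- 1441
pott <- data.frame(GPP = rnorm(n))

# One label per row; the last level covers only the leftover row
groups <- gl(ceiling(n / 30), 30, length = n)

# Route 1: tapply splits and applies in one call
by_tapply <- tapply(pott$GPP, groups, mean)

# Route 2: explicit split followed by sapply
by_split <- sapply(split(pott$GPP, groups), mean)

# Compare the values only, ignoring names and array attributes
all.equal(as.vector(by_tapply), as.vector(by_split))
```

Both routes rely on the same grouping factor, so the choice between them is mostly a matter of taste: tapply is shorter, while the split list is handy if the groups are needed again for other calculations.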

    
19.09.2017 / 01:35