Aggregate function in R

Good afternoon. I'm using the aggregate function to group some data, but so far I am only summing one variable. I would like to sum more than one variable. Is that possible? This is what I have now:

TESTE = aggregate(VALOR ~ REFERENCIA + GRUPO_COPA + CIDADE, data = DADOS, FUN = sum)

I would like to sum the variable QTDE alongside VALOR, that is, get one more summed column, so that the result has the following columns:

REFERENCIA, GRUPO_COPA, CIDADE, VALOR, QTDE

Can this be done with aggregate or with some other function? Thank you.

Edit

Here is my example data, produced with dput:

structure(list(REFERENCIA = c("JAN_2017", "JAN_2017", "JAN_2017", "JAN_2017", "FEV_2017", "FEV_2017", "FEV_2017", "FEV_2017", "FEV_2017" ), GRUPO_COPA = c("AZUL", "AZUL", "AMARELO", "AMARELO", "VERDE", "VERDE", "VERDE", "AZUL", "AZUL"), CIDADE = c("SP", "SP", "SP", "SP", "RJ", "BSB", "BSB", "BSB", "SP"), VALOR = c(1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000), QTDE = c(1, 3, 5, 7, 9, 11, 13, 15, 17)), .Names = c("REFERENCIA", "GRUPO_COPA", "CIDADE", "VALOR", "QTDE"), row.names = c(NA, 9L), class = "data.frame")

I would like to group this dataset (with aggregate or something similar), summing the VALOR and QTDE columns.

r rstudio

asked by anonymous 05.12.2017 / 18:03

3 answers

score 5

I suggest using the dplyr package for this type of operation. Here is an example of the pattern, using the built-in mtcars data:

library(dplyr)

x <- mtcars %>%
  group_by(cyl, vs, am) %>%
  summarise(
    valor = sum(mpg),
    qtd = n()
  )

Inside group_by() you list the variables to group by (in your case REFERENCIA, GRUPO_COPA and CIDADE). Inside summarise() you give the calculations you want for each group. Note that n() returns the number of rows in each group; to total an existing column such as QTDE, use sum(QTDE) instead.
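
Applied to the data from the question, the same pattern might look like the sketch below. It assumes dplyr 1.0.0 or later (for the .groups argument) and rebuilds the question's data frame under the name dados; resultado is only an illustrative name.

```r
library(dplyr)

# Same rows as the dput() output in the question, rebuilt compactly
dados <- data.frame(
  REFERENCIA = rep(c("JAN_2017", "FEV_2017"), times = c(4, 5)),
  GRUPO_COPA = c("AZUL", "AZUL", "AMARELO", "AMARELO", "VERDE",
                 "VERDE", "VERDE", "AZUL", "AZUL"),
  CIDADE     = c("SP", "SP", "SP", "SP", "RJ", "BSB", "BSB", "BSB", "SP"),
  VALOR      = seq(1000, 9000, by = 1000),
  QTDE       = seq(1, 17, by = 2)
)

# Sum both columns within each REFERENCIA / GRUPO_COPA / CIDADE group
resultado <- dados %>%
  group_by(REFERENCIA, GRUPO_COPA, CIDADE) %>%
  summarise(VALOR = sum(VALOR), QTDE = sum(QTDE), .groups = "drop")

resultado  # six rows, one per group
```

Each extra column to total is just one more name = sum(column) argument inside summarise().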

A good reference to learn more about dplyr is the book R for Data Science, especially this chapter.

05.12.2017 / 21:10

score 0

It was not clear what result you wanted, so here are two possible solutions for the quantity column.

Summing the values multiplied by the quantities:

aggregate((VALOR * QTDE) ~ REFERENCIA + GRUPO_COPA + CIDADE, data = da, FUN = sum)

  REFERENCIA GRUPO_COPA CIDADE (VALOR * QTDE)
1   FEV_2017       AZUL    BSB         120000
2   FEV_2017      VERDE    BSB         157000
3   FEV_2017      VERDE     RJ          45000
4   JAN_2017    AMARELO     SP          43000
5   FEV_2017       AZUL     SP         153000
6   JAN_2017       AZUL     SP           7000

Summing the values multiplied by the quantities and also reporting the total quantity per group:

aggregate(cbind(valor = VALOR * QTDE, QTDE) ~ REFERENCIA + GRUPO_COPA + CIDADE, data = da, FUN = sum)

  REFERENCIA GRUPO_COPA CIDADE  valor QTDE
1   FEV_2017       AZUL    BSB 120000   15
2   FEV_2017      VERDE    BSB 157000   24
3   FEV_2017      VERDE     RJ  45000    9
4   JAN_2017    AMARELO     SP  43000   12
5   FEV_2017       AZUL     SP 153000   17
6   JAN_2017       AZUL     SP   7000    4

11.12.2017 / 21:20


score 2 · Accepted Answer

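A solution using aggregate() is to put . on the left side of the formula. The dot stands for every column of the data that does not appear on the right side, so here both VALOR and QTDE are summed: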

dados <- structure(list(REFERENCIA = c("JAN_2017", "JAN_2017", "JAN_2017", "JAN_2017", "FEV_2017", "FEV_2017", "FEV_2017", "FEV_2017", "FEV_2017" ), GRUPO_COPA = c("AZUL", "AZUL", "AMARELO", "AMARELO", "VERDE", "VERDE", "VERDE", "AZUL", "AZUL"), CIDADE = c("SP", "SP", "SP", "SP", "RJ", "BSB", "BSB", "BSB", "SP"), VALOR = c(1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000), QTDE = c(1, 3, 5, 7, 9, 11, 13, 15, 17)), .Names = c("REFERENCIA", "GRUPO_COPA", "CIDADE", "VALOR", "QTDE"), row.names = c(NA, 9L), class = "data.frame")
aggregate( . ~ REFERENCIA + GRUPO_COPA + CIDADE, FUN = sum, data = dados)
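
If the data frame had further columns that should not be summed, the dot would pick them up as well. A possible alternative, sketched here under that assumption, is to name the columns explicitly with cbind() on the left side of the formula. The data frame is rebuilt compactly, and por_ponto and por_cbind are illustrative names only.

```r
# Same rows as the dput() output above, rebuilt compactly
dados <- data.frame(
  REFERENCIA = rep(c("JAN_2017", "FEV_2017"), times = c(4, 5)),
  GRUPO_COPA = c("AZUL", "AZUL", "AMARELO", "AMARELO", "VERDE",
                 "VERDE", "VERDE", "AZUL", "AZUL"),
  CIDADE     = c("SP", "SP", "SP", "SP", "RJ", "BSB", "BSB", "BSB", "SP"),
  VALOR      = seq(1000, 9000, by = 1000),
  QTDE       = seq(1, 17, by = 2)
)

# Dot on the left: sums every column not used for grouping
por_ponto <- aggregate(. ~ REFERENCIA + GRUPO_COPA + CIDADE,
                       data = dados, FUN = sum)

# Explicit cbind(): sums only the columns named
por_cbind <- aggregate(cbind(VALOR, QTDE) ~ REFERENCIA + GRUPO_COPA + CIDADE,
                       data = dados, FUN = sum)

por_cbind  # six rows with columns REFERENCIA, GRUPO_COPA, CIDADE, VALOR, QTDE
```

With this data both calls give the same totals, since VALOR and QTDE are the only columns left over after grouping.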