Calculate the output of an operation between two dataframes conditionally

Question


Suppose I have these two dataframes:

set.seed(123)
df1<-data.frame(rep=rep(1:4,each=360),parc=rep(1:40,each=36),trat=rep(sample(1:10),each=36),tree=rep(1:36,40),med=1,dap_prev=rnorm(1440, mean = 12))
df2<-data.frame(med=rep(1:18,each=10),trat=rep(sample(1:10)),b0=rnorm(180),b1=rnorm(180))

I need to retrieve the values of df2$b0 and df2$b1 that match the criteria df1$med == df2$med and df1$trat == df2$trat. Then create a new column in df1 whose value is b0 + b1 * dap_prev.

I tried the command below, but it did not work:

df1$ddap_cm <- df2$b0[df2$med == df1$med & df2$trat == df1$trat] + df2$b1[df1$med == df2$med & df1$trat == df2$trat] * df1$dap_prev

Any help is welcome. Thank you.

EDIT:

I ended up finding a very simple solution with dplyr:

library(dplyr)
df1 <- left_join(df1, df2, by = c("med", "trat")) # copies the df2$b0 and df2$b1 columns that meet the criteria
df1$ddap_cm <- df1$b0 + (df1$b1*df1$dap_prev)
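
To make the behaviour of the join concrete, here is a minimal sketch on hypothetical toy tables (obs and coef are illustrative names standing in for df1 and df2, not part of the original data). It assumes dplyr is installed and shows that left_join keeps every row of the left table and fills b0 and b1 with NA where no (med, trat) pair matches.

```r
library(dplyr)

# Hypothetical toy tables: 'obs' plays the role of df1, 'coef' the role of df2
obs  <- data.frame(med = c(1, 1, 2), trat = c(1, 2, 1), dap_prev = c(10, 12, 14))
coef <- data.frame(med = c(1, 1), trat = c(1, 2), b0 = c(0.5, 1.0), b1 = c(2, 3))

# left_join keeps all rows of 'obs'; the (2, 1) pair has no match, so b0 and b1 are NA there
joined <- left_join(obs, coef, by = c("med", "trat"))

# same formula as in the question: b0 + b1 * dap_prev
joined$ddap_cm <- joined$b0 + joined$b1 * joined$dap_prev
```

The unmatched row ends up with NA in ddap_cm, so it is worth checking for NA after the join if every row is expected to have coefficients.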

r

asked by anonymous 04.10.2017 / 14:52

2 answers


score 1 · Answer 1

You have to create the new column first; only then can you assign the calculated values to it.

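# match() finds, for each row of df1, the df2 row with the same (med, trat) pair;
# comparing df1$med == df2$med directly would only compare the columns position by position
idx <- match(paste(df1$med, df1$trat), paste(df2$med, df2$trat))

df1$ddap_cm <- NA
df1$ddap_cm <- df2$b0[idx] + df2$b1[idx] * df1$dap_prev

Note: match() returns NA for rows of df1 whose (med, trat) pair does not occur in df2, so ddap_cm stays NA for those rows.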

score 1 · Answer 2

A proposal using base R. I changed the data frames to simulate a more realistic case, where some values do not appear in both data frames.

set.seed(123)
df1<-data.frame(rep=rep(1:4,each=360),parc=rep(1:40,each=36),trat=rep(sample(1:10),each=36),tree=rep(1:36,40),med=c(rep(1,1000),rep(4,400),rep(20,40)),dap_prev=rnorm(1440, mean = 12))
df2<-data.frame(med=rep(1:18,each=10),trat=rep(sample(1:15,10)),b0=rnorm(180),b1=rnorm(180))

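# flag the df2 rows whose med and trat values also occur in df1
df2$inddf1med <- 0
for (i in unique(df1$med)) df2$inddf1med[df2$med == i] <- 1
df2$inddf1trat <- 0
for (i in unique(df1$trat)) df2$inddf1trat[df2$trat == i] <- 1
sel <- df2[df2$inddf1med == 1 & df2$inddf1trat == 1, ]

# for each selected coefficient row, fill in the matching rows of df1
df1$res <- NA
for (i in seq_len(nrow(sel))) {
  selc <- df1$med == sel$med[i] & df1$trat == sel$trat[i]
  df1$res[selc] <- sel$b0[i] + sel$b1[i] * df1$dap_prev[selc]
}

# keep only the rows that received a value
df1f <- df1[!is.na(df1$res), ]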
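
The loop-based selection above amounts to an inner join on med and trat. As a hedged alternative, still in base R, merge() can express the same idea in one call. The sketch below uses hypothetical toy tables (obs and coef are illustrative names) and assumes each (med, trat) pair appears at most once in the coefficient table.

```r
# Hypothetical toy tables standing in for df1 and df2
obs  <- data.frame(med = c(1, 1, 20), trat = c(3, 7, 3), dap_prev = c(10, 12, 14))
coef <- data.frame(med = c(1, 1, 4), trat = c(3, 11, 7), b0 = c(0.5, 1, 2), b1 = c(2, 3, 4))

# merge() on both keys keeps only the (med, trat) pairs present in both tables,
# which mirrors dropping the NA rows at the end of the loop version
res <- merge(obs, coef, by = c("med", "trat"))

# only the (1, 3) pair matches, so a single row remains
res$res <- res$b0 + res$b1 * res$dap_prev
```

Note that merge() sorts the result by the key columns, so the row order can differ from the original table; passing all.x = TRUE keeps the unmatched rows with NA instead of dropping them.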