To predict with a model fitted by lm, you need a data frame containing the regressor variables at the points where you want predictions. The code below builds a subset of the data with the rows in which insulin is at or below the 1st quartile and FIDADE is in category 2.
Assuming the fitted model is this:
model <- lm(glucose ~ insulin + FIDADE, data = dados)
You can get a prediction interval with:
# 1st quartile of insulin
qq <- quantile(dados$insulin, probs = 0.25)
# Rows with insulin at or below the 1st quartile
i1 <- with(dados, insulin <= qq)
# Rows in FIDADE category 2
i2 <- with(dados, FIDADE == 2)
new <- dados[i1 & i2, c("insulin", "FIDADE")]
predict(model, newdata = new, interval = "prediction", level = 0.95)
# fit lwr upr
#9 108.6813 60.2474 157.1153
#11 118.9752 72.0415 165.9090
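The same approach works with hand-picked points, since newdata only needs columns named like the regressors in the model formula. Below is a minimal sketch of that idea. The insulin values 100 and 150 and the name new_manual are arbitrary illustrations, not part of the question, and the data is a copy of the dput output given at the end of this answer.

```r
# Data copied from the dput output at the end of this answer
dados <- data.frame(
  glucose = c(89, 78, 118, 126, 97, 158, 88, 145, 126, 187, 130,
              187, 128, 166, 143, 150, 136, 134, 173, 195, 145),
  insulin = c(94, 88, 230, 235, 140, 245, 54, 130, 22, 392, 79,
              200, 110, 175, 146, 342, 110, 60, 265, 145, 165),
  FIDADE = rep(1:4, times = c(7, 6, 4, 4))
)

model <- lm(glucose ~ insulin + FIDADE, data = dados)

# Hypothetical points of interest: insulin 100 and 150, both in FIDADE category 2
new_manual <- data.frame(insulin = c(100, 150), FIDADE = c(2L, 2L))

pred <- predict(model, newdata = new_manual,
                interval = "prediction", level = 0.95)

# Keep the inputs next to the fitted values and the interval bounds
result <- cbind(new_manual, pred)
print(result)
```

Binding the inputs to the output of predict makes it easier to see which interval belongs to which point, because the row names alone do not show the regressor values.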
Edit.
Following the request in the comments to simulate a 20% increase in the amplitude (range) of the insulin variable, the only real difficulty is building a data set in which the amplitude of insulin is 20% greater within each FIDADE category. (At least, that is the interpretation that makes the most sense to me.)
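# Range of insulin within each FIDADE category
rng <- with(dados, tapply(insulin, FIDADE, FUN = range))
# Widen each range by 10% of its width on both sides (20% in total)
rng <- lapply(rng, function(r){
  d <- diff(r)
  c(min(r) - 0.1*d, max(r) + 0.1*d)
})
# Repeat each category label once per range endpoint
tmp <- unlist(lapply(names(rng), function(n) rep(as.integer(n), length(rng[[n]]))))
nova_ampl <- data.frame(insulin = unlist(rng), FIDADE = tmp)
rm(rng, tmp)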
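The arithmetic of the widening step can be checked in isolation. For a range [a, b] with width d = b - a, the code above yields the endpoints a - 0.1d and b + 0.1d, so the new width is 1.2d. The sketch below assumes a hypothetical helper named widen_range, which is not part of the original code, and uses the insulin values of FIDADE category 1.

```r
# Hypothetical helper: widen a range symmetrically by a fraction `by` of its width
widen_range <- function(x, by = 0.2) {
  r <- range(x)
  d <- diff(r)
  c(r[1] - by / 2 * d, r[2] + by / 2 * d)
}

# insulin values for FIDADE == 1 in dados
insulin_cat1 <- c(94, 88, 230, 235, 140, 245, 54)

wide <- widen_range(insulin_cat1)

# Original range is 54 to 245 (width 191); widened range is 34.9 to 264.1
print(wide)

# Ratio of the new width to the old width, which should be 1.2
print(diff(wide) / diff(range(insulin_cat1)))
```

Note that the widened lower bound can fall below any observed value (or even below zero for other data), so the predictions at those endpoints are extrapolations.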
Now just pass this data frame to the newdata argument.
predict(model, newdata = nova_ampl, interval = "prediction", level = 0.95)
# fit lwr upr
#11 94.76547 45.69869 143.8323
#12 136.15787 87.45688 184.8589
#21 101.99931 52.22080 151.7778
#22 182.18353 128.06123 236.3058
#31 136.62942 89.30538 183.9535
#32 186.90710 135.84374 237.9705
#41 144.33280 93.69015 194.9755
#42 188.75920 138.68448 238.8339
Data in dput format.
dados <-
structure(list(glucose = c(89L, 78L, 118L, 126L, 97L,
158L, 88L, 145L, 126L, 187L, 130L, 187L, 128L, 166L,
143L, 150L, 136L, 134L, 173L, 195L, 145L),
insulin = c(94L, 88L, 230L, 235L, 140L, 245L,
54L, 130L, 22L, 392L, 79L, 200L, 110L, 175L, 146L,
342L, 110L, 60L, 265L, 145L, 165L),
FIDADE = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L)),
class = "data.frame", row.names = c(NA, -21L))