I once commented here on Stack Overflow about variable selection (see the link). The problem of variable selection is similar to the problem of model selection: we are trying to choose the simplest model that explains our data (in statistics, we always prefer the simplest adequate model to describe our data).
But a test like the one you want, based on sums of squares, requires the tested models to be nested. The trouble is that your models are not nested. It makes no sense to run a hypothesis test of the type H0: the reduced model holds vs. H1: the full model holds, because they are not simpler and more complex versions of the same model. The nonlinear functions defined by the `fct = BC.4()` and `fct = LL.3()` arguments are different. Therefore, from the theoretical point of view of nonlinear models (see Bates and Watts, Nonlinear Regression Analysis and Its Applications (1988), pp. 103-104), the test you are trying to apply makes no sense. It can be carried out numerically, because the sums of squares of both models can be computed, but such a test has no theoretical support.
What can be done is to compare two nested models. For example,
```r
lett.BC5 <- drm(weight ~ conc, data = lettuce, fct = BC.5())
lett.BC4 <- drm(weight ~ conc, data = lettuce, fct = BC.4())
```
The only difference between the nonlinear functions specified by `fct = BC.5()` and `fct = BC.4()` is that `BC.5()` has one more parameter:
```r
summary(lett.BC5)

Model fitted: Brain-Cousens (hormesis) (5 parms)

Parameter estimates:

              Estimate Std. Error t-value   p-value
b:(Intercept) 1.502065   0.352231  4.2644  0.002097 **
c:(Intercept) 0.280173   0.248569  1.1271  0.288836
d:(Intercept) 0.963030   0.078186 12.3171 6.164e-07 ***
e:(Intercept) 1.120457   0.612908  1.8281  0.100799
f:(Intercept) 0.988182   0.776136  1.2732  0.234846
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error:

 0.1149117 (9 degrees of freedom)
```
```r
summary(lett.BC4)

Model fitted: Brain-Cousens (hormesis) with lower limit fixed at 0 (4 parms)

Parameter estimates:

              Estimate Std. Error t-value   p-value
b:(Intercept) 1.282812   0.049346 25.9964 1.632e-10 ***
d:(Intercept) 0.967302   0.077123 12.5423 1.926e-07 ***
e:(Intercept) 0.847633   0.436093  1.9437   0.08059 .
f:(Intercept) 1.620703   0.979711  1.6543   0.12908
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error:

 0.1117922 (10 degrees of freedom)
```
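Concretely, and if I remember the `drc` parameterization correctly (treat this as a sketch, not gospel), the Brain-Cousens model fitted by `BC.5()` is

$$f(x) = c + \frac{d - c + f\,x}{1 + \exp\{b\,[\log(x) - \log(e)]\}},$$

while `BC.4()` fixes the lower limit at $c = 0$, giving

$$f(x) = \frac{d + f\,x}{1 + \exp\{b\,[\log(x) - \log(e)]\}}.$$

That fixed parameter is exactly the `c:(Intercept)` row that appears in the first summary and not in the second, and it is why the two models are nested.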
In this way, you can compare the `lett.BC5` and `lett.BC4` models according to their sums of squares, using the hypothesis test defined above:
```r
anova(lett.BC5, lett.BC4)

1st model
 fct:      BC.4()
2nd model
 fct:      BC.5()

ANOVA table

          ModelDf     RSS Df F value p value
1st model      10 0.12498
2nd model       9 0.11884  1  0.4644  0.5127
```
(see `?anova.drc` for more information)
Since the p-value is greater than 0.05, we can say that the models are not different from each other, and so we opt for `lett.BC4`, which is simpler.
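As a sanity check, the F statistic in that ANOVA table can be reproduced by hand from the two residual sums of squares via the extra-sum-of-squares formula. A minimal sketch in Python (assuming SciPy is available; the RSS and degrees-of-freedom values are copied from the output above, so the results match only up to rounding):

```python
from scipy.stats import f as f_dist

# Residual sums of squares and degrees of freedom, copied from the ANOVA table
rss_bc4, df_bc4 = 0.12498, 10  # simpler model, BC.4()
rss_bc5, df_bc5 = 0.11884, 9   # fuller model, BC.5()

# Extra-sum-of-squares F statistic for comparing nested models
f_value = ((rss_bc4 - rss_bc5) / (df_bc4 - df_bc5)) / (rss_bc5 / df_bc5)

# p-value from the F distribution with (1, 9) degrees of freedom
p_value = f_dist.sf(f_value, df_bc4 - df_bc5, df_bc5)

# Both should land close to the 0.4644 and 0.5127 reported above;
# small differences come from the rounded RSS values
print(round(f_value, 4), round(p_value, 4))
```

This is the same computation `anova.drc` performs, which is why the test only makes sense when one model is a restriction of the other.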
Note that I have not answered the main question. You may be interested in comparing the `LL` and `BC` function families and deciding which family best fits your data. Unfortunately, I do not know of any statistical method, such as a hypothesis test, to solve this problem. I can offer the following two suggestions on how to decide between `LL` and `BC`:
1) Choose the best possible model within each of the `LL` and `BC` families, using the methodology above. With the best model of each family chosen, analyze the residuals of the two models and, based on that residual analysis, see which model violates fewer assumptions.
2) Make a conscious choice. Look in the literature of your field to see whether models with `LL` (log-logistic) or `BC` (Brain-Cousens modified log-logistic) functions are the most used, and why. Or, since you are doing a parametric fit of the data, state that you will use one of these two options because of its interpretability or because your data behave like one of them. Or try some other function, such as the Weibull, because your results may turn out even better.
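For the residual analysis in suggestion 1, one simple check is a normality test on each model's residuals. A minimal Python sketch (the residual vectors here are made-up placeholders, not the actual lettuce fits; in R you would extract them with `residuals(lett.BC4)` and compare the models with, e.g., `shapiro.test`):

```python
from scipy.stats import shapiro

# Hypothetical residual vectors for two candidate models (placeholders only,
# not values from the real fits)
resid_bc = [0.02, -0.11, 0.05, 0.08, -0.03, -0.07, 0.10, -0.04, 0.01, -0.01]
resid_ll = [0.15, -0.20, 0.01, 0.18, -0.02, -0.16, 0.22, -0.10, 0.03, -0.11]

# Shapiro-Wilk tests the normality assumption of the residuals; a larger
# p-value means less evidence against normality
for name, res in [("BC", resid_bc), ("LL", resid_ll)]:
    stat, p = shapiro(res)
    print(name, round(p, 3))
```

Alongside a normality check, plotting residuals against fitted values to look for structure is usually the more informative diagnostic.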