I have a large database of 1800 rows and 50 columns.
Of these 50 columns, 3 (Density, Biomass and BMI) are response variables. I need to compare each of them, one at a time, with the other variables. I am running normality tests, ANOVA, Tukey, etc., as an exploratory analysis to describe the relationships between them. Someone is helping me, but that person is often very busy, and I would like to be able to solve the problems that stop me from moving forward on my own. I should also mention that I am a beginner with R, though an enthusiastic one.
LET'S GO TO THE PROBLEM ....
I run all the tests with Density (at least 10 of them), but when I change the response variable to Biomass, R still returns the results for Density.
So we created a subset for the other response variables (Biomass and BMI).
But here comes the problem: in some cases I have variables that account for less than 1% of the data and have an "N" of less than 5.
I need to remove these from the analysis, and I do not know how to do that in the subset we created.
In the case of the first response variable (Density), we use the command: %in% c("variable to be removed from the analysis")
However, in the case of the subset the script changes, and I do not know how to insert this command or an equivalent one.
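To make clearer what I am trying to achieve, here is a minimal sketch with made-up data. The data frame `dat`, the grouping column `Site`, and the object names `rare` and `biomass_sub` are all hypothetical and are not from my real script. The 1% and N < 5 thresholds are the ones described above. The idea is to negate `%in%` inside `subset()` so that the rare entries are excluded, and then to call `droplevels()` so the empty levels do not reappear in later tests.

```r
# Hypothetical toy data: 'Site' stands in for one explanatory variable,
# 'Biomass' for one of the three response variables.
set.seed(1)
dat <- data.frame(
  Site    = factor(c(rep("A", 200), rep("B", 150), rep("C", 3))),
  Biomass = rnorm(353)
)

# Count the rows per level and flag the rare ones:
# fewer than 5 rows AND less than 1% of the data.
n_per_level <- table(dat$Site)
is_rare <- n_per_level < 5 & n_per_level / nrow(dat) < 0.01
rare <- names(n_per_level)[is_rare]

# Keep only the rows whose level is NOT in the rare set.
biomass_sub <- subset(dat, !(Site %in% rare), select = c(Site, Biomass))

# Drop the now-empty factor levels so they do not appear in later output.
biomass_sub <- droplevels(biomass_sub)

table(biomass_sub$Site)
```

If the entries to remove are already known by name, `rare` could presumably be replaced by a hand-written vector such as c("C"), which is the same form as the command we used for Density.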