I have a set of a little over 300 spreadsheets, and I have to write a function with three arguments: the directory where the spreadsheets are, the variable to analyze, and the files to analyze.
In this case there are two variables of interest: the sulfate concentration and the nitrate concentration.
I have managed to write the function with two arguments, as two separate versions: one returns the mean sulfate and the other returns the mean nitrate.
Here is the code:
pollutant_sulfate <- function(directory, ID = 1:332) {
  files_list <- list.files(directory, full.names = TRUE)
  data <- data.frame()
  for (i in ID) {
    data <- rbind(data, read.csv(files_list[i]))
  }
  subset_sulfate <- subset(data$sulfate, data$sulfate > 0)
  mean(subset_sulfate)
}
pollutant_nitrate <- function(directory, ID = 1:332) {
  files_list <- list.files(directory, full.names = TRUE)
  data <- data.frame()
  for (i in ID) {
    data <- rbind(data, read.csv(files_list[i]))
  }
  subset_nitrate <- subset(data$nitrate, data$nitrate > 0)
  mean(subset_nitrate)
}
Now I am having difficulty with the third argument, which determines which variable I want to analyze (sulfate or nitrate). I thought about using an if condition.
I wrote some code that contains errors, and I cannot work out what the problem is. Here is the code in question:
mean_pollutant1 <- function(directory, pollutant, ID = 1:332) {
  files_list <- list.files(directory, full.names = TRUE)
  data <- data.frame()
  for (i in ID) {
    data <- rbind(data, read.csv(files_list[i]))
  }
  if (pollutant == sulfate) {
    subset_sulfate <- subset(data$sulfate, data$sulfate > 0)
    mean(subset_sulfate)
  }
  if (pollutant == nitrate) {
    subset_nitrate <- subset(data$nitrate, data$nitrate > 0)
    mean(subset_nitrate)
  }
}
When I try to call the function, I get this error message:
mean_pollutant1("specdata", sulfate, 1:2)
Error in mean_pollutant1("specdata", sulfate, 1:2) : object 'sulfate' not found
Can anyone help me get around the problem?
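
For reference, here is a minimal sketch of the single function I am aiming for. It is not a confirmed solution. The error appears to come from R looking up the unquoted `sulfate` as an object, so this sketch assumes the pollutant is passed as a quoted string and used to pick the column with `data[[pollutant]]`. The name `mean_pollutant` and the demo directory and files are made up for illustration; the real files are assumed to have `sulfate` and `nitrate` columns, as in the code above.

```r
# Sketch only: `mean_pollutant` and the demo files below are illustrative names.
mean_pollutant <- function(directory, pollutant, ID = 1:332) {
  # `pollutant` is expected as a quoted string: "sulfate" or "nitrate".
  # match.arg() stops with an error for any other value.
  pollutant <- match.arg(pollutant, c("sulfate", "nitrate"))
  files_list <- list.files(directory, full.names = TRUE)
  # Read only the requested files and stack them into one data frame.
  data <- do.call(rbind, lapply(files_list[ID], read.csv))
  # Select the column by its name held in the string `pollutant`.
  values <- data[[pollutant]]
  # Same filter as before (values > 0); which() also drops NA entries.
  mean(values[which(values > 0)])
}

# Small made-up data set in a temporary directory, only to exercise the function.
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, 3, NA), nitrate = c(2, NA, 6)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(5, NA), nitrate = c(7, 9)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

mean_pollutant(demo_dir, "sulfate", 1:2)  # [1] 3
mean_pollutant(demo_dir, "nitrate", 1)    # [1] 4
```

With this shape the call would be `mean_pollutant("specdata", "sulfate", 1:2)`, with quotes around the pollutant name. Using one column lookup instead of two `if` blocks also avoids the situation where the result of the first `if` is discarded because only the last expression in a function is returned.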