Hello, I have a slightly unbalanced dataset and wanted to run some tests with SMOTE, but I'm getting an error:
library(DMwR)
treinoSmote <- SMOTE(TARGET ~ ., m, k = 5, perc.over = 100, perc.under = 200)
Error in factor(newCases[, a], levels = 1:nlevels(data[, a]), labels = levels(data[, : invalid 'labels'; length 0 should be 1 or 2
My TARGET is already a factor. I have tried coding it as 1 and 0, as S and N (yes and no), and so on, and I always get this error.
My dataset consists of integer, factor, and numeric features. There are about 20 at the moment.
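For reference, here is a minimal sketch of how the column types can be listed. The toy data frame below is hypothetical and only stands in for my real `m` (the column names other than TARGET are made up). The idea is to confirm that the object is a plain data frame and that every column really is integer, numeric, or factor, and not, for example, character.

```r
# Hypothetical stand-in for the real dataset `m`; only TARGET matches the post
m <- data.frame(
  age    = c(23L, 35L, 41L, 29L),
  income = c(1200.5, 3100.0, 2750.25, 1980.0),
  region = factor(c("N", "S", "S", "N")),
  TARGET = factor(c("S", "N", "N", "S")),
  stringsAsFactors = FALSE
)

# Class of the object itself (a plain data frame reports just "data.frame")
print(class(m))

# First class of every column
col_classes <- sapply(m, function(x) class(x)[1])
print(col_classes)

# Names of columns that are neither integer, numeric, nor factor
other_cols <- names(col_classes)[!col_classes %in% c("integer", "numeric", "factor")]
print(other_cols)
```

If `other_cols` is not empty on the real data, those columns would be the first ones I would look at.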
The only advice I can find online is that the target should be a factor and so on, but that's it!
I ran the example from SMOTE's own documentation with the iris dataset and it works fine. I checked the type of the target variable there and it is a factor as well. I do not understand why I'm getting this error.
data(iris)
data <- iris[,c(1,2,5)]
data$Species <- factor(ifelse(data$Species == "setosa", "rare", "common"))
table(data$Species)
common   rare
   100     50
newData <- SMOTE(Species ~ ., data, perc.over = 600, perc.under = 100)
table(newData$Species)
common   rare
   300    350