I have a data frame (TablaTotal) whose factor column Enfermedad has the following levels:

> levels(TablaTotal$Enfermedad)
[1] "Disferlina" "OPMD"       "Laminas"    "Pompe"      "Fshd"       "SarcoG"     "Calpaina"   "Distrofina" "ANO5"

And I want to filter it by levels. The levels to filter are chosen in a checkboxGroupInput (TablasSeleccionadas):

checkboxGroupInput("TablasSeleccionadas", 
   h3("Entrenar para:"),
   choices = list("Disferlina" = "Disferlina", 
                  "OPMD" = "OPMD", 
                  "Laminas" = "Laminas",
                  "Pompe"="Pompe",
                  "Fshd"="Fshd",
                  "SarcoG"="SarcoG",
                  "Calpaina"="Calpaina",
                  "Distrofina"="Distrofina",
                  "ANO5"="ANO5")
)

I create the subset SubTablaTotal through the subset() function:

SubTablaTotal <<- subset(TablaTotal,Enfermedad %in%c(input$TablasSeleccionadas))

The result appears valid: when I inspect the data frame with View(SubTablaTotal), it only contains rows for the levels chosen in the checkboxes. But when I check the levels, I get the following result:

> levels(SubTablaTotal$Enfermedad)
[1] "Disferlina" "OPMD"       "Laminas"    "Pompe"      "Fshd"       "SarcoG"     "Calpaina"   "Distrofina" "ANO5"

I did not expect all the levels to still be there. Then, when creating a data partition with createDataPartition(), I get the following warning:

Warning in createDataPartition(SubTablaTotal$Enfermedad, p = input$indiceP, : Some classes have no records ( Fshd, SarcoG, Calpaina, Distrofina, ANO5 ) and these will be ignored

And when trying to train a model I get the following error:

Error in : One or more factor levels in the outcome has no data: 'Fshd', 'SarcoG', 'Calpaina', 'Distrofina', 'ANO5'

Am I missing something about how subset() works?

Thanks in advance.

Pepv

1 Answer

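Following A. Suliman's comment: subset() filters the rows but keeps every level of the original factor, even those that no longer have any rows. Rebuilding the factor drops the unused levels, which fixes the issue:

SubTablaTotal$Enfermedad <- factor(SubTablaTotal$Enfermedad)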
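To make the behaviour easier to reproduce outside Shiny, here is a minimal sketch on toy data. The data frame contents, the `valor` column, and the names `seleccion`, `sub_raw`, `sub_factor` and `sub_drop` are made up for illustration; `seleccion` stands in for `input$TablasSeleccionadas`. It also shows base R's `droplevels()`, which should give the same result as re-calling `factor()` and handles every factor column of a data frame at once.

```r
# Toy stand-in for TablaTotal (hypothetical data, not the asker's real table)
TablaTotal <- data.frame(
  Enfermedad = factor(c("Disferlina", "OPMD", "Laminas", "Pompe", "Fshd")),
  valor = 1:5
)

# Stands in for input$TablasSeleccionadas in the Shiny app (assumption)
seleccion <- c("Disferlina", "OPMD")

# subset() keeps only the matching rows, but the factor keeps all 5 levels
sub_raw <- subset(TablaTotal, Enfermedad %in% seleccion)
nrow(sub_raw)
nlevels(sub_raw$Enfermedad)

# Option 1: rebuild the factor so only the levels still present remain
sub_factor <- sub_raw
sub_factor$Enfermedad <- factor(sub_factor$Enfermedad)
levels(sub_factor$Enfermedad)

# Option 2: droplevels() removes unused levels from all factor columns
sub_drop <- droplevels(sub_raw)
levels(sub_drop$Enfermedad)
```

With either option, the outcome passed to createDataPartition() no longer carries levels that have no rows, which is what the warning and the training error complain about.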
Pepv