I'm plotting three metrics (RMSE, MAE, Rsquared) for three trained models, comparing the training-set values against the test-set values. I'm trying to show that the Neural Network model is the best because:

  • it has the smallest gap between train and test metrics
  • it also has low enough RMSE/MAE and a high enough Rsquared

This is not obvious from the attached plot. Also, the metrics are on different scales, since Rsquared lies in the [0, 1] interval. Is there a way to plot this better, preferably on a single plot?

> trn
            model RMSE Rsquared  MAE dataType
1      Linear Reg 9.17     0.51 6.03    train
2      SVM Radial 7.86     0.64 4.86    train
3 Neural Networks 8.55     0.57 5.59    train
> tst
            model RMSE Rsquared  MAE dataType
1      Linear Reg 9.40     0.53 5.95     test
2      SVM Radial 9.16     0.55 5.50     test
3 Neural Networks 8.66     0.60 5.48     test

Reproducible code:

trn <- structure(list(model = c("Linear Reg", "SVM Radial", "Neural Networks"),
                      RMSE = c(9.17, 7.86, 8.55), Rsquared = c(0.51, 0.64, 0.57),
                      MAE = c(6.03, 4.86, 5.59)),
                 row.names = c(NA, -3L), class = "data.frame")

tst <- structure(list(model = c("Linear Reg", "SVM Radial", "Neural Networks"),
                      RMSE = c(9.4, 9.16, 8.66), Rsquared = c(0.53, 0.55, 0.6),
                      MAE = c(5.95, 5.5, 5.48)),
                 row.names = c(NA, -3L), class = "data.frame")

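# Label each table so the two can be stacked and still told apart
trn$dataType <- "train"
tst$dataType <- "test"

library(dplyr)    # %>%
library(tidyr)    # pivot_longer()
library(ggplot2)

# One row per model / dataType / metric
long_tbl <- rbind(trn, tst) %>%
  pivot_longer(cols = !c(model, dataType), names_to = "metric", values_to = "value")

ggplot(long_tbl, aes(x = model, y = value, shape = dataType, colour = metric)) +
  geom_point()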
gregV

1 Answer

The easiest way to do this would be to add `+ facet_wrap(~metric, scales="free")`, but I don't think that satisfies your "in one plot" requirement (it's one plot statement, but three sub-plots). If `gg1` is your original plot, then this is a pretty compressed format:

print(gg1
    + facet_wrap(~metric, scales="free_y", ncol=1)   # one panel per metric, each with its own y axis
    + theme_bw()
    + theme(panel.spacing=grid::unit(0,"lines"),     # no gap between panels
            strip.background=element_blank(), strip.text.x=element_blank())  # hide strip labels
)

Anything more compressed than this would require you to make some decisions about what information to throw away (e.g., are you willing to rescale all of your metrics to min=0, max=1, or does the magnitude of the differences convey information?).
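If you did decide to rescale, a minimal sketch might look like the following. It rebuilds the question's data, then min-max scales each metric separately so that everything fits on a single [0, 1] axis. The names `scaled_tbl`, `scaled` and `gg_scaled` are made up for this illustration, and note that this throws away the original units and magnitudes, which is exactly the trade-off mentioned above.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Data copied from the question
trn <- data.frame(model = c("Linear Reg", "SVM Radial", "Neural Networks"),
                  RMSE = c(9.17, 7.86, 8.55), Rsquared = c(0.51, 0.64, 0.57),
                  MAE = c(6.03, 4.86, 5.59), dataType = "train")
tst <- data.frame(model = c("Linear Reg", "SVM Radial", "Neural Networks"),
                  RMSE = c(9.4, 9.16, 8.66), Rsquared = c(0.53, 0.55, 0.6),
                  MAE = c(5.95, 5.5, 5.48), dataType = "test")

# Long format, then min-max rescale within each metric (assumed choice of scaling)
scaled_tbl <- rbind(trn, tst) %>%
  pivot_longer(cols = !c(model, dataType), names_to = "metric", values_to = "value") %>%
  group_by(metric) %>%
  mutate(scaled = (value - min(value)) / (max(value) - min(value))) %>%
  ungroup()

# All three metrics now share one y axis
gg_scaled <- ggplot(scaled_tbl,
                    aes(x = model, y = scaled, shape = dataType, colour = metric)) +
  geom_point(size = 3) +
  labs(y = "value rescaled to [0, 1] within each metric")
print(gg_scaled)
```

Keep in mind that after rescaling, "lower is better" (RMSE, MAE) and "higher is better" (Rsquared) still point in opposite directions on the shared axis, so the plot may need a note to that effect.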

Leaving the strip labels intact would probably make the graph easier to read (users wouldn't have to squint at the legend to figure out which metric was which); you could also try moving the strip labels to the right edge.
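As a rough sketch of that variant, the strip labels can be kept and moved to the right edge with the `strip.position` argument of `facet_wrap()`. The data are rebuilt from the question so the block runs on its own; `gg1` stands in for the original plot and `gg2` is just a name chosen here.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Data copied from the question
trn <- data.frame(model = c("Linear Reg", "SVM Radial", "Neural Networks"),
                  RMSE = c(9.17, 7.86, 8.55), Rsquared = c(0.51, 0.64, 0.57),
                  MAE = c(6.03, 4.86, 5.59), dataType = "train")
tst <- data.frame(model = c("Linear Reg", "SVM Radial", "Neural Networks"),
                  RMSE = c(9.4, 9.16, 8.66), Rsquared = c(0.53, 0.55, 0.6),
                  MAE = c(5.95, 5.5, 5.48), dataType = "test")

long_tbl <- rbind(trn, tst) %>%
  pivot_longer(cols = !c(model, dataType), names_to = "metric", values_to = "value")

# The original plot from the question
gg1 <- ggplot(long_tbl, aes(x = model, y = value, shape = dataType, colour = metric)) +
  geom_point()

# Stacked panels with free y axes; strip labels kept, shown on the right edge
gg2 <- gg1 +
  facet_wrap(~metric, scales = "free_y", ncol = 1, strip.position = "right") +
  theme_bw() +
  theme(panel.spacing = grid::unit(0, "lines"),
        strip.background = element_blank())
print(gg2)
```

With the labels on each panel, the colour legend becomes redundant; adding `guides(colour = "none")` would drop it if you want to save space.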

Ben Bolker
  • great, I just added `axis.title = element_blank()`. should have mentioned it is going into `rmarkdown` document and it looks fine in there, i just tested – gregV Jan 04 '21 at 19:41
  • that works, you could use `+labs(x="",y="")` if you wanted to blank out both x and y axis labels (slightly cleaner than setting the theme) – Ben Bolker Jan 04 '21 at 20:58