
I am trying to assign a different color to each variable in a PCA biplot. However, fviz_pca_biplot from the R package factoextra does not plot the color I specify for each variable.

library(factoextra)

data(iris)
res.pca <- prcomp(iris[, -5],  retx = TRUE, center = TRUE, scale. = TRUE)
res.pca

my.col.var <- c("red", "blue", "red", "yellow")

fviz_pca_biplot(res.pca, repel = TRUE, axes = c(1, 2), 
                col.var = my.col.var, col.ind = "#696969", 
                label = "var", title = "")

I assigned "red", "blue", "red", and "yellow" to the variables "Sepal.Length", "Sepal.Width", "Petal.Length", and "Petal.Width", respectively. However, the figure shows the wrong colors for all variables.


Yang Yang

2 Answers


In this function, col.var= must be given the names of the variables (which then act as groups), not the colors. The colors can then be supplied manually through the palette= option. So the code would be:

library(factoextra)

data(iris)
res.pca <- prcomp(iris[, -5],  retx = TRUE, center = TRUE, scale. = TRUE)
res.pca
my.col.var <- c("red", "blue", "red", "yellow")

fviz_pca_biplot(res.pca
                , repel = TRUE
                , axes = c(1, 2)
                , col.var = c("Sepal.Length", "Sepal.Width",  "Petal.Length", "Petal.Width" )
                , col.ind = "#696969"
                , label = c("var")
                , title = ""
                , palette = my.col.var
                )


S-SHAAF
  • Thanks a lot for your help! The help page of `fviz_pca_biplot` shows an example using `col.var = "steelblue"`, so I assume the help page is not accurate in this case. – Yang Yang Apr 08 '23 at 20:24
  • That works for a single color value, which is applied to all variables automatically; no legend is needed because everything has the same color. Your idea, which is an interesting one, is to color by variable (each variable is treated as its own group, so 4 groups in total), and the groups must also appear in the legend. In that case, `col.var=` defines the `groups/variables` and `palette` sets the colors for those `groups/variables`. – S-SHAAF Apr 09 '23 at 00:04

fviz_pca_biplot returns a ggplot object, so we can add scale_color_manual to it:

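library(factoextra)

# Same PCA as in the question, so that res.pca is defined here
data(iris)
res.pca <- prcomp(iris[, -5],  retx = TRUE, center = TRUE, scale. = TRUE)

my.col.var <- c("red", "blue", "red", "yellow")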

fviz_pca_biplot(res.pca, repel = TRUE, axes = c(1, 2), 
                col.var = colnames(iris)[1:4], col.ind = "#696969", 
                label = "var", title = "")+
  scale_color_manual(values = my.col.var)
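
One thing worth checking with either approach is the order in which the colors are applied. An unnamed vector of colors is matched to the groups by position, and ggplot2 sorts a character grouping alphabetically, so the colors may not land on the variables in the order they were written. The sketch below is one possible way to avoid relying on order: it assumes a named color vector (names equal to the variable names) and passes col.var as a factor, so that scale_color_manual can match colors to variables by name. This is an illustration under those assumptions, not the only way to do it.

```r
library(factoextra)
library(ggplot2)  # loaded explicitly for scale_color_manual()

data(iris)
res.pca <- prcomp(iris[, -5], center = TRUE, scale. = TRUE)

# Named vector: each color is tied to a variable name, not to a position
my.col.var <- c(Sepal.Length = "red", Sepal.Width = "blue",
                Petal.Length = "red", Petal.Width = "yellow")

# Factor with explicit levels keeps the legend in the original column order
var.groups <- factor(names(my.col.var), levels = names(my.col.var))

p <- fviz_pca_biplot(res.pca, repel = TRUE, axes = c(1, 2),
                     col.var = var.groups, col.ind = "#696969",
                     label = "var", title = "") +
  scale_color_manual(values = my.col.var)  # matched by name
p
```

If the legend then lists each variable next to the intended color, the mapping is correct regardless of how the levels are sorted.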


TarJae