
I am trying to visualize my PCA results with ggplot2, but the plot shows only 16 of my 24 samples.

The data frame I created from my PCA output has 24 observations of 24 variables (24 samples, 24 principal components), but ggplot is plotting only 16 of the 24. Here are my code and a mock data frame.

library(ggplot2)

ggplot(data) +
  aes(x = PC1, y = PC2) +
  geom_point(size = 3) +
  coord_fixed() +  # equal scaling on both axes
  theme_bw()

Data frame

      PC1    PC2
    <dbl>  <dbl>
 1 -40.8  -20.6 
 2 -40.6  -19.0 
 3 -40.8  -20.6 
 4   8.01 -38.1 
 5   8.52 -36.3 
 6   8.01 -38.1 
 7 -39.7   -6.11
 8 -38.1   -5.76
 9 -39.7   -6.11
10  18.3  -33.9 
11  17.9  -33.3 
12  18.3  -33.9 
13 -32.9   11.2 
14 -31.7    9.49
15 -32.9   11.2 
16  50.9   -4.98
17  49.4   -5.64
18  50.9   -4.98
19 -38.7   56.9 
20 -38.0   54.9 
21 -38.7   56.9 
22  74.8   36.3 
23  72.8   34.1 
24  74.8   36.3 
Maria
  • 2
    Welcome to SO! It would be easier to help you if you provided [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), including a snippet of your data or some fake data, so that one can run your code. Also, could you clarify what you mean by "not showing all my data points"? The most likely reason is overplotting, i.e. some points are plotted on top of each other. – stefan Jan 31 '23 at 00:47
  • 1
    Please provide data as plain text. We can't read data into R from images. – neilfws Jan 31 '23 at 00:48
  • 2
    Check rows 4 and 6. They have exactly the same values, so the points are overplotted exactly, as @stefan suspected. Several other rows are also duplicates. – Dubukay Jan 31 '23 at 01:12
  • Use jitter, e.g. `geom_jitter()`, `geom_point(position = "jitter")`, or `geom_point(position = position_jitter())` (I prefer the last one). – tjebo Jan 31 '23 at 10:48
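
One way to confirm the overplotting described in the comments is to count the distinct (PC1, PC2) pairs. The sketch below retypes the mock data from the question, so it assumes the real values match the rounded values as printed. It uses only base R; `dup` and `counts` are illustrative names, not part of the original code.

```r
# Mock data retyped from the question (values rounded as printed there)
data <- data.frame(
  PC1 = c(-40.8, -40.6, -40.8, 8.01, 8.52, 8.01, -39.7, -38.1, -39.7,
          18.3, 17.9, 18.3, -32.9, -31.7, -32.9, 50.9, 49.4, 50.9,
          -38.7, -38.0, -38.7, 74.8, 72.8, 74.8),
  PC2 = c(-20.6, -19.0, -20.6, -38.1, -36.3, -38.1, -6.11, -5.76, -6.11,
          -33.9, -33.3, -33.9, 11.2, 9.49, 11.2, -4.98, -5.64, -4.98,
          56.9, 54.9, 56.9, 36.3, 34.1, 36.3)
)

# TRUE for every row that repeats an earlier row exactly
dup <- duplicated(data)
sum(dup)            # 8 rows are hidden behind an identical point
nrow(unique(data))  # 16 distinct positions, matching the 16 visible points

# Number of samples at each distinct position
counts <- aggregate(n ~ PC1 + PC2, data = transform(data, n = 1L), FUN = sum)
counts[counts$n > 1, ]
```

With this mock data, 16 distinct positions account for all 24 rows, which is consistent with ggplot drawing every point but stacking 8 of them exactly on top of others.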

1 Answer


You could use `geom_count` to count overlapping points and `scale_size_area` to scale the point size by that count, like this:

library(ggplot2)
ggplot(data) +
  aes(x=PC1, y=PC2) +
  geom_count() +
  coord_fixed() +
  theme_bw() +
  scale_size_area(breaks = c(1,2))

Created on 2023-01-31 with reprex v2.0.2
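
For comparison, the jitter approach suggested in the comments could look roughly like the sketch below. It retypes the mock data from the question (rounded as printed). The `width` and `height` of 1.5, the `alpha` of 0.6, and the `seed` are arbitrary assumptions chosen for this data range; the default jitter amount is derived from the resolution of the data and may be too small to see here, which is why the amounts are set explicitly.

```r
library(ggplot2)

# Mock data retyped from the question (values rounded as printed there)
data <- data.frame(
  PC1 = c(-40.8, -40.6, -40.8, 8.01, 8.52, 8.01, -39.7, -38.1, -39.7,
          18.3, 17.9, 18.3, -32.9, -31.7, -32.9, 50.9, 49.4, 50.9,
          -38.7, -38.0, -38.7, 74.8, 72.8, 74.8),
  PC2 = c(-20.6, -19.0, -20.6, -38.1, -36.3, -38.1, -6.11, -5.76, -6.11,
          -33.9, -33.3, -33.9, 11.2, 9.49, 11.2, -4.98, -5.64, -4.98,
          56.9, 54.9, 56.9, 36.3, 34.1, 36.3)
)

p <- ggplot(data) +
  aes(x = PC1, y = PC2) +
  geom_point(
    size = 3,
    alpha = 0.6,  # semi-transparent, so remaining overlap shows as darker points
    # Shift each point by a small random amount; the seed makes it reproducible
    position = position_jitter(width = 1.5, height = 1.5, seed = 1)
  ) +
  coord_fixed() +
  theme_bw()

p
```

Jitter moves points away from their true coordinates, so `geom_count` may be preferable when exact positions on the PC axes matter.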

Quinten
  • Quinten, that’s not a bad suggestion. Would you mind adding it to the duplicate thread so that it is more visible? That would be appreciated. – tjebo Jan 31 '23 at 10:51
  • 1
    Hi @tjebo, I just posted an answer to the thread. Thank you for your suggestion! – Quinten Jan 31 '23 at 11:18