
I have a box plot with jittered data points that looks like the following. The x-axis has three different regions and y-axis has salaries.

[Image: box plot with jittered data points]

I used the following piece of code to plot.

ggplot(df1, aes(x = Region, y = Income, fill = Sex, color = Sex), size = 10) + 
  geom_boxplot(outlier.shape = NA)+
  geom_point(position = position_jitterdodge(0.5), alpha=0.3)

One additional thing I am trying to do is use different point shapes for different ethnicities. Both males and females are divided into Caucasian and African-american groups, and my data table has an Ethnicity column. I would like the data points for Caucasian to be triangles and those for African-american to be circles.

The data table looks like this:

   Region Income Sex    Ethnicity       
   <chr>   <dbl> <chr>  <chr>           
 1 Area1    2000 male   African-american
 2 Area1    3000 female African-american
 3 Area1    2000 male   African-american
 4 Area1    4000 male   African-american
 5 Area1   40050 female African-american
 6 Area1   60000 male   African-american
 7 Area1    2000 male   Caucasian       
 8 Area1    4562 female Caucasian       
 9 Area1    4568 male   Caucasian       
10 Area1    6573 male   Caucasian

I am wondering whether this is possible in a box plot. I tried the following, but it does not work.

 ggplot(df1, aes(x = Region, y = Income, fill = Sex, color = Sex, shape = Ethnicity), size = 10)

It looks like this now:

[Image: plot after adding shape = Ethnicity]

I would appreciate any help.

SBDK8219
  • Could you please clarify what you mean by "it does not work"? Did you get an error? Did you get the wrong shapes? Also, it would be easier to help you if you provided [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including a snippet of your data or some fake data. – stefan Feb 25 '22 at 22:58
  • Thanks @stefan !! I just edited the question and put the data table and the plot. – SBDK8219 Feb 26 '22 at 00:08

1 Answer

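Map Ethnicity to shape inside the aes() of geom_point() rather than in the top-level ggplot() call. Aesthetics set in ggplot() are inherited by every layer, so a top-level shape = Ethnicity also splits the box plots by ethnicity.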


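library(ggplot2)
library(dplyr)  # provides the %>% pipe

df1 %>%
    ggplot(aes(x = Region, y = Income, fill = Sex, color = Sex)) +
    geom_boxplot(outlier.shape = NA) +
    # shape is mapped only in the point layer, so the boxes are not split by Ethnicity
    geom_point(aes(shape = Ethnicity), position = position_jitterdodge(0.5), alpha = 0.3)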
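
To get the specific shapes asked for in the question (triangles for Caucasian, circles for African-american), a manual shape scale can be added on top of this. The following is a minimal sketch and not part of the original answer: it rebuilds the ten rows shown in the question as a toy df1, and it assumes ggplot2's standard shape codes 16 (filled circle) and 17 (filled triangle). The group = Sex mapping is an extra assumption, added so that the points are dodged by Sex only and stay aligned with the two boxes.

```r
library(ggplot2)

# Toy data: the ten Area1 rows shown in the question.
df1 <- data.frame(
  Region = "Area1",
  Income = c(2000, 3000, 2000, 4000, 40050, 60000, 2000, 4562, 4568, 6573),
  Sex = c("male", "female", "male", "male", "female", "male",
          "male", "female", "male", "male"),
  Ethnicity = rep(c("African-american", "Caucasian"), times = c(6, 4))
)

p <- ggplot(df1, aes(x = Region, y = Income, fill = Sex, color = Sex)) +
  geom_boxplot(outlier.shape = NA) +
  # shape lives in the point layer only; group = Sex (an assumption, see above)
  # keeps the dodging of the points matched to the boxes.
  geom_point(aes(shape = Ethnicity, group = Sex),
             position = position_jitterdodge(0.5), alpha = 0.3) +
  # Named values tie each Ethnicity level to a fixed shape code.
  scale_shape_manual(values = c("African-american" = 16, "Caucasian" = 17))

print(p)
```

The names in values must match the Ethnicity levels exactly as they are spelled in the data, so adjust them if your column uses different labels.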
stomper