I want to create a plot with the following data:

> dput(Data)
structure(list(Subject = c("1", "2", "3", "4", "5", "6", "7", 
"13", "14", "15", "16", "17", "18", "20", "22", "24", "25", "27", 
"28"), `Time given (in seconds)` = c(150, 150, 150, 150, 150, 
150, 150, 300, 300, 300, 300, 300, 300, NA, NA, NA, NA, NA, NA
), `Time spent (in seconds)` = c(63.461, 150.014, 150.113, 150.012, 
150.014, 150.02, 150.012, 159.232, 198.006, 221.791, 264.997, 
293.73, 246.836, 101.439, 151.4, 157.709, 165.721, 319.134, 425.268
), `Number of "a's" counted` = c(78, 55, 83, 76, 73, 45, 50, 
105, 105, 102, 97, 173, 101, 65, 67, 100, 50, 99, 93)), row.names = c(NA, 
-19L), class = c("tbl_df", "tbl", "data.frame"))

As you can see, some subjects got 150 seconds, some got 300 seconds, and some got unlimited time (= NA) to complete a task. I want to plot these 3 groups in a scatter plot with the time spent on the task on the x-axis and the number of "a"s counted on the y-axis. I created the following subsets and vectors to get this done:

Data1 <- Data[-(8:19),]
Data2 <- Data[(8:13),]
Data3 <- Data[-(1:13),]
x1 <- Data1$`Time spent (in seconds)`
x2 <- Data2$`Time spent (in seconds)`
x3 <- Data3$`Time spent (in seconds)`
y1 <- Data1$`Number of "a's" counted`
y2 <- Data2$`Number of "a's" counted`
y3 <- Data3$`Number of "a's" counted`

And I tried to run the following code to create the scatterplot:

ggplot <- ggplot(Data, aes(x=`Time spent (in seconds)`,y=`Number of "a's" counted`)) + geom_point(colour="blue")
ggplot 

But I don't know how I can create a plot with all 3 groups and then give each of them a different colour. Can somebody help me to create a scatterplot with this data? Thanks in advance!

  • Are you trying to mix base plot with ggplot? Maybe read about `ggplot(...) + geom_point()` – zx8754 May 20 '21 at 07:51
  • To expand on the comment above: you are creating a base R `plot` object, naming it ggplot, and then trying to use `ggplot2` syntax (the `+` to add layers) on it (in combination with further base R `plot` arguments). That won't work... Either use base R `plot` only (look at the output of `plot(cars)`) or use correct `ggplot2` syntax on `ggplot2` objects ;) – dario May 20 '21 at 08:00
  • Thank you! So now I ran the following code: ```ggplot <- ggplot(Data1, aes(x=`Time spent (in seconds)`,y=`Number of "a's" counted`)) + geom_point(colour="blue")``` How can I add the other 2 plots to this? @zx8754 – Lovemydogsxx May 20 '21 at 08:01
  • You keep adding more layers of geom_point, `ggplot(...) + geom_point() + geom_point(data = ...) + geom_point(data = ...)` Or we make a long dataset and call geom_point once with grouping on colours. – zx8754 May 20 '21 at 08:06
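
The layered approach from the last comment could look roughly like the sketch below. It is only an illustration: the small `Data` frame and its short column names (`given`, `spent`, `count`) are hypothetical stand-ins for the question's data, and the colours are arbitrary.

```r
library(ggplot2)

# Hypothetical stand-in for the question's `Data`, with shortened column names
Data <- data.frame(
  given = c(150, 150, 300, 300, NA, NA),
  spent = c(63.5, 150.0, 159.2, 198.0, 101.4, 151.4),
  count = c(78, 55, 105, 105, 65, 67)
)

# One subset per time limit; which() drops the NA rows from the comparisons
d150 <- Data[which(Data$given == 150), ]
d300 <- Data[which(Data$given == 300), ]
dNA  <- Data[is.na(Data$given), ]

# The first layer uses the plot's data; later layers supply their own data
# and inherit the x/y mapping from ggplot()
p <- ggplot(d150, aes(x = spent, y = count)) +
  geom_point(colour = "blue") +
  geom_point(data = d300, colour = "red") +
  geom_point(data = dNA, colour = "darkgreen")
p
```

Because the colours are set as constants rather than mapped inside `aes()`, this version draws no legend, which is one reason the single-layer approach with a colour mapping is usually more convenient.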

1 Answer

I would use the `group_by()` function from the dplyr package to group your data by `Time given (in seconds)`:

library(dplyr)
library(ggplot2)

## First, the grouping variable needs to be categorical (i.e. convert it to a factor):
Data$`Time given (in seconds)` <- as.factor(Data$`Time given (in seconds)`)

## Then group the data by this variable:
grouped_data <- Data %>%
  group_by(`Time given (in seconds)`)

## Then create the plot. The colour mapping goes inside aes():
groupedplot <- ggplot(grouped_data, aes(x = `Time spent (in seconds)`,
                                        y = `Number of "a's" counted`,
                                        colour = `Time given (in seconds)`)) +
  geom_point()

groupedplot

Addition: replace NA with an "Unlimited" category label

library(dplyr)
library(ggplot2)
library(tidyr)
## Convert to character so that NA can be replaced with a label:
Data$`Time given (in seconds)` <- as.character(Data$`Time given (in seconds)`)
Data$`Time given (in seconds)` <- replace_na(Data$`Time given (in seconds)`, "Unlimited")

grouped_data <- Data %>%
  group_by(`Time given (in seconds)`)

groupedplot <- ggplot(grouped_data, aes(x = `Time spent (in seconds)`,
                                        y = `Number of "a's" counted`,
                                        colour = `Time given (in seconds)`)) +
  geom_point()

groupedplot
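
If a factor is preferred over a character column (for example to control the legend order), one possible sketch is to build the labels first and then create the factor with explicit levels. This is not part of the original answer: the small `Data` frame, its short column names (`given`, `spent`, `count`), and the new `group` column are hypothetical. It also passes the data to `ggplot()` without `group_by()`, since the colour mapping alone is what separates the groups in the plot.

```r
library(ggplot2)

# Hypothetical stand-in for `Data`; `given`, `spent`, `count` are shortened names
Data <- data.frame(
  given = c(150, 150, 300, 300, NA, NA),
  spent = c(63.5, 150.0, 159.2, 198.0, 101.4, 425.3),
  count = c(78, 55, 105, 97, 65, 93)
)

# Label NA as "Unlimited", then build a factor with an explicit level order
label <- ifelse(is.na(Data$given), "Unlimited", as.character(Data$given))
Data$group <- factor(label, levels = c("150", "300", "Unlimited"))

# No group_by() here: the colour aesthetic splits the points by group
p <- ggplot(Data, aes(x = spent, y = count, colour = group)) +
  geom_point() +
  labs(colour = "Time given (in seconds)")
p
```

The explicit `levels` argument fixes the order of the legend entries, which a plain character column would instead sort alphabetically.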

Mark Davies
  • Thanks a lot for this answer! It helped me! – Lovemydogsxx May 20 '21 at 14:35
  • @Lovemydogsxx you can also change the `NA` to something else. I don't know how to do that when the `Time given...` variable is converted to a `factor` as I have above, but it is straightforward if it is converted to `character`, and the rest of the code and output still work. I'll add that to the answer – Mark Davies May 20 '21 at 15:21
  • Thank you! That looks much nicer :) – Lovemydogsxx May 21 '21 at 16:06
  • Hey! I also did a second experiment in which some data points lie on top of each other and are therefore not visible. Can I change your code so that they become visible? – Lovemydogsxx May 22 '21 at 18:06
  • @Lovemydogsxx try this [https://stackoverflow.com/questions/47955292/visualizing-two-or-more-data-points-where-they-overlap-ggplot-r] – Mark Davies May 23 '21 at 19:17
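
For the overlapping points raised in the last comments, two common ggplot2 options are semi-transparent, slightly jittered points, or `geom_count()`, which scales point size by the number of overlapping observations. The sketch below is only an illustration: the `pts` data frame and its column names are made up, and the jitter widths and alpha value are arbitrary choices that would need tuning for real data.

```r
library(ggplot2)

# Made-up data with several identical (x, y) positions
pts <- data.frame(
  x = c(1, 1, 1, 2, 2, 3),
  y = c(5, 5, 5, 6, 6, 7),
  group = c("150", "150", "300", "300", "Unlimited", "Unlimited")
)

# Option 1: transparency plus a small random offset (seed makes it reproducible)
p_jitter <- ggplot(pts, aes(x = x, y = y, colour = group)) +
  geom_point(alpha = 0.6, size = 3,
             position = position_jitter(width = 0.1, height = 0.1, seed = 1))

# Option 2: keep exact positions and let point size show the overlap count
p_count <- ggplot(pts, aes(x = x, y = y, colour = group)) +
  geom_count(alpha = 0.6)

p_jitter
p_count
```

Jittering moves points away from their true values, so `geom_count()` or transparency alone may be preferable when exact positions matter.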