How to plot learning curves for binary data?

Question

I would like to plot simple learning curves. My data looks like this:

id trial type choice
1  1     A     0
1  2     A     1
2  1     B     1
2  2     B     0

structure(list(id = c(2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
4L, 4L, 4L, 4L, 4L, 6L, 6L, 6L, 6L, 6L), trial = c(1L, 2L, 3L, 
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 
5L), choice = c(0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 
0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L), type = structure(c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L), .Label = c("A", "A3", "B"), class = "factor")), row.names = c(1L, 
2L, 3L, 4L, 5L, 31L, 32L, 33L, 34L, 35L, 61L, 62L, 63L, 64L, 
65L, 91L, 92L, 93L, 94L, 95L), class = "data.frame")

In the sample above, id, trial and choice are integers and type is a factor. I would like to plot the choices the different groups made per trial. This is how I imagine the graph (a 1 in the choice column is considered correct):

[Figure: How I imagine the graph]

The smoothness of the curves is an exaggeration.

I would also like to know how to do calculations on subsets of the groups, for example summing all the choices of group A during trials 1 to 10.

Thank you for your help!

It'd be helpful to share enough data to actually generate that plot using (eg) `dput()`. See https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example — alan ocallaghan, Mar 02 '20 at 16:36
I don't really understand what you are asking and the sample data you provided seems too small to illustrate the point. I'm not really sure how you would get smooth curves out of that. Is choice 1 the "correct" answer? — MrFlick, Mar 02 '20 at 16:40
Thank you for suggesting `dput()`, @alanocallaghan. I hope it is now easier to understand what I mean — Maria Granell Ruiz, Mar 02 '20 at 18:44
@MrFlick Regarding the size of the data: I've just provided a bigger sample. The column `choice` is binary, where "0" represents wrong and "1" correct. My goal is to sum the value of `choice` based on `group` relative to `trial` and then plot this in a graph. I don't expect the lines to be smooth; it is just a quick sketch. — Maria Granell Ruiz, Mar 02 '20 at 19:05

score 0 · Accepted Answer · answered Mar 02 '20 at 19:14

Basically, you want to summarize your data first and then plot it. You can do this with dplyr and ggplot2, for example. Assuming your data is stored in a data.frame named dd:

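library(dplyr)
library(ggplot2)

dd %>%
  # one group per type/trial combination
  group_by(type, trial) %>%
  # choice must be numeric 0/1 here, so its mean is the proportion correct
  summarize(correct = mean(choice)) %>%
  ggplot() +
  # one line per type, trial on the x axis
  geom_line(aes(trial, correct, color = type))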

For each type and trial, we calculate the mean value of choice, which gives the proportion of participants who answered correctly (multiply by 100 for a percentage). Then we plot that value for each trial as a line colored by type.
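
The second part of the question (summing the choices of one group over a range of trials) can be handled with the same summarize-first approach. The following is only a minimal sketch, not part of the original answer: it rebuilds the sample data from the `dput()` output in the question, and since that sample has only 5 trials it uses trials 1 to 3 instead of 1 to 10. The names `total_A` and `totals_by_type` are made up for illustration, and the `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Sample data reconstructed from the dput() output in the question
dd <- data.frame(
  id     = rep(c(2L, 3L, 4L, 6L), each = 5),
  trial  = rep(1:5, times = 4),
  choice = c(0L, 1L, 1L, 1L, 1L,  0L, 1L, 1L, 1L, 1L,
             0L, 0L, 0L, 0L, 1L,  0L, 0L, 0L, 1L, 1L),
  type   = factor(rep(c("A", "B"), each = 10), levels = c("A", "A3", "B"))
)

# Total number of correct choices for type A across trials 1 to 3
# (total_A is a hypothetical name)
total_A <- dd %>%
  filter(type == "A", between(trial, 1, 3)) %>%
  summarize(total = sum(choice)) %>%
  pull(total)

# The same sum for every type at once
# (.groups = "drop" assumes dplyr >= 1.0.0)
totals_by_type <- dd %>%
  filter(between(trial, 1, 3)) %>%
  group_by(type) %>%
  summarize(total = sum(choice), .groups = "drop")

print(total_A)
print(totals_by_type)
```

Changing the bounds passed to `between()` selects a different range of trials. If `choice` is stored as a factor rather than an integer, it would need to be converted first, for example with `as.integer(as.character(choice))`.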
