
I'm just trying to make a simple line plot with two conditions: Standard and Deviant.

The data in the CSV originally looks something like this:

enter image description here

And yes, time is supposed to be negative. Time is a variable here that goes from -100 ms (100 ms before the event happened) to 1500 ms (1500 ms after the event happened). Essentially, what I am trying to do is plot how the values (which I will later call amplitude) change over time for both Standard and Deviant Conditions. Something that looks kind of like this: enter image description here

Unfortunately, what I'm getting is this:

enter image description here

Here is my code:

# Libraries
library(ggplot2)

# Plotting
ggplot(data=PupilERP, aes(x=Pt, y=Amplitude, group=Condition)) +
  geom_line() + scale_y_continuous(breaks = seq(-5,15,1)) + scale_x_continuous(breaks = seq(-100,1500,100)) 

Edit: As requested in the comments, here is a sample of the data after line 8 of my script, i.e. right after the line colnames(PupilERP) <- c("Pt","Deviant","Standard"):

enter image description here

Someone else asked for the dput output. It was far too long to post at that point (right after the colnames line), even for just 20 rows, so here is the actual dput output after ALL of the reshaping:

structure(list(Pt = c(13L, 110L, 109L, 108L, 107L, 106L, 105L, 
104L, 103L, 102L, 101L, 99L, 98L, 97L, 96L, 95L, 94L, 93L, 92L, 
91L), Condition = c("Deviant", "Deviant", "Deviant", "Deviant", 
"Deviant", "Deviant", "Deviant", "Deviant", "Deviant", "Deviant", 
"Deviant", "Deviant", "Deviant", "Deviant", "Deviant", "Deviant", 
"Deviant", "Deviant", "Deviant", "Deviant"), Amplitude = c(0.0089, 
-0.0066, -0.0076, 0.0105, 0.0514, 0.111, 0.178, 0.2396, 0.2851, 
0.306, 0.2999, 0.2708, 0.2277, 0.1796, 0.1318, 0.085, 0.0399, 
0.0012, -0.0264, -0.0413)), row.names = c(NA, 20L), class = "data.frame")
tjebo
SBL
  • [See here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on making an R question that folks can help with. That includes a sample of data that replicates the issue. Ideally that data sample could just be the data in the form that you're trying to plot—that way, we could skip the 8 lines of data reshaping and focus just on the plotting issue. – camille Jan 19 '20 at 19:36
  • Images are a really bad way of posting data (or code). Can you post sample data in `dput` format? Please edit **the question** with the output of `dput(PupilERP)`. Or, if it is too big with the output of `dput(head(PupilERP, 20))`. – Rui Barradas Jan 19 '20 at 19:44
  • Instead of `group=Condition` try `colour=Condition`. – Rui Barradas Jan 19 '20 at 19:45
  • Camille, The issue is that I'm concerned the problem is with the reshaping itself. However, I included an edit with what the data looks like after line 8. Rui, the `dput` output was too large at the point at which Camille asked for it (even when taking the 1st 20 lines) so I used dput after ALL of the reshaping (aka after line `PupilERP$Amplitude <- as.numeric(PupilERP$Amplitude)` – SBL Jan 19 '20 at 19:57
  • using color instead of group did not solve the issue- it still appears to have the weird feature where it appears the line is jiggling up and down – SBL Jan 19 '20 at 19:58
  • I've removed all the data processing bit which is not relevant to the question. I cannot reproduce this problem with the data you are supplying. It gives a nice curve. – tjebo Jan 19 '20 at 21:44
  • Hi @SBL - has any progress been made on this? Do let us know if there are problems remaining after trying the suggested answer, and if things are resolved then close the question. If you have found an answer elsewhere that works better, be sure to post it yourself for anyone coming by here for help with the same problem in the future! – Andy Baxter Jan 22 '20 at 16:18

1 Answer


The problem has probably arisen in your data processing. Your as.integer call on the Pt column is producing the wrong numbers. After the transposition, Pt has become a factor, and as.integer on a factor returns the internal level codes rather than the values shown, so '-11', for example, might come back as 15 if it happens to be the 15th level. In your data this has probably led to duplicated x values (you'll notice there are no negative numbers in your graph).
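A minimal sketch of that behaviour, using made-up time values rather than the asker's data (the vector name pt is hypothetical):

```r
# Hypothetical time points stored as a factor, as can happen after t() + as.data.frame()
pt <- factor(c("-100", "-50", "0", "50", "100"))

# On a factor, as.integer() returns the level codes (1, 2, 3, ...),
# so the negative values disappear
as.integer(pt)

# Going through character first recovers the real numbers
as.integer(as.character(pt))  # -100 -50 0 50 100
```

The exact level codes depend on how the levels happen to be sorted, but they are always positive integers, which would fit the absence of negative values on the x axis.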

To solve this, before calling as.integer, coerce Pt to a character vector. I've used dummy data to do the following (your problem was not reproducible from the dput part above):

library(ggplot2)
library(tidyr)

# dummy data
df <- read.csv("test.csv")

df <- t(df)
df <- as.data.frame(df)
df <- df[-1,]
colnames(df) <- c("Pt","Deviant","Standard")
df$Pt <- as.integer(as.character(df$Pt))  # the key change - will read neg. numbers
df <- gather(df, Condition, Amplitude, Deviant:Standard)
df$Amplitude <- as.numeric(as.character(df$Amplitude))  # same precaution in case Amplitude is a factor

ggplot(df, aes(Pt, Amplitude, colour = Condition)) + geom_line()
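Before plotting, it may be worth checking that the conversion did what was intended. The sketch below uses a small made-up long-format data frame with the same column names (Pt, Condition, Amplitude); the values are hypothetical.

```r
# Hypothetical long-format data standing in for the reshaped data frame
df <- data.frame(
  Pt = rep(c(-100L, 0L, 100L), times = 2),
  Condition = rep(c("Deviant", "Standard"), each = 3),
  Amplitude = c(0.01, 0.11, 0.30, 0.00, 0.05, 0.12)
)

# The range should start at a negative time if the conversion worked
range(df$Pt)

# 0 means every Pt/Condition pair occurs only once; a non-zero result
# points to duplicated x values, which would make geom_line() zigzag
anyDuplicated(df[c("Pt", "Condition")])
```

If the range has no negative values or duplicates are reported on the real data, the Pt conversion is the likely culprit.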

Hopefully that will help solve some problems.

Andy Baxter