
I am reading in data from a CSV that looks like the Track1 dataset below. I am ignoring the Track_2 through Track_5 columns of Track1. Each track has 369 channels, with 461 data points per channel.

Example dataset: Track1
Channel Track_1 Track_2 … Track_5
1         0.02    0.03  … 0.02
1         0.03    0.02  … 0.06
1         0.01    0.03  … 0.01
2         0.02    0.06  … 0.03
2         0.03    0.02  … 0.06
2         0.01    0.03  … 0.01
…
369       0.03    0.02  … 0.06 
369       0.01    0.03  … 0.01
369       0.02    0.01  … 0.02 

I want to change the data to look like this:

Example dataset: DF
Channel_1 Channel_2 … Channel_369 index
   0.02    0.02     …   0.03      1
   0.03    0.03     …   0.01      2 
   0.01    0.01     …   0.02      3
   …
   0.01    0.03     …   0.04      461

What I've tried is assigning the rows for each channel to their own column, but that requires typing out every line from Channel_1 to Channel_369. I want to compress and simplify this, but I cannot figure out how.

DF$Channel_1 <- Track1[Channel == 1, "Track_1"] 
DF$Channel_2 <- Track1[Channel == 2, "Track_1"]
. . . .
DF$Channel_369 <- Track1[Channel == 369, "Track_1"]

Once the data is in DF, I want to plot all of it against my index: the x values will be Channel_[i] and the y values will be index. I would like all 369 lines on one graph without having to write each one out. What I've tried is drawing every line manually, but that requires writing out every line from Channel_1 to Channel_369.

plot(DF[c("Channel_1","index")], type = "l", col = 1)
lines(DF[c("Channel_2", "index")], type = "l", col = 2) 
. . .
lines(DF[c("Channel_369", "index")], type = "l", col = 369)

The expected output is a single graph that plots all 369 columns of data. The y-axis is the "index" column; the x-axis is the values in the Channel_1 through Channel_369 columns.

What I'm looking for is a way to simplify the code so that I'm not repeating the same line of code 369 times. Thank you!

Sarah H
  • I do not understand the question. I suspect you can get more answers by providing a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) with some example data and expected output. – Alexlok Nov 30 '20 at 17:35
  • Could you share a little bit of sample data to illustrate the problem? Something like `dput(DF[1:10, c(1:10, 370)])` for the first 10 rows, first 10 columns, and 370th column. Also, please clarify: your first code block makes it look like `Track` is a separate data frame. Is this the case? If so, please also share a similar subset of `Track`, e.g., `dput(Track[1:10, ])` (or some other subset that better illustrates your problem). – Gregor Thomas Nov 30 '20 at 17:58
  • In the process of simplifying my variables, I forgot to change Normal to Track_1. In my full code there are some data conversion steps that I don't need assistance with. Also, that was meant to be assignment, not less than or equal to. I have adjusted it. Thank you. – Sarah H Nov 30 '20 at 18:41
  • Okay, that's clearer. My new question is whether you want `Channel == i` for `Track_i` (it used to look like this), or only `Track_1` for each of the `i` values, which is what it looks like now that you've replaced all the `"Normal"` with `"Track_1"`. – Gregor Thomas Nov 30 '20 at 18:46
  • I renamed the DF variables to try to lessen the confusion. What I want is Channel == i for Channel_i. The column Channel_i would have 461 rows of data extracted from Track_1. Does that make more sense? – Sarah H Nov 30 '20 at 18:54
  • Got it - and as you say "ignoring all the data in Track2-Track5". Posting an answer. – Gregor Thomas Nov 30 '20 at 18:55

1 Answer


I think unstack() is what you want:

## Read in sample data
Track = read.table(text = "Channel Track_1 Track_2 Track_5
1         0.02    0.03  0.02
1         0.03    0.02  0.06
1         0.01    0.03  0.01
2         0.02    0.06  0.03
2         0.03    0.02  0.06
2         0.01    0.03  0.01
369       0.03    0.02  0.06 
369       0.01    0.03  0.01
369       0.02    0.01  0.02", header = TRUE)

result = unstack(Track, Track_1 ~ Channel)
names(result) = paste0("Track_", unique(Track$Channel))
result$index = 1:nrow(result)
result
#   Track_1 Track_2 Track_369 index
# 1    0.02    0.02      0.03     1
# 2    0.03    0.03      0.01     2
# 3    0.01    0.01      0.02     3
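
One caveat, offered as a sketch rather than a correction: `unstack()` orders its columns by the sorted values of `Channel`, while `unique()` returns values in order of appearance, so the renaming above lines up only when the rows are already sorted by channel. Assuming every channel has the same number of rows, a variant built on `split()` takes the names from the grouping itself. The names `shuffled`, `pieces`, and `wide` and the data values are made up for this example.

```r
# Toy data with the channels deliberately out of order (made-up values).
shuffled <- data.frame(
  Channel = c(2, 2, 1, 1, 369, 369),
  Track_1 = c(0.5, 0.6, 0.1, 0.2, 0.9, 1.0)
)

# split() groups Track_1 by Channel and names each piece after its channel,
# so the labels cannot drift out of step with the data.
pieces <- split(shuffled$Track_1, shuffled$Channel)
names(pieces) <- paste0("Channel_", names(pieces))

# Assumes every channel has the same number of rows; data.frame() stops
# with an error otherwise.
wide <- data.frame(pieces)
wide$index <- seq_len(nrow(wide))
wide
```

Because the column names come from `names(pieces)`, this should give the same labels whether or not the CSV is sorted by channel.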
Gregor Thomas
  • Thank you! unstack() did exactly what I was trying to do! I didn't realize that function existed. Now that everything is in result, what is the best way to plot channels 1-369 against the index, with the index on the y-axis and the channels on the x-axis? – Sarah H Nov 30 '20 at 19:12
  • For plotting like that, your original data is in better shape for `ggplot`. Add the `index` in with `dplyr` - `Track = Track %>% group_by(Channel) %>% mutate(index = 1:n())`, then plot `ggplot(Track, aes(x = Track_1, y = index)) + geom_point()`. (Not really sure what type of plot you want...) – Gregor Thomas Nov 30 '20 at 19:57
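
For a base R route from the wide table, `matplot()` can draw one line per column in a single call, which avoids writing 369 `lines()` calls. This is a minimal sketch, assuming a data frame shaped like `result` in the answer (one numeric column per channel plus `index`); `plot_channels` and `demo` are hypothetical names, and the demo values are made up.

```r
# Draw every channel column against index in one call.
# Assumes `wide` has one numeric column per channel plus an `index` column.
plot_channels <- function(wide) {
  channel_cols <- setdiff(names(wide), "index")
  x <- as.matrix(wide[channel_cols])  # one matrix column per channel
  # matplot() pairs each column of x with the single y vector.
  matplot(x, wide$index, type = "l", lty = 1,
          col = seq_along(channel_cols),
          xlab = "Track_1 value", ylab = "index")
  invisible(channel_cols)
}

# Small stand-in for the real 461 x 370 table (made-up values).
demo <- data.frame(
  Track_1 = c(0.02, 0.03, 0.01),
  Track_2 = c(0.02, 0.03, 0.01),
  Track_369 = c(0.03, 0.01, 0.02),
  index = 1:3
)

# Only draw when run interactively.
if (interactive()) plot_channels(demo)
```

With the real data, the call would be `plot_channels(result)`. The default palette has only a few colors, so with 369 channels the line colors repeat.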