I have a 3D data array (df) with dimensions [1:1000, 1:221, 1:2].

A reproducible example is the following:

d <- as.data.frame( matrix( 1:(5*2*3), 10, 3))
df = array( unlist(d), dim=c(5, 2, 3)) 
df
, , 1

     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

, , 2

     [,1] [,2]
[1,]   11   16
[2,]   12   17
[3,]   13   18
[4,]   14   19
[5,]   15   20

, , 3

     [,1] [,2]
[1,]   21   26
[2,]   22   27
[3,]   23   28
[4,]   24   29
[5,]   25   30

The first dimension is trails, the second dimension is outcomes, and the third dimension is people.

For each person, I want to get a graph like the following (an Excel plot for the first person, df[,,1]):

[image: Excel plot for the first person]

I want to have such a plot for each person displayed on the same page, but I am stuck on how to achieve this using ggplot.

lll

1 Answer

Using your data, you can first reorganize your array into a data frame (there may be easier ways to do this part):

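# Stack the slice of each person, recording trail and person indices
final_df = NULL
nb_person = dim(df)[3]   # number of people (3 in the example)
trail = NULL
person = NULL
for(i in seq_len(nb_person)) {
  final_df = rbind(final_df, df[,,i])          # trails x outcomes slice for person i
  trail = c(trail, seq_len(nrow(df[,,i])))     # trail index within person i
  person = c(person, rep(i, nrow(df[,,i])))    # person index repeated for each trail
}
final_df = data.frame(final_df)
colnames(final_df) = c("start","end")
final_df$trail = trail
final_df$person = person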

   start end trail person
1      1   6     1      1
2      2   7     2      1
3      3   8     3      1
4      4   9     4      1
5      5  10     5      1
6     11  16     1      2
7     12  17     2      2
8     13  18     3      2
9     14  19     4      2
10    15  20     5      2
11    21  26     1      3
12    22  27     2      3
13    23  28     3      3
14    24  29     4      3
15    25  30     5      3
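
As one possibly simpler route to the same data frame, here is a minimal base R sketch without the loop. It assumes the example array from the question; the names `flat`, `n_trail` and `n_person` are only illustrative and not part of the original code.

```r
# Rebuild the example array from the question (same values as unlist(d))
df <- array(1:30, dim = c(5, 2, 3))

n_trail  <- dim(df)[1]
n_person <- dim(df)[3]

# aperm() moves the outcome dimension last, so flattening into a
# 2-column matrix keeps trails varying fastest within each person
flat <- matrix(aperm(df, c(1, 3, 2)), ncol = dim(df)[2])

final_df <- data.frame(
  start  = flat[, 1],
  end    = flat[, 2],
  trail  = rep(seq_len(n_trail), times = n_person),
  person = rep(seq_len(n_person), each = n_trail)
)
```

This should give the same 15 rows as the loop above, in the same order.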

Then, you can reshape it using the pivot_longer function from the tidyr package (if you install and load tidyverse, both tidyr and ggplot2 will be installed and loaded).

library(tidyverse)
final_df_reshaped <- final_df %>% pivot_longer(., -c(trail,person),names_to = "Variable",values_to = "value")

# A tibble: 30 x 4
   trail person Variable value
   <int>  <int> <chr>    <int>
 1     1      1 start        1
 2     1      1 end          6
 3     2      1 start        2
 4     2      1 end          7
 5     3      1 start        3
 6     3      1 end          8
 7     4      1 start        4
 8     4      1 end          9
 9     5      1 start        5
10     5      1 end         10
# … with 20 more rows

Alternative using gather for older versions of tidyr

If you have an older version of tidyr (below 1.0.0), you should use gather instead of pivot_longer (more information here: https://cmdlinetips.com/2019/09/pivot_longer-and-pivot_wider-in-tidyr/).

final_df_reshaped <- final_df %>% gather(., -c(trail,person), key = "Variable",value = "value") 
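
If neither pivot_longer nor gather is available, the long format can probably also be built in base R straight from the array. This is only a sketch, assuming the example array from the question; `long_df` is an illustrative name, and the column names are chosen to match the reshaped data frame above.

```r
# Rebuild the example array from the question
df <- array(1:30, dim = c(5, 2, 3))

# expand.grid() varies its first argument fastest, which matches the
# column-major order in which R stores array values
long_df <- expand.grid(
  trail    = seq_len(dim(df)[1]),
  Variable = c("start", "end"),
  person   = seq_len(dim(df)[3]),
  stringsAsFactors = FALSE
)
long_df$value <- as.vector(df)
```

The rows come out in a different order than with pivot_longer (all "start" rows of a person before the "end" rows), but that should not matter for the plot, since ggplot groups by trail and person.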

And plot it using this code:

ggplot(final_df_reshaped, aes(x = Variable, y = value, group = as.factor(trail), color = as.factor(trail)))+
  geom_point()+
  geom_line() +
  facet_grid(.~person)+
  scale_x_discrete(limits = c("start","end"))

Does it answer your question?

If you have to do that for 220 different people, I'm not sure it will make a really readable plot. Maybe you should think of another way to plot it, or extract the useful information first.
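
One possible way to keep the panels readable with many people is to plot only a handful of them per page and lay the panels out with facet_wrap. The sketch below is an illustration under assumptions: `big` is a made-up array with 12 people, and the choice of 6 people per page in 3 columns is arbitrary.

```r
library(ggplot2)

# Hypothetical larger array: 5 trails, 2 outcomes, 12 people
big <- array(seq_len(5 * 2 * 12), dim = c(5, 2, 12))

# Long format built in base R (one row per array cell)
long_df <- expand.grid(
  trail    = seq_len(dim(big)[1]),
  Variable = c("start", "end"),
  person   = seq_len(dim(big)[3]),
  stringsAsFactors = FALSE
)
long_df$value <- as.vector(big)

# Keep only the people for the current page (here the first 6)
page_df <- long_df[long_df$person %in% 1:6, ]

p <- ggplot(page_df, aes(x = Variable, y = value,
                         group = factor(trail), color = factor(trail))) +
  geom_point() +
  geom_line() +
  facet_wrap(~ person, ncol = 3) +   # 3 panels per row instead of one long row
  scale_x_discrete(limits = c("start", "end")) +
  labs(color = "trail")
```

Printing `p` draws the page. Repeating the subset step for people 7 to 12 would give the next page.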

dc37
  • When I tried it, it returned an error saying it could not find function "pivot_longer". Is there any alternative solution? Much appreciated! – lll Dec 20 '19 at 21:04
  • To use `pivot_longer`, you have to install `tidyverse` or at least the `tidyr` package. I wrote that in my answer just before the code using it. – dc37 Dec 20 '19 at 21:49
  • Yes, I have tidyverse installed in my RStudio, but it still cannot find "pivot_longer", and my computer crashes when I try to install a newer version of tidyverse. So I am wondering if there are other ways around this. – lll Dec 20 '19 at 22:04
  • What version of R and tidyverse do you have? – dc37 Dec 20 '19 at 22:09
  • "R version 3.4.1 (2017-06-30)" – lll Dec 21 '19 at 00:06
  • `pivot_longer` was added in the latest version of `tidyr` (1.0.0), released this year. So your version is probably older (as you are using R 3.4.1 and the latest version of R is 3.6.1). I edited my answer to include the use of `gather` instead of `pivot_longer`. Maybe you should consider getting the latest version of `R` and `tidyverse`. – dc37 Dec 21 '19 at 00:59