
This is the edited question. I would like to create a multi-facet graph based on panel data. Graphing panel data with only one y-variable is relatively simple; what I am asking is how to create a graph in R in which multiple y-variables appear in the same facet. Every ggplot call takes a single y in aes(x=year, y=something, ...), but I have two y-variables, namely source1 and source2, and I can't find a way to create a multi-facet graph that contains both of them in the same facet. Please see the description of the panel data below. The panel data that I want to graph in R looks like this: enter image description here

Or in txt format:

id     area       element  item            year  value   source1  source2  diff    relative
46     Australia  Yield    Barley          2000  20080   2.02     1.67     0.35    17.32673267
46     Australia  Yield    Barley          2010  18268   1.9      1.75     0.15    7.894736842
46     Australia  Yield    Barley          2018  23044   2.46     1.86     0.6     24.3902439
113    Australia  Yield    Maize           2000  58727   5.95     6.88     -0.93   -15.6302521
113    Australia  Yield    Maize           2010  67515   6.97     7.54     -0.57   -8.177905308
113    Australia  Yield    Maize           2018  82594   8        8.54     -0.54   -6.75
238    Australia  Yield    Potatoes        2000  314873  33.74    7.33     26.4    78.27504446
238    Australia  Yield    Potatoes        2010  383895  38.73    7.81     30.92   79.83475342
238    Australia  Yield    Potatoes        2018  426258  42.39    8.56     33.82   79.80655815
2224   Australia  Yield    Rapeseed        2000  12177   1.23     1.11     0.12    9.756097561
2224   Australia  Yield    Rapeseed        2010  11261   1.12     1.23     -0.11   -9.821428571
2224   Australia  Yield    Rapeseed        2018  12280   1.39     1.36     0.03    2.158273381
5557   Australia  Yield    Soybeans        2000  18714   1.87     2.08     -0.21   -11.22994652
5557   Australia  Yield    Soybeans        2010  19042   1.84     2.24     -0.4    -21.73913043
5557   Australia  Yield    Soybeans        2018  17027   1.81     2.43     -0.62   -34.25414365
757    Australia  Yield    SugC            2000  0       0        27.45    -27.45  #DIV/0!
757    Australia  Yield    SugC            2010  0       0        28.76    -28.76  #DIV/0!
757    Australia  Yield    SugC            2018  0       0        30.27    -30.27  #DIV/0!
8890   Australia  Yield    Sunflower seed  2000  10494   1.08     1.02     0.06    5.555555556
8890   Australia  Yield    Sunflower seed  2010  15414   1.25     1.13     0.12    9.6
8890   Australia  Yield    Sunflower seed  2018  18235   1.32     1.26     0.06    4.545454545
33335  Australia  Yield    Wheat           2000  18398   1.9      1.51     0.4     20.52631579
33335  Australia  Yield    Wheat           2010  15987   1.75     1.73     0.02    1.142857143
33335  Australia  Yield    Wheat           2018  19444   2.19     1.99     0.2     9.132420091
48     Brazil     Yield    Barley          2000  19437   2.12     1.88     0.25    11.32075472
48     Brazil     Yield    Barley          2010  33115   3.12     2.08     1.04    33.33333333
48     Brazil     Yield    Barley          2018  32591   3.31     2.33     0.98    29.60725076
115    Brazil     Yield    Maize           2000  27182   2.97     2.64     0.33    11.11111111
115    Brazil     Yield    Maize           2010  43667   4.1      3.76     0.34    8.292682927
115    Brazil     Yield    Maize           2018  51044   5        5.13     -0.13   -2.6
240    Brazil     Yield    Potatoes        2000  171813  17.38    3.59     13.79   79.34407365
240    Brazil     Yield    Potatoes        2010  258859  25.66    3.92     21.74   84.72330475
240    Brazil     Yield    Potatoes        2018  311760  30.59    4.32     26.27   85.87773782
2226   Brazil     Yield    Rapeseed        2000  17083   1.59     1.86     -0.26   -16.98113208
2226   Brazil     Yield    Rapeseed        2010  15217   1.37     2.22     -0.84   -62.04379562
2226   Brazil     Yield    Rapeseed        2018  12500   1.3      2.66     -1.36   -104.6153846
5559   Brazil     Yield    Soybeans        2000  24033   2.5      2.28     0.22    8.8
5559   Brazil     Yield    Soybeans        2010  29475   2.9      2.58     0.32    11.03448276
5559   Brazil     Yield    Soybeans        2018  33903   3.22     2.95     0.28    8.385093168
1478   Brazil     Yield    SugC            2000  0       0        15.01    -15.01  #DIV/0!
1478   Brazil     Yield    SugC            2010  0       0        16.43    -16.43  #DIV/0!
1478   Brazil     Yield    SugC            2018  0       0        18.2     -18.2   #DIV/0!
8892   Brazil     Yield    Sunflower seed  2000  16724   1.53     1.55     -0.01   -1.307189542
8892   Brazil     Yield    Sunflower seed  2010  11322   1.21     1.77     -0.57   -46.28099174
8892   Brazil     Yield    Sunflower seed  2018  16064   1.55     2.05     -0.5    -32.25806452
33337  Brazil     Yield    Wheat           2000  15156   1.81     1.43     0.38    20.99447514
33337  Brazil     Yield    Wheat           2010  28288   2.52     1.94     0.59    23.01587302
33337  Brazil     Yield    Wheat           2018  26238   2.69     2.56     0.13
...and so on
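
For anyone who wants to reproduce this, here is a minimal sketch of how the text above could be read into R. It assumes the original file is tab-separated (the item "Sunflower seed" contains a space, so splitting on whitespace would not work) and it treats the spreadsheet error #DIV/0! as a missing value. Only three rows are copied in to keep it short; the object name data is chosen to match the pdata.frame call below.

```r
# Three rows copied from the table above; the tab separators are an assumption.
txt <- paste(
  "id\tarea\telement\titem\tyear\tvalue\tsource1\tsource2\tdiff\trelative",
  "46\tAustralia\tYield\tBarley\t2000\t20080\t2.02\t1.67\t0.35\t17.32673267",
  "757\tAustralia\tYield\tSugC\t2000\t0\t0\t27.45\t-27.45\t#DIV/0!",
  "8890\tAustralia\tYield\tSunflower seed\t2000\t10494\t1.08\t1.02\t0.06\t5.555555556",
  sep = "\n"
)

# read.delim() splits on tabs and does not treat "#" as a comment character,
# so "#DIV/0!" survives parsing and can be mapped to NA via na.strings.
data <- read.delim(text = txt, na.strings = "#DIV/0!", stringsAsFactors = FALSE)
str(data)
```

With na.strings set this way the relative column stays numeric instead of being read as text.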

I've created the panel data in R:

library(plm)
panel <- pdata.frame(data, index = c("id", "year"), drop.index = FALSE)

Then I've tried this plot in ggplot:

library(ggplot2)
library(viridis)     # provides scale_fill_viridis()
library(hrbrthemes)  # provides theme_ipsum()

ggplot(df, aes(x=year, y=diff, group=area, fill=area)) +
    geom_area() +
    scale_fill_viridis(discrete = TRUE) +
    ggtitle("Yield") +
    theme_ipsum() +
    # theme() must come after theme_ipsum(), otherwise its settings are overridden
    theme(
      legend.position="none",
      panel.spacing = unit(0.1, "lines"),
      strip.text.x = element_text(size = 8),
      plot.title = element_text(size=14)
    ) +
    facet_wrap(~item)

However, it does not show facets by area with, inside each facet, the source1 and source2 values for each crop. I would like to create something like this:

enter image description here

The problem becomes even more complicated because I have many dimensions: area, item, year, and the two y-variables source1 and source2. The most important thing is to create facets in which source1 and source2 appear as lines or bars so that they can be compared. But how do I create facets that show these two y-variables by year, area and item, without producing an overcrowded graph?

Because dc37 asked for examples of the graph, another possible example would be: enter image description here

where source1 and source2 should be 2 bars side by side showing values per year, per item and per area. If you have any other suggestions on how to plot this panel data you are more than welcome.

Marian
  • 3
    It's not really clear what kind of graph you are looking for; can you clarify? Can you show a drawing of the desired output? Also, can you edit your question to provide a reproducible example of your dataset (see here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – dc37 Apr 19 '20 at 20:28
  • 1
    I am guessing facet... maybe check this http://www.sthda.com/english/wiki/ggplot2-facet-split-a-plot-into-a-matrix-of-panels out and maybe you can revise your question – StupidWolf Apr 19 '20 at 20:49
  • I will try to explain. I have 6 regions and 6 crops over a 12-year period. The ggplot code above creates a graph with all regions and crops, which is crowded and hard to read. Out of my sample of 6 regions and 6 crops, I would like to create one graph for the first 2 crops and all 6 regions, another graph for the next 2 crops and 6 regions, and a third graph for the last 2 crops and 6 regions, and then combine the three graphs into one big graph. All of this should use a single panel data set. – Marian Apr 19 '20 at 21:16

1 Answer


A possible solution is to create a variable that groups the crops two by two, which you can then use in the facet_wrap function.

Let's take this fake example, which creates 6 regions with 6 different crops and one value per crop per year for 12 years.

df <- data.frame(Region = rep(paste("Region",1:6, sep = "_"), each = 12*6),
                 Crop = rep(paste("Crop",1:6, sep ="_"), 12*6),
                 Year = rep(paste("Year",1:12, sep ="_"), each = 6),
                 Value = runif(432,min = -1, max = 1))

Now, using dplyr, we can calculate the mean of each crop per region over 12 years and add the panel group:

library(dplyr)

df %>% group_by(Region,Crop) %>%
  summarise(Mean = mean(Value)) %>%
  mutate(Panel = rep(paste("Panel",1:3), each = 2)) 

# A tibble: 36 x 4
# Groups:   Region [6]
   Region   Crop       Mean Panel  
   <fct>    <fct>     <dbl> <chr>  
 1 Region_1 Crop_1  0.0323  Panel 1
 2 Region_1 Crop_2  0.200   Panel 1
 3 Region_1 Crop_3  0.00647 Panel 2
 4 Region_1 Crop_4 -0.243   Panel 2
 5 Region_1 Crop_5 -0.433   Panel 3
 6 Region_1 Crop_6 -0.128   Panel 3
 7 Region_2 Crop_1  0.0655  Panel 1
 8 Region_2 Crop_2 -0.174   Panel 1
 9 Region_2 Crop_3  0.316   Panel 2
10 Region_2 Crop_4 -0.131   Panel 2
# … with 26 more rows
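
Note that rep(paste("Panel",1:3), each = 2) relies on every region having exactly six crops in the same order. As a hedged alternative, the panel label could be derived from each crop's position instead. The sketch below uses a small toy data frame (toy, with arbitrary values) and a hypothetical helper column Crop_Index; neither is part of the example above.

```r
library(dplyr)

# Toy data with the same column names as the summarised example (values are arbitrary).
toy <- expand.grid(Region = paste("Region", 1:2, sep = "_"),
                   Crop   = paste("Crop", 1:6, sep = "_"),
                   stringsAsFactors = FALSE)
toy$Mean <- seq_len(nrow(toy)) / 10

# Number the crops 1..6 (alphabetical factor order), then send crops
# (1, 2), (3, 4) and (5, 6) to panels 1, 2 and 3 respectively.
toy_panel <- toy %>%
  mutate(Crop_Index = as.integer(factor(Crop)),
         Panel = paste("Panel", ceiling(Crop_Index / 2)))

head(toy_panel)
```

Because factor levels are sorted alphabetically, this numbering only matches the intended order while crop names sort the way you expect (for example, "Crop_10" would sort before "Crop_2").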

Now, if you pass this pipe sequence into ggplot, you can separate your crops two by two.

library(ggplot2)

df %>% group_by(Region,Crop) %>%
  summarise(Mean = mean(Value)) %>%
  # label the crops two by two so that each facet shows one pair
  mutate(Panel = rep(paste("Panel",1:3), each = 2)) %>%
  ggplot(aes(x = Region, y = Mean, fill = Crop))+
  geom_col()+
  facet_wrap(~Panel, nrow = 1)+
  theme(legend.position = "top", axis.text.x = element_text(angle = 90, size=12), plot.title = element_text(hjust = 0.5))+
  geom_hline( yintercept=0, linetype="dashed")+
  labs(title="Average growth rate of yield tonne/ha in selected regions")

enter image description here
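
If the goal is to show two columns such as source1 and source2 in the same facet, one common approach is to reshape the data to long format first, so that both sources share a single y column and are told apart by colour or fill. The following is only a sketch under assumptions: it uses tidyr::pivot_longer (tidyr is not used elsewhere in this answer), a few rows copied from the question's table, and illustrative names (crops, crops_long, yield).

```r
library(tidyr)
library(ggplot2)

# Six rows copied from the question's table (Australia, Barley and Maize).
crops <- data.frame(
  area    = "Australia",
  item    = rep(c("Barley", "Maize"), each = 3),
  year    = rep(c(2000, 2010, 2018), times = 2),
  source1 = c(2.02, 1.90, 2.46, 5.95, 6.97, 8.00),
  source2 = c(1.67, 1.75, 1.86, 6.88, 7.54, 8.54)
)

# Stack source1 and source2 into one value column plus a "source" label column.
crops_long <- pivot_longer(crops, cols = c(source1, source2),
                           names_to = "source", values_to = "yield")

# One facet per item/area pair; the two sources sit side by side for each year.
p <- ggplot(crops_long, aes(x = factor(year), y = yield, fill = source)) +
  geom_col(position = "dodge") +
  facet_grid(item ~ area, scales = "free_y") +
  labs(x = "Year", y = "Yield", fill = "Source")
p
```

For lines instead of bars, geom_col could be swapped for geom_line with aes(colour = source, group = source). With many areas and items, filtering to a subset of items before plotting should keep the grid readable.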

Is this what you are looking for?

dc37
  • Many thanks for your help. However, I have a little more complex panel data. – Marian Apr 29 '20 at 23:36
  • Please edit your question to provide some details about your more complex panel data. Please first read this link on how to provide a good reproducible example: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. Based on your initial question, my proposed solution should work. If not, please provide a detailed question and show what the final graph should look like (make a drawing, for example). That will help in giving you the best and most adequate answer – dc37 Apr 30 '20 at 01:09