First ever query here! Hope to get a good response. I am trying to plot the following table in R.

dfr <- data.frame (
Member = c("Old", "New", "Old", "New", "Old", "New", "Old", "New"), 
Years = c(2005,2005, 2006,2006,2013,2013,2014,2014), 
Trust = c(42.3, 56.70, 45.30, 61.40, 26.80, 45.50, 33.5, 50.60), # these are percentages 
mistrust = c(45.50, 28.50, 42, 25.20, 62.70, 42.90, 54.20, 34.20))

For context, `Member` distinguishes old and new EU member states, and `Years` gives the years for which I want to know how much each group trusted the EU.

I want to plot all four variables in a graph that shows how Trust and/or mistrust for each group of members (new and old) varies across the four years.

I hope my question makes sense!

Thanks

  • I'd suggest something like this: https://stackoverflow.com/questions/25070547/ggplot-side-by-side-geom-bar , but you don't have to reshape your data. Put `Years` on the x axis, the `Member` to fill the colour and the `Trust` percentages on the y axis. Then repeat with the `mistrust` percentages on the y axis. I think it will be too much to have both percentages in the same plot. – AntoniosK Nov 08 '17 at 21:37

1 Answer

How about this, using tidyr and ggplot2?

dfr <- data.frame (
  Member = c("Old", "New", "Old", "New", "Old", "New", "Old", "New"), 
  Years = c(2005,2005, 2006,2006,2013,2013,2014,2014), 
  Trust = c(42.3, 56.70, 45.30, 61.40, 26.80, 45.50, 33.5, 50.60), # these are percentages 
  mistrust = c(45.50, 28.50, 42, 25.20, 62.70, 42.90, 54.20, 34.20))

Your data are in wide format, where column names give information. We need it in long format, for which we use 'gather'.

library(tidyverse)

# Stack Trust and mistrust into a Type/Score pair; Member and Years are kept as they are
dfr.g <- dfr %>%
  gather(key=Type, value=Score, Trust, mistrust)
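
In recent tidyr releases `gather()` is superseded by `pivot_longer()`. A minimal sketch of the same reshaping, assuming tidyr 1.0.0 or later is installed (the name `dfr.long` is only illustrative and is not used elsewhere in this answer):

```r
library(tidyr)

dfr <- data.frame(
  Member = c("Old", "New", "Old", "New", "Old", "New", "Old", "New"),
  Years = c(2005, 2005, 2006, 2006, 2013, 2013, 2014, 2014),
  Trust = c(42.3, 56.70, 45.30, 61.40, 26.80, 45.50, 33.5, 50.60),
  mistrust = c(45.50, 28.50, 42, 25.20, 62.70, 42.90, 54.20, 34.20))

# Stack the Trust and mistrust columns into a Type/Score pair;
# Member and Years are repeated for each stacked row
dfr.long <- pivot_longer(dfr, cols = c(Trust, mistrust),
                         names_to = "Type", values_to = "Score")
```

The rows come out in a different order than with `gather()`, which should not matter for plotting.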

Then we make a parallel plot in ggplot2, as shown.

ggplot(dfr.g,
       aes(x=Years, y=Score, colour=Type,
           group=interaction(Member,Type), fill=Member)) +
  geom_line() +
  geom_point(size=2, shape=21, colour="grey50") +
  scale_fill_manual(values=c("black","white")) +
  scale_x_continuous(breaks=unique(dfr.g$Years))  # put ticks on the surveyed years only

Derived from this answer - How to plot parallel coordinates with multiple categorical variables in R
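
If both measures in one panel look too crowded, as the comment on the question suggests they might, one option is to give Trust and mistrust a panel each. This is only a sketch of that idea, not the only way to do it; the object name `p` is illustrative, and the data are repeated so the block runs on its own:

```r
library(tidyr)
library(ggplot2)

dfr <- data.frame(
  Member = c("Old", "New", "Old", "New", "Old", "New", "Old", "New"),
  Years = c(2005, 2005, 2006, 2006, 2013, 2013, 2014, 2014),
  Trust = c(42.3, 56.70, 45.30, 61.40, 26.80, 45.50, 33.5, 50.60),
  mistrust = c(45.50, 28.50, 42, 25.20, 62.70, 42.90, 54.20, 34.20))

# Long format: one row per Member, Years and Type
dfr.g <- gather(dfr, key = Type, value = Score, Trust, mistrust)

# One panel per Type, one line per Member group
p <- ggplot(dfr.g, aes(x = Years, y = Score, colour = Member, group = Member)) +
  geom_line() +
  geom_point(size = 2) +
  facet_wrap(~ Type) +
  scale_x_continuous(breaks = unique(dfr.g$Years))

print(p)
```

Here colour separates old and new members, while the panels separate the two measures, so each line only has to carry one piece of information.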

astaines