
I'm trying to teach myself R by analysing a dataset called audit_data. It looks something like this (there are actually many more rows and columns):

id          date       time_1       time_2     time_3    description   comment
 1   10-Oct-2017          120          135         75        xxxxxxx    xxxxxx
 2   05-Jun-2017          140          120         90        xxxxxxx    xxxxxx
 3   31-Aug-2017          133          215        104        xxxxxxx    xxxxxx
 4   22-Sep-2017           95          110        127        xxxxxxx    xxxxxx

I want to create a grouped histogram with ggplot2 showing just the data from columns time_1 and time_3 paired together side-by-side.

I think that in order to do this I need to create a new data set (say audit_data_new) that looks something like this, but I am struggling to get even this far.

time       value
time_1     120
time_1     140
time_1     133
time_1      95
time_3      75
time_3      90
time_3     104
time_3     127

After doing this I plan to use a command like:

ggplot(audit_data_new, aes(value, fill=time)) + geom_histogram(position='dodge')

Any advice on how to massage my data into the correct format first?

brad
  • you probably want "melt" https://www.r-bloggers.com/melt/ – qwr Nov 30 '18 at 05:06
  • See https://stackoverflow.com/questions/3777174/plotting-two-variables-as-lines-using-ggplot2-on-the-same-graph – qwr Nov 30 '18 at 05:09
  • I'd looked at melt and didn't think it would do what I want, but your link explained it better. melt(audit_data, c('id'), c('time_1', 'time_3')) seems to do what I want. Thanks. – brad Nov 30 '18 at 05:51
  • Possible duplicate of "Reshaping data.frame from wide to long format" https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format – Carlos Eduardo Lagosta Nov 30 '18 at 09:48

1 Answer


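I'm not sure this is exactly what you want, but I hope it helps. The idea is to select the two columns, gather them into long format, and pipe the result straight into ggplot.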

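library(tidyverse)

# Simulated stand-in for audit_data: three numeric timing columns
df <- data.frame(time_1 = rnorm(100, 60),
                 time_2 = rnorm(100, 61),
                 time_3 = rnorm(100, 62))

# Keep the two columns of interest, reshape them to long format
# (a "time" key column and a "value" column), then plot
df %>% select(time_1, time_3) %>%
  gather(time, value) %>%
  ggplot(aes(value, fill = time)) +
    geom_histogram(colour = 1, position = "dodge")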

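For comparison, here is a minimal sketch of the same reshape applied to the sample rows from the question, using tidyr's pivot_longer() instead of gather(). The data frame is typed in by hand from the four rows shown (the description and comment columns are left out), and bins = 10 is an arbitrary choice for such a small sample, so treat this as an illustration under those assumptions rather than a drop-in solution.

```r
library(tidyr)
library(ggplot2)

# Hand-typed copy of the four sample rows from the question
# (description and comment columns omitted for brevity)
audit_data <- data.frame(
  id     = 1:4,
  date   = c("10-Oct-2017", "05-Jun-2017", "31-Aug-2017", "22-Sep-2017"),
  time_1 = c(120, 140, 133, 95),
  time_2 = c(135, 120, 215, 110),
  time_3 = c(75, 90, 104, 127)
)

# Stack time_1 and time_3 into a "time" key column and a "value" column
audit_data_new <- pivot_longer(
  audit_data,
  cols      = c(time_1, time_3),
  names_to  = "time",
  values_to = "value"
)

# Keep only the two columns the question asks for, grouped by time
audit_data_new <- audit_data_new[, c("time", "value")]
audit_data_new <- audit_data_new[order(audit_data_new$time), ]

# Same plotting call as in the question; bins = 10 is an assumed value
p <- ggplot(audit_data_new, aes(value, fill = time)) +
  geom_histogram(position = "dodge", bins = 10)
print(p)
```

With real data you would pass your own audit_data and only change the column names in cols.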

Darren Tsai