I'm trying to make a simple plot with ggplot2, but this code is not working.

library(ggplot2)

ggplot(data = all_trips) +
    geom_bar(mapping = aes(x = trip_duration,
                           fill = member_casual)) +
    labs(title = "Distribution by Trip duration")

Here is a snapshot of the data types in my data frame: [image: the snapshot of data type].

I'm a newbie so I don't know if I should add more information. Thanks in advance.

Andrea M
  • 2,314
  • 1
  • 9
  • 27
samir
  • 1
  • 1
  • Hello and welcome to SO! It always helps to have the data that was used in the code, or another example of data, so the problem can be reproduced. My first guess is that you should lose the `mapping=`. What happens if you do `ggplot(data=all_trips)+geom_bar(aes(x=trip_duration, fill=member_casual))+ labs(title = "Distribution by Trip duration")`? – Omniswitcher Aug 26 '22 at 13:00
  • Without an excerpt of your raw data it is really hard to troubleshoot. Perhaps you can extract a representative sample (not all of your 4 million rows!) for us to play around with? – shghm Aug 26 '22 at 13:02
  • Please [do not post code or data in images](https://meta.stackoverflow.com/q/285551/2372064). Share sample data in a [reproducible format](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – MrFlick Aug 26 '22 at 13:29

1 Answer

I think the issue lies in the choice of geom_ function. Your title says you want to show a distribution. Since trip_duration is a continuous variable, you need a histogram (geom_histogram), not a bar chart (geom_bar), which is suited to showing the frequencies of a categorical variable. geom_bar counts the rows at each distinct x value, so with a continuous variable almost every value ends up with its own bar.
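To see why the geom matters, here is a rough sketch in base R of what each geom ends up counting. The values are simulated stand-ins for trip_duration, and cut() only approximates how geom_histogram() places its default 30 bins.

```r
set.seed(42)
# Simulated stand-in for trip_duration: 100 continuous values
trip_duration <- rnorm(n = 100, mean = 594, sd = 60)

# geom_bar() counts rows per distinct x value;
# with continuous data nearly every value is distinct
n_distinct_values <- length(unique(trip_duration))
print(n_distinct_values)

# geom_histogram() counts rows per bin instead;
# cut() roughly mimics that with 30 equal-width bins
bin_counts <- table(cut(trip_duration, breaks = 30))
print(bin_counts)
```

With one bar per distinct value, a bar chart of a continuous variable shows little more than a row of bars of height 1, whereas the binned counts reveal the shape of the distribution.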

I recommend using From Data to Viz to check which visualisation best suits your data types.

I'll first create a minimal reproducible example for you. Please do this yourself next time you ask a question: as @MrFlick said in the comments, we can't load a screenshot into R (see his links for how to do this).

library(ggplot2)

# Create reproducible example
set.seed(42)
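all_trips <- data.frame(
  trip_duration = rnorm(n = 100, mean = 594, sd = 60),
  member_casual = sample(c("member", "casual"), size = 100, replace = TRUE),
  # Assumed categorical column (values are illustrative) for the bar chart below
  rideable_type = sample(c("classic_bike", "electric_bike"), size = 100, replace = TRUE)
)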

# Plot
ggplot(data = all_trips) +
  geom_histogram(mapping = aes(x = trip_duration,
                               fill = member_casual))
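By default the histogram stacks the two groups on top of each other, which can make them hard to compare. As a possible variation, the sketch below overlays them with some transparency. The data are simulated, and binwidth = 30 is an assumed bin size that you would tune to your own data.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in for the real data
all_trips <- data.frame(
  trip_duration = rnorm(n = 100, mean = 594, sd = 60),
  member_casual = sample(c("member", "casual"), size = 100, replace = TRUE)
)

# binwidth = 30 is an assumed bin size in the units of trip_duration;
# position = "identity" overlays the groups instead of stacking them
p <- ggplot(data = all_trips) +
  geom_histogram(mapping = aes(x = trip_duration, fill = member_casual),
                 binwidth = 30, position = "identity", alpha = 0.5) +
  labs(title = "Distribution by Trip duration")

print(p)
```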

Compare this with a bar chart, which suits categorical data such as rideable_type:

ggplot(data = all_trips) +
  geom_bar(mapping = aes(x = rideable_type,
                         fill = member_casual))

Created on 2022-08-27 by the reprex package (v2.0.1)
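For future questions, one way to share data as text instead of a screenshot is sketched below. It assumes a data frame shaped like the simulated one here; the sample size of 10 is arbitrary, and the real all_trips will have more columns.

```r
set.seed(42)
# Simulated stand-in for the real all_trips data frame
all_trips <- data.frame(
  trip_duration = round(rnorm(n = 1000, mean = 594, sd = 60)),
  member_casual = sample(c("member", "casual"), size = 1000, replace = TRUE)
)

# Take a small random sample rather than posting every row
trips_sample <- all_trips[sample(nrow(all_trips), size = 10), ]

# str() prints the column types as text, so no screenshot is needed
str(trips_sample)

# dput() prints R code that recreates the sample;
# paste that output into the question
dput(trips_sample)
```

Anyone answering can then assign the pasted dput() output to a variable and run your plotting code against it.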

Andrea M