I've tried to layer multiple geom_area calls, one for each y-axis variable, with Year on the x-axis. I'm very new to R, so sorry if this is something very simple.

This was my result

What's the best way to stack these area charts?

c <- ggplot(data=fbi, aes(x=Year))
c + geom_area(aes(y=Aggravated..assault, fill="Orange")) +
  geom_area(aes(y=Robbery, fill="Blue")) +
  geom_area(aes(y=Rape..legacy..definition4., fill="Red")) +
  geom_area(aes(y=Murder.and.nonnegligent..manslaughter, fill="Green")) 

This is the structure of the data for added context:

 $ Year                                         : num  1997 1998 1999 2000 2001 ...
 $ Population                                   : num  2.68e+08 2.70e+08 2.73e+08 2.81e+08 2.85e+08 ...
 $ Violent.crime                                : num  5.37e+08 5.42e+08 5.47e+08 5.64e+08 5.72e+08 ...
 $ Violent..crime..rate.                        : num  8.07e+08 8.14e+08 8.21e+08 8.47e+08 8.59e+08 ...
 $ Murder.and.nonnegligent..manslaughter        : int  18208 16974 15522 15586 16037 16229 16528 16148 16740 17309 ...
 $ Murder.and..nonnegligent..manslaughter..rate.: num  6.8 6.3 5.7 5.5 5.6 5.6 5.7 5.5 5.6 5.8 ...
 $ Rape..revised..definition3.                  : int  NA NA NA NA NA NA NA NA NA NA ...
 $ Rape..revised..definition...rate             : num  NA NA NA NA NA NA NA NA NA NA ...
 $ Rape..legacy..definition4.                   : int  96153 93144 89411 90178 90863 95235 93883 95089 94347 94472 ...
 $ Rape..legacy..definition...rate              : num  35.9 34.5 32.8 32 31.8 33.1 32.3 32.4 31.8 31.6 ...
 $ Robbery                                      : int  498534 447186 409371 408016 423557 420806 414235 401470 417438 449246
Yaakov Bressler
  • Hi Joseph. Welcome to SO. First. What exactly is the issue? Second. From your code I would suggest to convert your data from wide to long format. Finally, to help us to help you I would suggest to make your example reproducible by adding a snippet of your data, e.g. type `dput(head(fbi, 20))` (for the first 20 rows of data) into the R console and paste the output starting with `structure(...` into your post. – stefan Sep 12 '20 at 17:47

1 Answer


Here is an example of a possible solution. The test data set is at the end.
This type of problem generally has to do with reshaping the data. The data is in wide format and it should be in long format. See this post on how to reshape the data from wide to long format.
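To make the wide versus long distinction concrete, here is a minimal base R sketch. The `wide` data frame and its numbers are hypothetical toy values, not the FBI data, and `stack()` is used only to keep the example free of extra packages.

```r
# Hypothetical toy data in wide format: one column per crime type.
wide <- data.frame(
  Year    = c(1997, 1998, 1999),
  Robbery = c(10, 12, 11),
  Murder  = c(2, 3, 1)
)

# Every column except Year holds a measurement.
crime_cols <- setdiff(names(wide), "Year")

# stack() puts all measurement columns into a single `values` column
# and records the source column name in `ind`.
long <- cbind(
  Year = rep(wide$Year, times = length(crime_cols)),
  stack(wide[crime_cols])
)
names(long) <- c("Year", "value", "variable")

long
```

In the long result each row is one Year/crime pair, which is the shape that `aes(Year, value, fill = variable)` expects.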

In this case, before reshaping I will compute the accumulated total crimes per year, that is, the cumsum by rows of the crime variables. With the data I am using they are columns 1 to 4.

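library(ggplot2)

# Accumulate the crime columns row by row (columns 1 to 4 in the test data)
fbi[1:4] <- t(apply(fbi[1:4], 1, cumsum))
# Reshape from wide to long: one row per Year/crime pair
fbi_long <- reshape::melt(fbi, id.vars = "Year")
head(fbi_long)

# One filled area per crime type
ggplot(fbi_long, aes(Year, value, fill = variable)) +
  geom_area()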
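One thing worth checking: `geom_area()` uses `position = "stack"` by default, so it already piles the series on top of one another. If the values were accumulated with `cumsum` first, the plotted heights may come out larger than the real totals. The sketch below, on hypothetical toy numbers, lets ggplot2 do the stacking on the raw values and inspects the computed heights with `layer_data()`.

```r
library(ggplot2)

# Hypothetical toy data, already in long format, with raw (not cumulated) values.
long <- data.frame(
  Year     = rep(1:3, times = 2),
  variable = rep(c("Robbery", "Murder"), each = 3),
  value    = c(10, 12, 11, 2, 3, 1)
)

# No cumsum here: the default position = "stack" adds the series up.
p <- ggplot(long, aes(Year, value, fill = variable)) +
  geom_area()

# layer_data() returns the values ggplot2 computed for the layer;
# ymax holds the top edge of each stacked band.
built <- layer_data(p)
max(built$ymax)
```

If the maximum of `ymax` matches the largest yearly total in the raw data, the stacking is being done once rather than twice.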


Edit

In order to plot only certain crime types, something like the following can be used.

Define a vector of the crimes to show up in the plot, then subset the data based on that vector.

crimes_to_plot <- c("Aggravated assault", "Robbery")

ggplot(subset(fbi_long, variable %in% crimes_to_plot), 
              aes(Year, value, fill = variable)) +
  geom_area()

Or, a tidyverse solution. This is a pipe from the original fbi data set.

library(dplyr)
library(tidyr)

fbi %>%
  pivot_longer(
    cols = -Year,
    names_to = "variable",
    values_to = "value"
  ) %>%
  filter(variable %in% crimes_to_plot) %>%
  ggplot(aes(Year, value, fill = variable)) +
  geom_area()
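Another option, in the same spirit, is to pick the columns before reshaping instead of filtering afterwards. This is only a sketch: `fbi_toy` and its values are hypothetical, and the column names simply mirror the test data.

```r
library(tidyr)

# Hypothetical wide data; check.names = FALSE keeps the space in the column name.
fbi_toy <- data.frame(
  Year                 = 1:3,
  `Aggravated assault` = c(5, 6, 7),
  Robbery              = c(10, 12, 11),
  Murder               = c(2, 3, 1),
  check.names = FALSE
)

crimes_to_plot <- c("Aggravated assault", "Robbery")

# Keep Year plus the chosen crime columns, then reshape everything except Year.
fbi_subset_long <- pivot_longer(
  fbi_toy[c("Year", crimes_to_plot)],
  cols      = -Year,
  names_to  = "variable",
  values_to = "value"
)

fbi_subset_long
```

Selecting first means the unwanted columns never reach the long data, so no `filter()` step is needed before plotting.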

Test data

fbi <- iris[1:20, 1:4]
fbi$Year <- seq.int(nrow(fbi))
names(fbi)[1:4] <- c("Aggravated assault", "Robbery", "Rape", "Murder")
Rui Barradas
  • Thank you very much. Reshaping the data was the answer. – Joseph Obonyo Sep 12 '20 at 18:50
  • How can I choose exactly which columns go into the slicing? such as fbi[1:4] . Or, when using the value object in aes(Year, value), how can I choose which columns to include? – Joseph Obonyo Sep 12 '20 at 19:12
  • @JosephObonyo The columns that go into the slicing are all the crime columns. After reshaping, to include only some types of crimes, use `ggplot(subset(fbi_long, variable %in% cols), etc)` where `cols` is a vector of types of crimes (the column names before reshaping). – Rui Barradas Sep 12 '20 at 19:23
  • Very helpful @RuiBarradas. I also did some more digging and was able to use measure.vars = c("Robbery", "Rape") as a part of the melt function. – Joseph Obonyo Sep 12 '20 at 19:30
  • @JosephObonyo I have edited with possible solutions to that. `measure.vars` is not one of them but it certainly is a good idea. – Rui Barradas Sep 12 '20 at 19:32