Aesthetics must be either length 1 or the same as the data (5): fill, y and axis1

Any suggestions would be appreciated. Thank you!

Here's the background:

dim(fly)
Rows: 21,000
Columns: 4

head(fly,5)

Date        Airport              Count    Type
2022-01-02  Brussels             256      Arrival   
2022-01-24  Charleroi            84       Departure
2022-02-03  Berlin              148       Departure 
2022-03-18  Dresden               95      Arrival   
2022-03-19  Erfurt                29      Departure 
2022-04-01  Frankfurt           391       Departure
structure(list(Date = structure(c(1640995200, 1640995200, 
1640995200, 1640995200, 1640995200, 1640995200, 1640995200, 1640995200, 
1640995200, 1640995200), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), Airport = c("Brussels", "Charleroi", "Berlin - Brandenburg", 
"Dresden", "Erfurt", "Frankfurt", "Muenster-Osnabrueck", "Hamburg", 
"Cologne-Bonn", "Dusseldorf"), Count = c(148, 54, 148, 
5, 0, 391, 6, 78, 60, 103), Type = c("Departure", "Departure", 
"Arrival", "Departure", "Arrival", "Arrival", "Departure", 
"Arrival", "Departure", "Arrival")), row.names = c(NA, -10L
), class = c("tbl_df", "tbl", "data.frame"))

The requirement is to compare arrival flight counts for only three airports (Brussels, London, and Dresden) using an alluvial plot.

The below code works but it's producing the total (5 months) count instead of the total for each month/airport.

df_fly <- filter(fly, Airport %in% c("Brussels", "Dresden", "London"), Type == "Arrival") %>%
  group_by(Airport) %>%
  summarise(Count = sum(Count))
df_fly <- as.data.frame(df_fly)
ggplot(df_fly, aes(y = Count, axis1 = Airport, axis2 = Count)) +
  geom_alluvium(aes(fill = Airport), width = 1/8) +
  geom_stratum(width = 1/8, fill = "black", color = "grey") +
  geom_label(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Airport", "Count"), expand = c(.05, .05)) +
  scale_fill_brewer(type = "qual", palette = "Set2") +
  ggtitle("Arrival Flight Comparison")

I then tried using this one to populate in a monthly manner per airport but it produced an error:

Aesthetics must be either length 1 or the same as the data (5): fill, y and axis1

df_fly <- filter(fly, Airport %in% c("Brussels", "Dresden", "London"), Type == "Arrival") %>%
  group_by(month = lubridate::floor_date(Date, 'month')) %>%
  summarize(Count = sum(Count))

df_fly <- as.data.frame(df_fly)
ggplot(df_fly, aes(y = Count, axis1 = Airport, axis2 = Count)) +
  geom_alluvium(aes(fill = Airport), width = 1/8) +
  geom_stratum(width = 1/8, fill = "black", color = "grey") +
  geom_label(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Airport", "Count"), expand = c(.05, .05)) +
  scale_fill_brewer(type = "qual", palette = "Set2") +
  ggtitle("Arrival Flight Comparison")
  • You `group_by` only by `month`, i.e. my guess is that after the `summarise` there is most likely no `Airport` column in your dataset. Perhaps you could check that. – stefan Oct 28 '22 at 03:59
  • If that does not fix the issue I would suggest to provide [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including a snippet of your dataset `df_fly` via the `dput()` function, i.e. type `dput(df_fly)` into the console and copy the output starting with `structure(....` into your post. If your dataset has a lot of observations you could do e.g. `dput(head(df_fly, 10))` for the first ten rows of data. – stefan Oct 28 '22 at 04:00
  • Hi @stefan - I've updated the description and Airport is available in the dataset. Thank you. – Francis-18 Oct 28 '22 at 04:10
  • Yep. It's present in `fly`. And in your first code also in `df_fly`. But when I do `filter(fly, Type =="Arrival") %>% group_by(month = lubridate::floor_date(Date, 'month')) %>%summarize(Count = sum(Count))` I get a df with two columns: `month` and `Count`. And hence in my case I get an error `object 'Airport' not found`. – stefan Oct 28 '22 at 04:25
  • Running your code does indeed give me the month and Count, with no error. ```df_fly <- filter(fly, Airport%in% c("Brussels", "Dresden", "London"), Type =="Arrival") %>% group_by(month = lubridate::floor_date(Date, 'month')) %>%summarize(Count = sum(Count))``` When I use the df_fly in the ggplot, it consistently generates the Aesthetics error. Thoughts? – Francis-18 Oct 28 '22 at 04:36
  • Yeah. That's what I mean. There is no `Airport` column in the data after the `summarise`. Hence, my guess is that you have an object (most likely a vector) called `Airport` in your global environment. And when ggplot does not find `Airport` in the data it will take this object from the global env. Hence you get the mysterious error about aes while I get an error `object not found`. – stefan Oct 28 '22 at 04:43
  • I don't see any Airport whatsoever in my global environment. In the ggplot code, y = Date, axis1 = Airport, and that seems to be what the error is referring to **(Aesthetics must be either length 1 or the same as the data (5): fill, y and axis1)**, right? Is the (5) relevant here, like it's expecting 5 values? Sorry, I'm not too familiar with this yet. – Francis-18 Oct 28 '22 at 04:57
  • in addition: there will be 5 months - January to May in this dataset, not sure if that's where the 5 is pertaining to. – Francis-18 Oct 28 '22 at 05:21
  • Yep. The 5 in the error message means that your data, aka df_fly, has 5 rows, and yes, that is the number of months. And the error means that you are mapping something on y, fill and axis1 which has a length different from 5. The most likely reason is that you use something which is not part of the data, i.e. a variable from the global env, e.g. run `color <- c("red", "blue"); ggplot(mtcars, aes(hp, mpg, color = color)) + geom_point()` and you will see what I mean. – stefan Oct 28 '22 at 05:37
  • The mapping explanation sounds right. However, if ggplot is looking to match 5 (months) with the rest of the dataset, then the code will never work (I guess). Any more suggestions on what to add to make my code work? Thanks! – Francis-18 Oct 28 '22 at 05:57
  • Well, as you want a breakdown by month and Airport you could do `... %>% group_by(Airport, month = lubridate::month(Date)) %>% summarize(Count = sum(Count))`. I'm not sure what you want to show with the alluvial plot, but after doing so I would use `axis2=month`. – stefan Oct 28 '22 at 06:06

1 Answer

So it's gonna be a bit hard to debug without a reprex. My first suggestion would be to try formatting your code so it's a bit easier to read; vertical space is free! I had a go at rewriting it how I would below.

I think the main issue is the group_by() call: you want to group by each combination of airport and month, so it should be group_by(Airport, month). Without grouping by Airport as well, the Airport column is gone after summarising, hence the error in ggplot, which cannot find Airport in the data. Let me know if the code below works:

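library(dplyr)
library(ggplot2)
library(ggalluvial)

# Keep arrivals at the three airports, then total Count per airport and month
df_fly = fly %>%
    filter(
        Airport %in% c("Brussels", "Dresden", "London"),
        Type == "Arrival"
        ) %>%
    group_by(
        Airport,
        month = lubridate::floor_date(Date, 'month')
        ) %>%
    summarize(Count = sum(Count), .groups = "drop")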

df_fly %>%
    ggplot(aes(y = Count, axis1 = Airport, axis2 = Count)) +
    geom_alluvium(
        aes(fill = Airport),
        width = 1/8
        ) +
    geom_stratum(
        width = 1/8,
        fill = "black",
        color = "grey"
        ) +
    geom_label(
        stat = "stratum",
        aes(label = after_stat(stratum))
        ) +
    scale_x_discrete(
        limits = c("Airport", "Count"),
        expand = c(.05, .05)) +
    scale_fill_brewer(
        type = "qual",
        palette = "Set2"
        ) +
    ggtitle("Arrival Flight Comparison")
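
If you want to see for yourself why the Airport column goes missing, here is a small sketch. Note that toy_fly is a made-up stand-in for your fly data (the values are invented, and I'm assuming dplyr and lubridate are installed), so treat it as an illustration rather than your actual result:

```r
library(dplyr)

# toy_fly is a hypothetical stand-in for `fly`; the values are invented
toy_fly <- tibble(
    Date = as.POSIXct(
        c("2022-01-02", "2022-01-24", "2022-02-03", "2022-02-18"),
        tz = "UTC"
        ),
    Airport = c("Brussels", "Dresden", "Brussels", "Dresden"),
    Count = c(10, 20, 30, 40)
    )

# Grouping by month only: summarise() keeps the grouping column and the
# summary column, so Airport is dropped
by_month <- toy_fly %>%
    group_by(month = lubridate::floor_date(Date, "month")) %>%
    summarise(Count = sum(Count))
names(by_month)

# Grouping by Airport and month: both columns survive, one row per combination
by_airport_month <- toy_fly %>%
    group_by(Airport, month = lubridate::floor_date(Date, "month")) %>%
    summarise(Count = sum(Count), .groups = "drop")
names(by_airport_month)
```

The first names() call lists only month and Count, so any aes() that mentions Airport has nothing in the data to map to. The second one keeps Airport, which is what the plot needs.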
SpikyClip