1

I'm making a wind rose plot for wind speed/direction data, and found a very cool custom function.

[https://stackoverflow.com/questions/17266780/wind-rose-with-ggplot-r/17266781#17266781]

Everything runs smoothly, but when the plot comes up, there is a chunk of NA data that shows up. There are no NAs in the dataframe, and I can't think of why there is an NA label with so much data in it...

Thank you!

Code below:

plot.windrose <- function(data,
                          spd,
                          dir,
                          spdres = 2,
                          dirres = 22.5,
                          spdmin = 0,
                          spdmax = 18,
                          spdseq = NULL,
                          palette = "YlGnBu",
                          countmax = NA,
                          debug = 0){
  
  
  # Look to see what data was passed in to the function
  if (is.numeric(spd) & is.numeric(dir)){
    # assume that we've been given vectors of the speed and direction vectors
    data <- data.frame(spd = spd,
                       dir = dir)
    spd = "spd"
    dir = "dir"
  } else if (exists("data")){
    # Assume that we've been given a data frame, and the name of the speed 
    # and direction columns. This is the format we want for later use.    
  }  
  
  # Tidy up input data ----
  n.in <- NROW(data)
  dnu <- (is.na(data[[spd]]) | is.na(data[[dir]]))
  data[[spd]][dnu] <- NA
  data[[dir]][dnu] <- NA
  
  # figure out the wind speed bins ----
  if (missing(spdseq)){
    spdseq <- seq(spdmin,spdmax,spdres)
  } else {
    if (debug >0){
      cat("Using custom speed bins \n")
    }
  }
  # get some information about the number of bins, etc.
  n.spd.seq <- length(spdseq)
  n.colors.in.range <- n.spd.seq - 1
  
  # create the color map
  spd.colors <- colorRampPalette(brewer.pal(min(max(3,
                                                    n.colors.in.range),
                                                min(9,
                                                    n.colors.in.range)),                                               
                                            palette))(n.colors.in.range)
  
  if (max(data[[spd]],na.rm = TRUE) > spdmax){    
    spd.breaks <- c(spdseq,
                    max(data[[spd]],na.rm = TRUE))
    spd.labels <- c(paste(c(spdseq[1:n.spd.seq-1]),
                          '-',
                          c(spdseq[2:n.spd.seq])),
                    paste(spdmax,
                          "-",
                          max(data[[spd]],na.rm = TRUE)))
    spd.colors <- c(spd.colors, "grey50")
  } else{
    spd.breaks <- spdseq
    spd.labels <- paste(c(spdseq[1:n.spd.seq-1]),
                        '-',
                        c(spdseq[2:n.spd.seq]))    
  }
  data$spd.binned <- cut(x = data[[spd]],
                         breaks = spd.breaks,
                         labels = spd.labels,
                         ordered_result = TRUE)
  
  # figure out the wind direction bins
  dir.breaks <- c(-dirres/2,
                  seq(dirres/2, 360-dirres/2, by = dirres),
                  360+dirres/2)  
  dir.labels <- c(paste(360-dirres/2,"-",dirres/2),
                  paste(seq(dirres/2, 360-3*dirres/2, by = dirres),
                        "-",
                        seq(3*dirres/2, 360-dirres/2, by = dirres)),
                  paste(360-dirres/2,"-",dirres/2))
  # assign each wind direction to a bin
  dir.binned <- cut(data[[dir]],
                    breaks = dir.breaks,
                    ordered_result = TRUE)
  levels(dir.binned) <- dir.labels
  data$dir.binned <- dir.binned
  
  # Run debug if required ----
  if (debug>0){    
    cat(dir.breaks,"\n")
    cat(dir.labels,"\n")
    cat(levels(dir.binned),"\n")
    
  }  
  
  # create the plot ----
  p.windrose <- ggplot(data = data,
                       aes(x = dir.binned,
                           fill = spd.binned
                           ,y = (..count..)/sum(..count..)
                       ))+
    geom_bar() + 
    scale_x_discrete(drop = FALSE,
                     labels = c("N","NNE","NE","ENE", "E", 
                                "ESE", "SE","SSE", 
                                "S","SSW", "SW","WSW", "W", 
                                "WNW","NW","NNW")) +
    coord_polar(start = -((dirres/2)/360) * 2*pi) +
    scale_fill_manual(name = "Wind Speed (m/s)", 
                      values = spd.colors,
                      drop = FALSE) +
    theme(axis.title.x = element_blank()) + #I can put in my own theme settings
    scale_y_continuous(labels = percent) + # these are lined up with North bar, showing percent of time in a bin
    ylab("Frequency")
  
  # adjust axes if required
  if (!is.na(countmax)){
    p.windrose <- p.windrose +
      ylim(c(0,countmax))
  }
  
  # print the plot
  print(p.windrose)  
  
  # return the handle to the wind rose
  return(p.windrose)
}

plot.windrose(data = SR_plaster_wind_inst1,
              spd = SR_plaster_wind_inst1$wind_speed,
              dir = SR_plaster_wind_inst1$wind_direction,
              spdmin = 0,
              spdmax = 18,
              spdres = 2,
              dirres = 22.5,
              palette = "YlGnBu")

enter image description here enter link description here

carl_tom
  • 11
  • 4

1 Answers1

0

Edit

Thanks for adding the link to your data, it makes it a lot easier to troubleshoot when you can replicate the issue. The problem appears to be caused when 'wind_speed = 0'. If you change the line spd.breaks <- spdseq to spd.breaks <- spdseq - 0.001 any wind_speed values that equal zero are included on the figure, instead of being labelled "NA", i.e.

library(tidyverse)
library(scales)
#> 
#> Attaching package: 'scales'
#> The following object is masked from 'package:purrr':
#> 
#>     discard
#> The following object is masked from 'package:readr':
#> 
#>     col_factor
library(RColorBrewer)

df <- read_csv("https://raw.githubusercontent.com/c4hendrickson/SFSU_Thesis/main/Data/SR_plaster_wind_inst1.csv")
#> Rows: 1158 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dbl (2): wind_speed, wind_direction
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

plot.windrose <- function(data,
                          spd,
                          dir,
                          spdres = 2,
                          dirres = 22.5,
                          spdmin = 0,
                          spdmax = 18,
                          spdseq = NULL,
                          palette = "YlGnBu",
                          countmax = NA,
                          debug = 0){
  
  
  # Look to see what data was passed in to the function
  if (is.numeric(spd) & is.numeric(dir)){
    # assume that we've been given vectors of the speed and direction vectors
    data <- data.frame(spd = spd,
                       dir = dir)
    spd = "spd"
    dir = "dir"
  } else if (exists("data")){
    # Assume that we've been given a data frame, and the name of the speed 
    # and direction columns. This is the format we want for later use.    
  }  
  
  # Tidy up input data ----
  n.in <- NROW(data)
  dnu <- (is.na(data[[spd]]) | is.na(data[[dir]]))
  data[[spd]][dnu] <- NA
  data[[dir]][dnu] <- NA
  
  # figure out the wind speed bins ----
  if (missing(spdseq)){
    spdseq <- seq(spdmin, spdmax, spdres)
  } else {
    if (debug > 0){
      cat("Using custom speed bins \n")
    }
  }
  # get some information about the number of bins, etc.
  n.spd.seq <- length(spdseq)
  n.colors.in.range <- n.spd.seq - 1
  
  # create the color map
  spd.colors <- colorRampPalette(brewer.pal(min(max(3,
                                                    n.colors.in.range),
                                                min(9,
                                                    n.colors.in.range)),                                               
                                            palette))(n.colors.in.range)
  
  if (max(data[[spd]], na.rm = TRUE) > spdmax){    
    spd.breaks <- c(spdseq,
                    max(data[[spd]], na.rm = TRUE))
    spd.labels <- c(paste(c(spdseq[1:n.spd.seq-1]),
                          '-',
                          c(spdseq[2:n.spd.seq])),
                    paste(spdmax,
                          "-",
                          max(data[[spd]],na.rm = TRUE)))
    spd.colors <- c(spd.colors, "grey50")
  } else{
    spd.breaks <- spdseq
    spd.labels <- paste(c(spdseq[1:n.spd.seq-1]),
                        '-',
                        c(spdseq[2:n.spd.seq]))    
  }
  data$spd.binned <- cut(x = data[[spd]],
                         breaks = spd.breaks,
                         labels = spd.labels,
                         ordered_result = TRUE)
  
  # figure out the wind direction bins
  dir.breaks <- c(-dirres/2,
                  seq(dirres/2, 360-dirres/2, by = dirres),
                  360+dirres/2)
  dir.labels <- c(paste(360-dirres/2,"-",dirres/2),
                  paste(seq(dirres/2, 360-3*dirres/2, by = dirres),
                        "-",
                        seq(3*dirres/2, 360-dirres/2, by = dirres)),
                  paste(360-dirres/2,"-",dirres/2))
  # assign each wind direction to a bin
  dir.binned <- cut(data[[dir]],
                    breaks = dir.breaks,
                    ordered_result = TRUE)
  levels(dir.binned) <- dir.labels
  data$dir.binned <- dir.binned
  
  # Run debug if required ----
  if (debug>0){    
    cat(dir.breaks,"\n")
    cat(dir.labels,"\n")
    cat(levels(dir.binned),"\n")
    
  }
  
  # create the plot ----
  p.windrose <- ggplot(data = data,
                       aes(x = dir.binned,
                           fill = spd.binned,
                           y = (..count..)/sum(..count..)
                       )) +
    geom_bar() + 
    scale_x_discrete(drop = FALSE,
                     labels = c("N","NNE","NE","ENE", "E", 
                                "ESE", "SE","SSE", 
                                "S","SSW", "SW","WSW", "W", 
                                "WNW","NW","NNW")) +
    coord_polar(start = -((dirres/2)/360) * 2*pi) +
    scale_fill_manual(name = "Wind Speed (m/s)", 
                      values = spd.colors,
                      drop = FALSE) +
    theme(axis.title.x = element_blank()) + #I can put in my own theme settings
    scale_y_continuous(labels = percent) + # these are lined up with North bar, showing percent of time in a bin
    ylab("Frequency")
  
  # adjust axes if required
  if (!is.na(countmax)){
    p.windrose <- p.windrose +
      ylim(c(0, countmax))
  }
  
  # print the plot
  print(p.windrose)  
  
  # return the handle to the wind rose
  return(p.windrose)
}

plot.windrose(data = df,
              spd = "wind_speed",
              dir = "wind_direction",
              spdmin = 0,
              spdmax = 20,
              spdres = 2,
              dirres = 22.5,
              palette = "YlGnBu")


plot.windrose <- function(data,
                          spd,
                          dir,
                          spdres = 2,
                          dirres = 22.5,
                          spdmin = 0,
                          spdmax = 18,
                          spdseq = NULL,
                          palette = "YlGnBu",
                          countmax = NA,
                          debug = 0){
  
  
  # Look to see what data was passed in to the function
  if (is.numeric(spd) & is.numeric(dir)){
    # assume that we've been given vectors of the speed and direction vectors
    data <- data.frame(spd = spd,
                       dir = dir)
    spd = "spd"
    dir = "dir"
  } else if (exists("data")){
    # Assume that we've been given a data frame, and the name of the speed 
    # and direction columns. This is the format we want for later use.    
  }  
  
  # Tidy up input data ----
  n.in <- NROW(data)
  dnu <- (is.na(data[[spd]]) | is.na(data[[dir]]))
  data[[spd]][dnu] <- NA
  data[[dir]][dnu] <- NA
  
  # figure out the wind speed bins ----
  if (missing(spdseq)){
    spdseq <- seq(spdmin, spdmax, spdres)
  } else {
    if (debug > 0){
      cat("Using custom speed bins \n")
    }
  }
  # get some information about the number of bins, etc.
  n.spd.seq <- length(spdseq)
  n.colors.in.range <- n.spd.seq - 1
  
  # create the color map
  spd.colors <- colorRampPalette(brewer.pal(min(max(3,
                                                    n.colors.in.range),
                                                min(9,
                                                    n.colors.in.range)),                                               
                                            palette))(n.colors.in.range)
  
  if (max(data[[spd]], na.rm = TRUE) > spdmax){    
    spd.breaks <- c(spdseq,
                    max(data[[spd]], na.rm = TRUE))
    spd.labels <- c(paste(c(spdseq[1:n.spd.seq-1]),
                          '-',
                          c(spdseq[2:n.spd.seq])),
                    paste(spdmax,
                          "-",
                          max(data[[spd]],na.rm = TRUE)))
    spd.colors <- c(spd.colors, "grey50")
  } else{
    spd.breaks <- spdseq - 0.001
    spd.labels <- paste(c(spdseq[1:n.spd.seq-1]),
                        '-',
                        c(spdseq[2:n.spd.seq]))    
  }
  data$spd.binned <- cut(x = data[[spd]],
                         breaks = spd.breaks,
                         labels = spd.labels,
                         ordered_result = TRUE)
  
  # figure out the wind direction bins
  dir.breaks <- c(-dirres/2,
                  seq(dirres/2, 360-dirres/2, by = dirres),
                  360+dirres/2)
  dir.labels <- c(paste(360-dirres/2,"-",dirres/2),
                  paste(seq(dirres/2, 360-3*dirres/2, by = dirres),
                        "-",
                        seq(3*dirres/2, 360-dirres/2, by = dirres)),
                  paste(360-dirres/2,"-",dirres/2))
  # assign each wind direction to a bin
  dir.binned <- cut(data[[dir]],
                    breaks = dir.breaks,
                    ordered_result = TRUE)
  levels(dir.binned) <- dir.labels
  data$dir.binned <- dir.binned
  
  # Run debug if required ----
  if (debug>0){    
    cat(dir.breaks,"\n")
    cat(dir.labels,"\n")
    cat(levels(dir.binned),"\n")
    
  }
  
  # create the plot ----
  p.windrose <- ggplot(data = data,
                       aes(x = dir.binned,
                           fill = spd.binned,
                           y = (..count..)/sum(..count..)
                       )) +
    geom_bar() + 
    scale_x_discrete(drop = FALSE,
                     labels = c("N","NNE","NE","ENE", "E", 
                                "ESE", "SE","SSE", 
                                "S","SSW", "SW","WSW", "W", 
                                "WNW","NW","NNW")) +
    coord_polar(start = -((dirres/2)/360) * 2*pi) +
    scale_fill_manual(name = "Wind Speed (m/s)", 
                      values = spd.colors,
                      drop = FALSE) +
    theme(axis.title.x = element_blank()) + #I can put in my own theme settings
    scale_y_continuous(labels = percent) + # these are lined up with North bar, showing percent of time in a bin
    ylab("Frequency")
  
  # adjust axes if required
  if (!is.na(countmax)){
    p.windrose <- p.windrose +
      ylim(c(0, countmax))
  }
  
  # print the plot
  print(p.windrose)  
  
  # return the handle to the wind rose
  return(p.windrose)
}

plot.windrose(data = df,
              spd = "wind_speed",
              dir = "wind_direction",
              spdmin = 0,
              spdmax = 20,
              spdres = 2,
              dirres = 22.5,
              palette = "YlGnBu")

Created on 2022-02-11 by the reprex package (v2.0.1)


Original answer

I suspect this is due to the number of colors in the "YlGnBu" RColorBrewer pallete (n=9). If you increase the size of the bins (i.e. spdres = 3) or use a different pallete with more colors (i.e. palette = "RdYlBl") do you see fewer NA's?

Do you have any speeds over 18m/s? I.e. if you specify spdmax = 30 and spdres = 4 do you see fewer NA's?

jared_mamrot
  • 22,354
  • 4
  • 21
  • 46
  • thanks for getting back. I tried the `RdYlBl` color palette, but it came back with this error message `Error in brewer.pal(min(max(3, n.colors.in.range), min(9, n.colors.in.range)), : RdYlBl is not a valid palette name for brewer.pal` so it seems to not like the new palette. The max speed is 18, so that range should cover everything. No difference when I try `spdmax = 30` and `spdres = 4` – carl_tom Feb 09 '22 at 15:53
  • Hmmm... Try `palette = "Set1"`, but if that doesn't fix it I'm not sure what else to try. Are you able to share your data (i.e. edit your question and add the output from `dput(SR_plaster_wind_inst1)`)? – jared_mamrot Feb 10 '22 at 03:28
  • Very cool color palette - but no dice :/ I included a link to the data on github, I hope that works https://stackoverflow.com/questions/17266780/wind-rose-with-ggplot-r/17266781#17266781 – carl_tom Feb 10 '22 at 20:37
  • Thanks @carl_tom, I have edited my answer with a potential solution. Let me know if this fails to solve your issue and I'll have another look at it – jared_mamrot Feb 10 '22 at 23:02
  • Thank you @jared_mamrot ! figure looks great, I really appreciate it. For my understanding, does that mean all values have a -0.001? But that would make values of 0 go to -0.999, so that's not it... – carl_tom Feb 13 '22 at 23:05
  • I didn't change the values (bad practice), I changed the bins: in the original code a value of 0 wasn't included in the "0-2" bin (it was shown as "NA"), and a value of 2 was included in the 0-2 bin (but not the 2-4 bin). After this change a value of 0 does fall into the "0-2" bin and a value of 2 falls into the "2-4" bin. This is kind of like the fencepost problem as there is some ambiguity in the 'naming' of the bins (e.g. instead of "0 - 2, 2 - 4, 4 - 6" it's clearer if you use "0 - 1.999, 2 - 3.999, 4 - 5.999" or "0 - <2, 2 - <4, 4 - <6"). Does that make sense? – jared_mamrot Feb 13 '22 at 23:22
  • Makes total sense! Thank you again! – carl_tom Feb 13 '22 at 23:34
  • You're welcome, if this answer has solved your problem please see [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers) and [accept the answer](https://meta.stackexchange.com/a/5235) – jared_mamrot Feb 13 '22 at 23:36