I am trying to plot date-wise multivariate data along with an independent variable on the top axes. To do so, I merged my multivariate (response variables) data frame with the single input (independent variable) into a single data frame. The resulting data frame now has several NA values in rows and columns (for both data sets).

My questions:

  • Why am I losing the width / dodge with my current code?
  • Does this have to do with the NA values in the factored variable in my data?
  • How do you work with NA values in a factored variable? Half my dataset is a completely different variable and needs only one column. The only reason I merged them was that I wanted to bring all the data onto the same plot (the plan was to use grob after this, but I got stuck here).

Before this, I used the same code to plot geom_bar with a data frame containing only the response variables, and it worked.

Previously plotted geom_bar (this is how I expect it to look)

Geom_bar plot with the merged dataframe and same code

The data frame is named final. The factor variable is TYPE, which has Open, Shrub and Lowland as categories, and NA for the dates that only have the independent variable (in this case Rain).
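For reference, here is a minimal sketch of how the NA rows in TYPE can be counted and separated from the rows that carry response data. It uses a hypothetical five-row miniature of final; the name mini and its rounded values are illustrative assumptions, not the real data.

```r
# Hypothetical miniature of `final`: TYPE is NA on rain-only dates.
mini <- data.frame(
  Date = as.Date(c("2016-09-11", "2016-09-19", "2016-09-19",
                   "2016-09-19", "2016-09-24")),
  TYPE = factor(c(NA, "Open", "Shrub", "Lowland", NA),
                levels = c("Open", "Shrub", "Lowland")),
  MeanTWC = c(NA, 82.4, 75.3, 79.8, NA),
  Rain = c(0.2, NA, NA, NA, 1.2)
)

# Count rows per TYPE, including the NA group.
type_counts <- table(mini$TYPE, useNA = "ifany")
print(type_counts)

# Keep only the rows that carry response data (non-NA TYPE).
response_rows <- mini[!is.na(mini$TYPE), ]
print(nrow(response_rows))
```

On the real data, the same two calls applied to final would show how many dates have only Rain and how many rows remain for the bars.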

 library(ggplot2)
 library(scales)  # for date_format()

 final$TYPE <- factor(final$TYPE, levels = c("Open", "Shrub", "Lowland"))

 limits <- aes(ymax = final$Max, ymin = final$Min, ysd= final$SD)
 rhg_cols <- c("brown","forestgreen", "cyan4")


 p <- ggplot(final, aes(Date, MeanTWC, fill=TYPE), na.rm=F )+
   geom_bar(stat="identity", position = "dodge")+
   scale_fill_manual(values = rhg_cols)+
   scale_x_date(breaks = seq(as.Date("2016-08-15"), as.Date("2017-10-15"), by="30 days"),labels=date_format("%b-%Y")) 


 p<-p + labs(x="DATE", y ="Total Water in mm")


 p<-p + geom_bar(stat = "Identity",
                 position = "dodge")+
   geom_errorbar(limits, position = "dodge", size =0.2)+ 
   ggtitle("Total Water Storage-60cm")+
   scale_y_continuous(limits = c(0,100))

 p <- p + theme_bw() +
   theme(axis.text.x = element_text(angle = 270, vjust = 1, size = 15),
         axis.text.y = element_text(vjust = 1, hjust = 1, size = 20),
         panel.grid.major.x = element_blank(),
         panel.grid.minor.x = element_line(linetype = "longdash"),
         panel.grid.major.y = element_line(linetype = "longdash"))

 print(p)

Sample Data:

          Date    TYPE   MeanTWC       Max       Min   Rain
1   2016-08-13    <NA>        NA        NA        NA 27.686
2   2016-08-14    <NA>        NA        NA        NA 79.248
3   2016-08-15    <NA>        NA        NA        NA  9.398
4   2016-08-16    <NA>        NA        NA        NA  9.906
5   2016-08-17    <NA>        NA        NA        NA 26.670
6   2016-08-21    <NA>        NA        NA        NA 52.324
7   2016-08-27    <NA>        NA        NA        NA 13.200
8   2016-08-28    <NA>        NA        NA        NA  0.200
9   2016-08-29    <NA>        NA        NA        NA  3.000
10  2016-08-30    <NA>        NA        NA        NA  0.400
11  2016-09-02    <NA>        NA        NA        NA  5.400
12  2016-09-04    <NA>        NA        NA        NA 22.200
13  2016-09-05    <NA>        NA        NA        NA  0.400
14  2016-09-06    <NA>        NA        NA        NA  0.400
15  2016-09-11    <NA>        NA        NA        NA  0.200
16  2016-09-19    Open  82.40583  94.13074  71.95022     NA
17  2016-09-19   Shrub  75.25720  81.09062  66.31633     NA
18  2016-09-19 Lowland  79.78265  91.46637  71.42791     NA
19  2016-09-24    <NA>        NA        NA        NA  1.200
20  2016-09-28    Open 107.00762 128.82301  87.78908     NA
21  2016-09-28   Shrub 102.29717 114.59530  93.02085     NA
22  2016-09-28 Lowland 100.62097 108.65464  93.06479     NA
23  2016-10-04    Open  94.35146 119.11809  80.80844     NA
24  2016-10-04   Shrub  89.78960 106.59891  77.91514     NA
25  2016-10-04 Lowland  87.66499  98.93036  77.44905     NA
26  2016-10-07    <NA>        NA        NA        NA 15.200
27  2016-10-24    Open  77.75282  90.99799  60.89542     NA
28  2016-10-24   Shrub  73.13549  84.68082  64.38086     NA
29  2016-10-24 Lowland  77.54505  89.20983  68.77503     NA
30  2016-11-04    Open  75.79262  84.63392  61.17391     NA

The same rows as dput output:

 structure(list(Date = structure(c(17026, 17027, 17028, 17029, 
    17030, 17034, 17040, 17041, 17042, 17043, 17046, 17048, 17049, 
    17050, 17055, 17063, 17063, 17063, 17068, 17072, 17072, 17072, 
    17078, 17078, 17078, 17081, 17098, 17098, 17098, 17109), class = "Date"), 
        TYPE = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, 1L, 2L, 3L, NA, 1L, 2L, 3L, 1L, 2L, 3L, 
        NA, 1L, 2L, 3L, 1L), .Label = c("Open", "Shrub", "Lowland"
        ), class = "factor"), MeanTWC = c(NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, 82.4058263935714, 75.2571964744444, 
        79.782649985, NA, 107.0076241875, 102.297170442857, 100.620970785, 
        94.3514631776471, 89.7895999577778, 87.664985085, NA, 77.75281636125, 
        73.135492118, 77.54505326, 75.792624628125), Max = c(NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 94.13073642, 
        81.09062269, 91.46637475, NA, 128.8230145, 114.5952995, 108.6546353, 
        119.1180866, 106.5989092, 98.93036216, NA, 90.99798892, 84.68081807, 
        89.20983383, 84.63391564), Min = c(NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, 71.95021894, 66.31632641, 
        71.42791015, NA, 87.78907749, 93.02084587, 93.06478569, 80.8084363, 
        77.91514274, 77.44904985, NA, 60.89542395, 64.38086067, 68.77503196, 
        61.17390712), SD = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, 6.52668534466645, 5.31370742998586, 
        8.40565594980702, NA, 10.3287869191442, 8.45785409063748, 
        6.49446280465913, 9.73718805734734, 10.5575933779477, 9.35169762923353, 
        NA, 8.27219492616507, 6.75450870627616, 8.51146778459709, 
        6.75447037137946), N = c(NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, 14, 9, 4, NA, 12, 7, 4, 17, 9, 
        4, NA, 16, 10, 4, 16), SE = c(NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, 1.74433003078718, 1.77123580999529, 
        4.20282797490351, NA, 2.9816639540851, 3.19676836415673, 
        3.24723140232956, 2.36161499158505, 3.51919779264923, 4.67584881461676, 
        NA, 2.06804873154127, 2.13596319872699, 4.25573389229854, 
        1.68861759284486), Rain = c(27.686, 79.248, 9.398, 9.906, 
        26.67, 52.324, 13.2, 0.2, 3, 0.4, 5.4, 22.2, 0.4, 0.4, 0.2, 
        NA, NA, NA, 1.2, NA, NA, NA, NA, NA, NA, 15.2, NA, NA, NA, 
        NA)), .Names = c("Date", "TYPE", "MeanTWC", "Max", "Min", 
    "SD", "N", "SE", "Rain"), row.names = c(NA, 30L), class = "data.frame")
Shishir
  • Good first question :) but please make sure you include your data within the example. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Michael Harper Apr 01 '18 at 20:20
  • Thanks for the comment! I included sample data. I hope that makes my question clearer. – Shishir Apr 01 '18 at 22:49
  • @Shishir it would be easier to use the data as you would get it from `dput(head(final, 30))`. Thus, one only had to copy it. – loki Apr 01 '18 at 22:50
  • @loki, thanks! Included the edit - both formats are included in the question now! – Shishir Apr 01 '18 at 23:01
  • Wait...what is your question? Removing NAs in plot? You only explain code results. – Parfait Apr 01 '18 at 23:50
  • @Parfait Edited it to include the specific question(s). Mainly: does having NA values in a factored variable cause problems with ggplot / geom_bar? – Shishir Apr 02 '18 at 00:48
  • @Shishir: your sample data mainly contain `NA`. You should run `dput(final)`, paste the output to https://pastebin.com/ then add the link to your post – Tung Apr 02 '18 at 02:48
  • Actually, the whole dataset also has NA for most values. That is because the other variables were recorded periodically on a few fixed dates, while the rain data is a continuous dataset that only uses one column (the rest are NA). – Shishir Apr 02 '18 at 04:14
  • What is the purpose of including the independent variable `Rain` in the data frame? Are you planning to include it in the plot as a separate `geom`, or just to incorporate those date ranges? – Djork Apr 02 '18 at 10:06
  • Yes, the purpose is to include both on the plot – Shishir Apr 02 '18 at 12:19