I want to create a bar plot with the bars in descending order. In the plot below, because NA is present in the 2nd position of the "a1" vector, its bar is pushed to the end when the plot is created. However, I want the NA bar to stay in the 2nd position. Please help, as I want to achieve this without modifying my data.

library(ggplot2)
library(plotly)

a1 = c("A",NA,"B","C","D","F")
b1 = c(165,154,134,110,94,78)
a12 = data.frame(a1,b1,stringsAsFactors = FALSE)
pp1 <- ggplot(a12, aes(x = reorder(a1, -b1), y = b1,
    text = paste("User: <br>", a1, "<br> Days: <br>", round(b1)))) +
    geom_bar(stat = "identity", fill = "#3399ff") +
    scale_y_continuous(name = "Time") +
    scale_x_discrete(name = "Employee")
ggplotly(pp1, tooltip = "text", height = 392)

bar plot

www
Robert J
  • This looks very similar to the question from a few days back... https://stackoverflow.com/questions/47950817/issue-with-na-in-arranging-the-bars-in-a-bar-plot-in-r-and-ggplot2 – Z.Lin Dec 26 '17 at 10:26
  • @Z.Lin, The data frame there does not have null, so stringsasfactors works, in this case, the issue is NA is present so the suggested solution fails, kindly help me here. – Robert J Dec 26 '17 at 10:27
  • @Z.Lin, I want to display the bar with NA value too. – Robert J Dec 26 '17 at 10:43
  • @Z.Lin, please suggest, as I don't know a possible fix. – Robert J Dec 26 '17 at 11:27
  • I don't understand your question. If NAs are present, why does your chart say null instead of NA? And does your problem concern `plotly` / `ggplotly`, or is this only relating to the `ggplot` stage? – Z.Lin Dec 26 '17 at 11:43
  • @Z.Lin, the issue is that NA is being read as null by the ggplot, now to get rid of that, purposefully you can hard code it there, but I don't want that, I need the script to understand the NA and display it in the plot, please suggest. – Robert J Dec 26 '17 at 11:49
  • a12[2,1] = "NA", this line can solve the problem, but data modification is a bad practice, please help me to make the script read it without hard code. – Robert J Dec 26 '17 at 11:51
  • @Prem, Thanks for replying Sir, see I don't want to hard code a value in the data, I want to get the script to read NA value as a variable itself. Putting it simply, NA should be read as a variable only without hard coding the data, such that the plot appears in descending order. Thanks and kindly suggest. – Robert J Dec 26 '17 at 12:08
  • @RobertJ, have you seen this [post](https://stackoverflow.com/questions/38719788/how-to-show-null-data-in-barplot-r)? Is this what you are looking for? – mnm Dec 26 '17 at 12:57
  • @Ashish, thanks for your reply. Again, I see the author hard-codes the value there. When we work on enterprise solutions, hard-coding "NA" to make the task easier is considered bad practice. I want NA to be treated as a normal value and not as an exception, please suggest. Thanks – Robert J Dec 26 '17 at 13:04
  • @RobertJ, in that case I suggest, try converting `a1` to factor. For instance, in this code, `a12 = data.frame(a1,b1,stringsAsFactors = FALSE)`, when I check the levels for `a1`, like `levels(a12$a1)` , I see `NULL`. But when I convert, `a1` to factor, `a12$a1 <- factor(a12$a1, exclude = NULL)`, I can see the levels, `> levels(a12$a1) [1] "A" "B" "C" "D" "F" NA`. This way, I'm not hard coding anything in the data. Also the corresponding plot will show the NA as a variable. – mnm Dec 26 '17 at 13:25
  • @Ashish, I appreciate your efforts here, however, the requirement is to arrange the plots in descending order, NA should come in the 2nd spot where it originally comes in the vector "a1". I still see the NA bar at the end, kindly suggest. – Robert J Dec 26 '17 at 13:32

1 Answer

In the comments, you argued that hard-coding a value to replace NA is bad practice. I would say that hard-coding by index position is a bad idea, but automatically replacing NA with a character string, such as "null", is a good idea.

In the following example, the only thing I added is a12$a1[is.na(a1)] <- "null". This line detects where the NA values are in a12$a1 and replaces them with "null". The reordering based on the numbers in b1 happens later, so this approach does not require you to know the index of the NA beforehand.

library(ggplot2)
library(plotly)

a1 = c("A",NA,"B","C","D","F")
b1 = c(165,154,134,110,94,78)
a12 = data.frame(a1,b1,stringsAsFactors = FALSE)

# Replace the NA to be "null"
a12$a1[is.na(a1)] <- "null"

pp1 <- ggplot(a12, aes(x = reorder(a1, -b1), y = b1,
  text = paste("User: <br>", a1, "<br> Days: <br>", round(b1)))) +
  geom_bar(stat = "identity", fill = "#3399ff" ) + 
  scale_y_continuous(name ="Time") + 
  scale_x_discrete(name ="Employee") 
ggplotly(pp1, tooltip="text",height = 392)
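
To illustrate why the NA bar ends up last and why the replacement fixes the order, here is a minimal base-R sketch. It only inspects the factor levels that `reorder()` produces and draws no plot. The name `plot_df` is a hypothetical copy introduced for illustration (it is not in the code above), used on the assumption that you want the original `a12` left untouched.

```r
a1 <- c("A", NA, "B", "C", "D", "F")
b1 <- c(165, 154, 134, 110, 94, 78)
a12 <- data.frame(a1, b1, stringsAsFactors = FALSE)

# With NA present, reorder() never creates a level for it,
# so the NA bar has no position in the descending order.
levels_with_na <- levels(reorder(a12$a1, -a12$b1))
print(levels_with_na)

# Hypothetical copy used only for plotting; a12 itself is not changed.
plot_df <- a12
plot_df$a1[is.na(plot_df$a1)] <- "null"

# Now "null" is an ordinary value and is ranked by its b1 like any other.
levels_relabelled <- levels(reorder(plot_df$a1, -plot_df$b1))
print(levels_relabelled)

# The original data frame still holds NA in row 2.
print(is.na(a12$a1[2]))
```

Passing `plot_df` instead of `a12` to `ggplot()` should give the same chart as the code above while keeping the source data as delivered.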


www
  • If I run your script up to the sixth line, I again see NA being hard-coded into the data. I also tried this approach and got the same result. That is exactly why I am asking this question: I cannot hard-code the value. When we deliver solutions for a stakeholder, we do not have the right to change the data anywhere, and even assigning "NA" to an NA is not good practice. So please suggest an approach, without hard-coding, that lets NA be treated as a normal value. – Robert J Dec 26 '17 at 16:13
  • Like I stated in my post: hard-coding by index position is a bad idea, but automatically replacing NA with a character string is not. I fail to see why you insist you cannot do that. `NA` as a factor level is difficult to work with because it goes last when the factor is ordered. Replacing `NA` with a character string is a logical choice. You can keep this post open to see if others can help, but I will not change my post for now. – www Dec 26 '17 at 16:37
  • ok let me give you a scenario, say I purposefully assign the value NA, and say the stakeholder changed his/her data in between and then conducts validation of my solution, that would become contradicting. Hence, I need a proper fix. – Robert J Dec 26 '17 at 16:40
  • When stakeholders change their data, and the changes involve rows that contain `NA`, it does not matter, because `a12$a1[is.na(a1)] <- "null"` only changes the data frame to be plotted, not their data. – www Dec 26 '17 at 16:43
  • Thanks, I looked at this and tried your approach. We can basically pass this within ggplot using the replace_na command from the tidyverse; that way the plot gets created and no modification happens to the data. I have a similar requirement here, please check https://stackoverflow.com/questions/47987238/overcoming-the-issue-of-na-values-without-hard-code-to-create-plotly-charts – Robert J Dec 27 '17 at 07:14
  • I looked at the question you asked here: https://stackoverflow.com/questions/47950817/issue-with-na-in-arranging-the-bars-in-a-bar-plot-in-r-and-ggplot2/47983029#47983029 Basically I think markus and I have the same idea. `replace_na` is just another way to replace the `NA` value. If you accepted markus's answer, I see no reason why you cannot accept mine. – www Dec 27 '17 at 10:21
  • I also looked at the questions you asked. Lots of them are similar questions. Basically, you are asking the similar questions over and over without doing your homework. I will not attempt to do your work in the future. – www Dec 27 '17 at 10:24
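
The idea raised in the comments, relabelling NA inside the plotting call so that no assignment to the data happens at all, could look roughly like the sketch below. `na_label()` is a hypothetical base-R helper written for this illustration, standing in for the `replace_na` call mentioned above. The ggplot2 part is guarded so that the ordering check still runs where ggplot2 is not installed.

```r
a1 <- c("A", NA, "B", "C", "D", "F")
b1 <- c(165, 154, 134, 110, 94, 78)
a12 <- data.frame(a1, b1, stringsAsFactors = FALSE)

# Hypothetical helper: returns a relabelled copy of x, leaving x itself alone.
na_label <- function(x, label = "null") {
  ifelse(is.na(x), label, x)
}

# The relabelling happens on the fly, so a12 is never assigned to.
bar_order <- levels(reorder(na_label(a12$a1), -a12$b1))
print(bar_order)

# Same expression used directly inside aes(), if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(
    a12,
    ggplot2::aes(x = reorder(na_label(a1), -b1), y = b1)
  ) +
    ggplot2::geom_bar(stat = "identity", fill = "#3399ff") +
    ggplot2::scale_y_continuous(name = "Time") +
    ggplot2::scale_x_discrete(name = "Employee")
}
```

Because the label is computed each time the plot is built, a later change in the stakeholder's data is picked up automatically on the next run.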