I have created a plot of the missing-value count for each variable using the following code:
library(shiny)
library(ggplot2)

ui <- fluidPage(plotOutput("missing"))

server <- function(input, output, session) {
  # readData() is assumed to be a reactive defined elsewhere that returns the data frame
  data <- reactive({
    # Count the NA values in each column and sort ascending
    var.missing <- sapply(readData(), function(x) sum(is.na(x)))
    var.missing <- var.missing[order(var.missing)]
    # The last expression is the value the reactive returns
    data.frame(variable = names(var.missing), missing = var.missing, stringsAsFactors = FALSE)
  })
  output$missing <- renderPlot({
    # Set the factor levels explicitly so the bars keep the sorted order
    ggplot(data = data(), aes(x = factor(variable, levels = variable), y = missing)) +
      geom_bar(stat = "identity") +
      labs(x = "Variables", y = "Number of Missing Values") +
      theme(axis.text.x = element_text(angle = 45, hjust = 1))
  })
}
I also need to create a distributed bar plot of total values and missing values, but I have not been able to get it working. Can you help me see what I am missing in the line of code below?
var.missing<- sapply(readData(),function(x)(sum!(is.na(x))-sum(is.na(x))))
Test Data
str(airquality)
Output
'data.frame': 153 obs. of 6 variables:
$ Ozone : int 41 36 12 18 NA 28 23 19 8 NA ...
$ Solar.R: int 190 118 149 313 NA NA 299 99 19 194 ...
$ Wind : num 7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
$ Temp : int 67 72 74 62 56 66 65 59 61 69 ...
$ Month : int 5 5 5 5 5 5 5 5 5 5 ...
$ Day : int 1 2 3 4 5 6 7 8 9 10 ...
> head(airquality)
Output
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
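For reference, here is a minimal sketch of the bar plot of total and missing values requested above, run directly on the `airquality` test data rather than inside the Shiny app. Two observations about the problem line motivate it: `sum!(is.na(x))` is not valid R, since the negation belongs inside the call as `sum(!is.na(x))`, and subtracting the two sums would leave only one number per variable, whereas a plot with two kinds of bars needs both counts kept separately. The helper name `count_values` and the `status` and `count` columns are assumptions made for illustration and are not part of the original code. Stacking the bars is also an assumption about what "distributed" means here.

```r
library(ggplot2)

# Hypothetical helper: count missing and non-missing values per column
# and return them in long format (one row per variable/status pair).
count_values <- function(df) {
  missing <- sapply(df, function(x) sum(is.na(x)))
  present <- sapply(df, function(x) sum(!is.na(x)))
  data.frame(
    variable = rep(names(df), times = 2),
    status   = rep(c("Missing", "Present"), each = ncol(df)),
    count    = unname(c(missing, present)),
    stringsAsFactors = FALSE
  )
}

counts <- count_values(airquality)

# geom_col() stacks the bars by default, so each bar's full height is the
# total number of rows; position = "dodge" would draw them side by side.
p <- ggplot(counts, aes(x = variable, y = count, fill = status)) +
  geom_col() +
  labs(x = "Variables", y = "Number of Values", fill = NULL) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(p)
```

Inside the app, the same idea would presumably apply by calling `count_values(readData())` in the reactive and building the plot in `renderPlot`.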
Thanks,