My objective is to take a subset of a data log that recorded airport station codes over time.
I am trying to build a frequency table of the number of times each station code was entered, and then plot it as a stacked bar chart using ggplot2's 'fill' option. Additionally, I would like to divide these bases into 4 groups of equal size.
The subset of the data looks like this:
OPSLOG2016$Base <- c("yyc", "yyc", "ylw", "yvr", "lax", "hnl", "yvr", "yow", "yyz","yyz", "lga", "yyz", "yyz", "YYZ", "yow", "YYC", "YYZ", NA, "hux","yvr", ... <truncated>
Frequencies for some of the bases:
#List of 49
$ bos: num 134
$ cun: num 205
$ fll: num 114
$ hnl: num 95
$ las: num 288
$ lax: num 218
$ lga: num 456
$ lgw: num 169
$ mbj: num 71
$ mco: num 223
$ ogg: num 99
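To make the target concrete, here is a minimal base R sketch of the table I think I need. It makes a few assumptions: `base` is a hypothetical stand-in holding only the first 20 values shown above (the real vector is truncated), upper- and lower-case codes are treated as the same station, and "4 even groups" is read as splitting the frequency ranking into 4 bins of near-equal size. I have not confirmed this is the right approach for the full log.

```r
# Hypothetical sample standing in for OPSLOG2016$Base (first 20 values only)
base <- c("yyc", "yyc", "ylw", "yvr", "lax", "hnl", "yvr", "yow", "yyz",
          "yyz", "lga", "yyz", "yyz", "YYZ", "yow", "YYC", "YYZ", NA, "hux", "yvr")

# Lower-case the codes so "YYZ" and "yyz" are counted together;
# table() drops NA values by default
freq <- as.data.frame(table(Base = tolower(base)),
                      responseName = "Frequency",
                      stringsAsFactors = FALSE)

# Sort from most to least frequent
freq <- freq[order(-freq$Frequency), ]

# Assumption: split the ranking into 4 bins of near-equal size
freq$Group <- cut(seq_len(nrow(freq)), breaks = 4, labels = paste0("Q", 1:4))

print(freq)
```

The result is a data frame with one row per base, which, as far as I understand, is the shape `ggplot()` expects rather than the list that `lapply()` returns.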
My code up to this point:
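library(tm)       # Corpus, VectorSource, tm_map, DocumentTermMatrix, removeSparseTerms
library(ggplot2)  # qplot, ggplot
corpus = Corpus(VectorSource(OPSLOG2016$Base))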
corpus = tm_map(corpus, PlainTextDocument)
basefreq = DocumentTermMatrix(corpus)
sparseBase = removeSparseTerms(basefreq, 0.999)
dfBase = as.data.frame(as.matrix(sparseBase))
qplot(dfBase, y = scale(dfBase, center = TRUE, scale = frequency()))
**#Error: ggplot2 doesn't know how to deal with data of class list Error during wrapup: cannot open the connection**
dfVecSum = lapply(dfBase, sum)
plot(dfVecSum)
**#Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' is a list, but does not have components 'x' and 'y' Error during wrapup: cannot open the connection**
ggplot(dfVecSum, aes(x = dfVecSum, y = Frequency, fill = fill)) +
geom_bar(position = "fill")
**#Error: ggplot2 doesn't know how to deal with data of class list Error during wrapup: cannot open the connection**
It's likely obvious that I am new to this and am making many mistakes, but I'm hoping to be pointed in the right direction, as I can't seem to get any of this to work on my own.