I have an iteration/scaling problem.
I have a data frame, geocoded, with information about 12 local areas (the LA column).
I can subset this data and write each subset to its own file:
## read data in from geocoded file
geocoded1<-read.csv("S:/somestuff/geocoded 2015 - 2018.csv",na.strings=c(""," ","N/A"))
geocoded<-subset(geocoded1,geocoded1$CONFIDENCE !="Discarded")
#split geocoded data by LA
x <-split(geocoded,list(geocoded$LA),drop = TRUE,sep = "_")
#Split geocoded data by LA and Final
#split(x, f, drop = FALSE, sep = ".", lex.order = FALSE, ...)
y<-split(geocoded,list(geocoded$LA,geocoded$DISEASE), drop = TRUE, sep = "_")
#write CSV files of geocoded to file locations
# invisible() suppresses the list of NULLs that lapply would otherwise print
invisible(lapply(names(x), function(name) write.csv(x[[name]], file = paste0("S:/some stuff/LA/", name, ".csv"))))
invisible(lapply(names(y), function(name) write.csv(y[[name]], file = paste0("S:/some stuff/LAFinal/", name, ".csv"))))
I can also write the subsets to the global environment (do I need to?):
#write the results of subsetting data into x and y to the global environment
list2env(x,envir = .GlobalEnv)
list2env(y,envir = .GlobalEnv)
And I can plot any one of these data frames as a stacked bar chart in a facet wrap:
# Stacked Bar Plot with Colors and Legend
library(ggplot2)
bm <- ggplot(data = DATA, aes(x = MONTH, fill = FILL)) + geom_bar()
bm + facet_wrap(~YEAR, ncol = 5)
And I could go through them manually (no, I can't, there are more than 100 in there!).
How can I plot the contents of x or y in the same way I wrote them to files? I used lapply there. Is there a way for lapply or similar to say either "for all the names in x, plot stacked bar charts faceted by year" or "for all the data frames in the global environment, plot stacked bar charts faceted by year"?
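For illustration, here is a minimal sketch of what "lapply over the names, but plotting instead of writing" might look like. The toy data, the column names (LA, YEAR, MONTH, GENDER) and the helper plot_by_year are all assumptions made up for the example, not taken from the real geocoded file.

```r
library(ggplot2)

# Hypothetical toy data standing in for `geocoded`; column names are assumed.
set.seed(1)
toy <- data.frame(
  LA     = sample(c("A", "B", "C"), 300, replace = TRUE),
  YEAR   = sample(2015:2018, 300, replace = TRUE),
  MONTH  = sample(1:12, 300, replace = TRUE),
  GENDER = sample(c("F", "M"), 300, replace = TRUE)
)

# Same kind of split as in the question: a named list, one data frame per LA.
x <- split(toy, toy$LA, drop = TRUE)

# Hypothetical helper: stacked bar chart of one data frame, faceted by year.
plot_by_year <- function(df, title) {
  ggplot(df, aes(x = factor(MONTH), fill = GENDER)) +
    geom_bar() +
    facet_wrap(~YEAR, ncol = 4) +
    labs(title = title, x = "Month", y = "Cases")
}

# Iterate over the names, as with write.csv, and keep the plots in a named list.
plots <- lapply(names(x), function(name) plot_by_year(x[[name]], name))
names(plots) <- names(x)
```

Any single chart can then be shown with print(plots[["A"]]). Because the list is indexed by name, nothing has to be copied into the global environment for this to work.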
I was planning on plotting a stacked bar chart for each data frame, faceted by year (there are 4 years), with the same y scale in every facet, the month of the year on the x axis, and the fill based on another column (gender, for example). I'd also like to standardise the appearance of the plots so that they all have transparent backgrounds.
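As a rough sketch of the appearance described above (not a definitive recipe): one reusable theme object with transparent backgrounds, added to a plot whose facets share a y scale, which is the facet_wrap default. The toy data, the column names, the name theme_transparent and the output path are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical shared theme: transparent panel, plot and legend backgrounds.
theme_transparent <- theme_minimal() +
  theme(
    panel.background  = element_rect(fill = "transparent", colour = NA),
    plot.background   = element_rect(fill = "transparent", colour = NA),
    legend.background = element_rect(fill = "transparent", colour = NA)
  )

# Toy data; the column names are assumed, not taken from the real file.
set.seed(2)
toy <- data.frame(
  YEAR   = sample(2015:2018, 200, replace = TRUE),
  MONTH  = sample(1:12, 200, replace = TRUE),
  GENDER = sample(c("F", "M"), 200, replace = TRUE)
)

p <- ggplot(toy, aes(x = factor(MONTH), fill = GENDER)) +
  geom_bar() +
  facet_wrap(~YEAR, nrow = 1) +  # scales = "fixed" (the default) keeps one y scale
  labs(x = "Month", y = "Cases") +
  theme_transparent

# Write to a temporary file; bg = "transparent" keeps the PNG background clear.
out_file <- file.path(tempdir(), "example_transparent.png")
ggsave(out_file, plot = p, width = 10, height = 3, bg = "transparent")
```

Keeping the theme in a single object means every chart gets the same look by adding that one object to it.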
Thanks in advance.
Edit:
# Stacked Bar Plot with Colors and Legend
bm <- ggplot(data = LADISEASE1, aes(x = MONTH, fill = FILL)) + geom_bar()
bm + facet_wrap(~YEAR, ncol = 5)
When I split by LA and DISEASE I generate 20 disease dataframes per LA (as long as they're not null, so ~200).
Edit again: Using the data from the comment
DISEASE = c("Marco Polio","Marco Polio","Marco Polio","Marco Polio","Marco Polio",
"Mumps","Mumps","Mumps","Mumps","Mumps",
"Chicky Pox","Chicky Pox","Chicky Pox","Chicky Pox","Chicky Pox")
YEAR = c(2011, 2012, 2013, 2014, 2015,
2011, 2012, 2013, 2014, 2015,
2011, 2012, 2013, 2014, 2015)
MONTH =c(1,2,3,4,5,6,7,8,9,10,11,12,1,12)
LA = c("A","B","C")
VALUE = c(82,89,79,51,51,
79,91,69,89,78,
71,69,95,61,87)
What I can do to the single data frame is this:
#split geocoded data by LA
LA <-split(geocoded,list(geocoded$LA),drop = TRUE,sep = "_")
str(LA)
Which splits the large data frame into the 12 areas + missing.
I guess the problem I'm trying to solve (and failing to describe) is how to create a panel of 20 timeline charts for each LA for each year. For example: area A, infections 1-20, for each year between 2015 and 2018.
Do I facet the charts on year and infection, or slice the data frame first and then facet the chart?
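To make the first option concrete, here is a small sketch of faceting on both infection and year for a single area, using facet_grid so that each row is a disease and each column is a year. The toy data and its column names and values are invented for the example.

```r
library(ggplot2)

# Hypothetical toy data; LA, DISEASE, YEAR and MONTH are assumed column names.
set.seed(3)
toy <- data.frame(
  LA      = sample(c("A", "B"), 400, replace = TRUE),
  DISEASE = sample(c("Mumps", "Measles", "Rubella"), 400, replace = TRUE),
  YEAR    = sample(2015:2018, 400, replace = TRUE),
  MONTH   = sample(1:12, 400, replace = TRUE)
)

# Filter to one area; no need to split the whole data frame first.
one_la <- subset(toy, LA == "A")

# One row of panels per disease, one column per year.
p <- ggplot(one_la, aes(x = factor(MONTH))) +
  geom_bar() +
  facet_grid(DISEASE ~ YEAR) +
  labs(title = "LA A", x = "Month", y = "Cases")
```

With 20 diseases the grid would have 20 rows, so the saved image would probably need to be tall to stay readable.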
The example shown is great! And it made me think I should do that too, so that a person could quickly see the number of cases per year.
It's so easy to slice a data frame and make new ones that I got carried away a bit. All I need to do is work on the one data frame, but output the charts as graphics I can paste or write into a document.
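One possible shape for that, sketched under assumptions: keep the single data frame, loop over the distinct LA values, and save one PNG per area with ggsave. The toy data, column names, output directory and file naming are all hypothetical.

```r
library(ggplot2)

# Hypothetical toy data standing in for the single `geocoded` data frame.
set.seed(4)
toy <- data.frame(
  LA      = sample(c("A", "B", "C"), 300, replace = TRUE),
  DISEASE = sample(c("Mumps", "Measles"), 300, replace = TRUE),
  YEAR    = sample(2015:2018, 300, replace = TRUE),
  MONTH   = sample(1:12, 300, replace = TRUE)
)

# Assumed output location; a temporary directory is used here.
out_dir <- file.path(tempdir(), "la_charts")
dir.create(out_dir, showWarnings = FALSE)

# One chart per area, filtered from the single data frame on the fly.
files <- vapply(sort(unique(toy$LA)), function(area) {
  p <- ggplot(subset(toy, LA == area), aes(x = factor(MONTH), fill = DISEASE)) +
    geom_bar() +
    facet_wrap(~YEAR, nrow = 1) +
    labs(title = paste("LA", area), x = "Month", y = "Cases")
  out_file <- file.path(out_dir, paste0(area, ".png"))
  ggsave(out_file, plot = p, width = 10, height = 3)
  out_file
}, character(1))
```

The returned vector files holds the path of each saved image, named by area, which could then be inserted into a document.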