I am working in RStudio and I have a number of scripts, one for each district that our network is in.

Every time I make an update to script1, I have to make the same update to script2, all the way to script24.

The only differences between these scripts are:

  1. working directory
  2. .csv file that is read into data frame
  3. the padding around the bbox, i.e. f value

Here is the actual code of one of them:

library(ggmap)
library(ggplot2)

setwd("d:/GIS/different_directory")
sep <- read.csv("district_number_SEP_assets_csv.csv")
Sub1 <- sep[grep("SEP.12", names(sep))]
sep$newCol <- 100*rowSums(Sub1)/rowSums(sep[4:7])


# create a new grouping variable
Percent_SEP12_Assets <- ifelse(sep[,8] <= 33, "Lower Third", ifelse(sep[,8] >= 66, "Upper Third", "Middle Third"))
Percent_SEP12_Assets <- factor(Percent_SEP12_Assets,
                               levels = c("Upper Third", "Middle Third", "Lower Third"))


# get the map
bbox <- make_bbox(sep$Longitude, sep$Latitude, f = varies from scripts)
map <- get_map(bbox)


# plot the map and use the grouping variable for the fill inside the aes
ggmap(map) +
  geom_point(data=sep, aes(x = Longitude, y = Latitude, color=Percent_SEP12_Assets ), size=5, alpha=0.6) +
  scale_color_manual(values=c("green","orange","red"))

There must be a more streamlined way to do this.

More Info

I determine f based on whether data points are cut off or not, and keep f at the lowest possible value.

Changes in script1 have no effect on script2, etc. Scripts are copies of each other for each district such that if I change script1, I must change script2.

District number is hard-coded into the file name, and hard-coded into the R script.

Rhonda
  • Are all of your scripts always stored in the same directory (together)? – nrussell Jul 07 '15 at 18:08
  • @nrussell No, they are stored in separated directories, but I can change as needed ...... – Rhonda Jul 07 '15 at 18:09
  • Okay, and are these scripts typically run non-interactively (e.g. using `Rscript `) or interactively (e.g. in RStudio)? – nrussell Jul 07 '15 at 18:17
  • @nrussell They run interactively, i.e. I select all code and click `Run` – Rhonda Jul 07 '15 at 18:24
  • How exactly is `f`, the bbox, determined? In `read.csv("district_number_SEP_assets_csv.csv")`, is `district_number` dynamic? Why do changes in script1 have to affect those in script2; does script2 depend on script1's output, and script4 on script3's and so on? Or are all scripts just copies of each other (each working one district) such that when you change script1 you need to update all the other 23 for uniformity? – shekeine Jul 07 '15 at 19:16
  • @Shekeine I determine`f` based on whether data points are cut off or not. I keep f lowest number possible. Changes in script1 have no effect on script2, etc. District number is hard-coded into the file name, and hard-coded into the R script. Scripts are copies of each other for each district such that if I change script1, I must change script2 – Rhonda Jul 07 '15 at 19:20
  • Still on `f`, using the sample code above, where will `f` be derived from? here you just wrote `f = varies from scripts`. Where in the code above might "...points be cut off or not"? – shekeine Jul 07 '15 at 19:33
  • @Shekeine I add `f` manually. When I run the program and the output says that data points were cut off, i.e. `Warning message: Removed 54 rows containing missing values (geom_point)`, I increase the value of `f` until all data points are on the map. – Rhonda Jul 07 '15 at 19:57

1 Answer

Copy all your CSV files into one folder, say one called mywd.

# Load the packages, then make this folder with all the files your working directory
library(ggmap)
library(ggplot2)
setwd("d:/GIS/mywd")

# As you correctly noted, writing the same code as many times as you need to
# run it is not much fun :-)
# `for` loops exist for exactly this: you write the code only once, and the
# loop applies it to as many inputs as you have (in your case, the district CSVs)

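Create a list of all district CSV files

# The working directory is already mywd, so list the files in place.
# `pattern` is a regular expression, so the dot is escaped and the end anchored.
dlist <- list.files(pattern = "SEP_assets_csv\\.csv$")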
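To check what the listing step returns before running the whole loop, here is a minimal sketch that works against a throwaway folder. The file names (`district_01_...`, `district_02_...`) and the folder `mywd_demo` are made up for illustration; the real district files may follow a different naming pattern.

```r
# Hypothetical file names in a temporary folder, for illustration only
demo_dir <- file.path(tempdir(), "mywd_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- c("district_01_SEP_assets_csv.csv",
                "district_02_SEP_assets_csv.csv",
                "notes.txt")
file.create(file.path(demo_dir, demo_files))

# `pattern` is a regular expression: escape the dot and anchor the end,
# so that only the district CSVs are picked up
dlist_demo <- list.files(demo_dir, pattern = "SEP_assets_csv\\.csv$",
                         full.names = TRUE)

# Strip the common suffix to recover a label for each district
district_id <- sub("_SEP_assets_csv\\.csv$", "", basename(dlist_demo))
```

With `full.names = TRUE` the paths can be passed to `read.csv()` without changing the working directory, and `district_id` can be used to label each plot.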

Iterate your code over the list of csv files in dlist using a for loop

for (csv_file in dlist) {

# read the current district's file (the loop variable holds its name)
sep <- read.csv(csv_file)
Sub1 <- sep[grep("SEP.12", names(sep))]
sep$newCol <- 100*rowSums(Sub1)/rowSums(sep[4:7])

# create a new grouping variable
Percent_SEP12_Assets <- ifelse(sep[,8] <= 33, "Lower Third", ifelse(sep[,8] >= 66, "Upper Third", "Middle Third"))
Percent_SEP12_Assets <- factor(Percent_SEP12_Assets,
                      levels = c("Upper Third", "Middle Third", "Lower Third"))

# get the map
# Note Exclusion of the `f` argument
# Data points are cut off because x, y or Percent_SEP12_Assets are missing
# You MUST have x and y coords to show any point, so any row without x or y must be excluded since its position is not fully described
# If `Percent_SEP12_Assets` is missing, we can show it with a special colour e.g. yellow

bbox <- make_bbox(sep$Longitude, sep$Latitude)
map <- get_map(bbox)

# plot the map and use the grouping variable for the colour inside the aes
# ggplot objects are not printed automatically inside a loop, so call print()
print(ggmap(map) +
  geom_point(data = sep, aes(x = Longitude, y = Latitude,
    color = Percent_SEP12_Assets), size = 5, alpha = 0.6) +
  scale_color_manual(values = c("green", "orange", "red"), na.value = "yellow"))
}
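If some districts still lose points at the default padding, the per-district `f` values can be kept in one place instead of in 24 scripts. The following is only a sketch: the helper names `classify_thirds`, `padding_for` and `plot_district`, the district labels and the `f` values are all hypothetical, and it assumes the ggmap and ggplot2 packages are installed. The 0.05 default is the default `f` of `make_bbox`.

```r
# Hypothetical helper: same thresholds as the loop above (<= 33, >= 66)
classify_thirds <- function(pct) {
  grp <- ifelse(pct <= 33, "Lower Third",
                ifelse(pct >= 66, "Upper Third", "Middle Third"))
  factor(grp, levels = c("Upper Third", "Middle Third", "Lower Third"))
}

# Hypothetical per-district padding; districts not listed use the default
f_by_district <- c(district_03 = 0.10, district_17 = 0.25)
padding_for <- function(id, default = 0.05) {
  if (id %in% names(f_by_district)) f_by_district[[id]] else default
}

# Hypothetical wrapper around the loop body; returns the plot object
plot_district <- function(csv_file, f = 0.05) {
  sep <- read.csv(csv_file)
  sub1 <- sep[grep("SEP.12", names(sep))]
  sep$newCol <- 100 * rowSums(sub1) / rowSums(sep[4:7])
  # refer to the new column by name rather than by position
  sep$Percent_SEP12_Assets <- classify_thirds(sep$newCol)
  bbox <- ggmap::make_bbox(sep$Longitude, sep$Latitude, f = f)
  ggmap::ggmap(ggmap::get_map(bbox)) +
    ggplot2::geom_point(data = sep,
                        ggplot2::aes(x = Longitude, y = Latitude,
                                     color = Percent_SEP12_Assets),
                        size = 5, alpha = 0.6) +
    ggplot2::scale_color_manual(values = c("green", "orange", "red"),
                                na.value = "yellow")
}
```

Inside the loop, the body would then reduce to something like `print(plot_district(csv_file, f = padding_for(id)))`, where `id` is whatever label identifies the district, so a change to the plotting code is made once and applies to every district.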

Two more things, please see this and this.

shekeine