I would like to create a column called "Region" based on the values of "Community area", e.g. a community area of 1 = North and a community area of 2 = South. I want the result to look like this:

Community area   Region
25               West
67               Southwest
39               South
40               South
25               West

I tried the following code, but it does not work:

region<-function(x){if(x==c(8,32,33)){crime$Region<-"Central"} 
else if(x==c(5,6,7,21,22)){crime$Region<-"North"}
else if(x==c(1:4,9:14,76,77)){crime$Region<-"Far North Side"}
else if(x==c(15:20)){crime$Region<-"Northwest Side"}
else if(x==c(23:31)){crime$Region<-"West"}
else if(x==c(34:43,60,69)){crime$Region<-"South"}
else if(x==c(56:59,61:68)){crime$Region<-"Southwest Side"}
else if(x==c(44:55)){crime$Region<-"Far Southeast Side"}
else if(x==c(70:75)){crime$Region<-"Far Southwest Side"}
else {crime$Region<-"Other"}
}
region(crime$Community.Area)
pogibas
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. What exactly is wrong with the code you ran? – MrFlick Mar 08 '18 at 20:26
  • Try `case_when` from the `dplyr` package. – dshkol Mar 08 '18 at 21:06

2 Answers

For long chains of `if` and `else if`, try `case_when` from the `dplyr` package. It is vectorized, so it evaluates the conditions for every row at once:

> set.seed(1234)
> 
> df <- data.frame(x1 = round(runif(n = 20, min = 1, max = 4), 0), stringsAsFactors = FALSE)
> 
> df
   x1
1   1
2   3
3   3
4   3
5   4
6   3
7   1
8   2
...
20  2
> 
> df$Region <- dplyr::case_when(df$x1 == 1 ~ "North", 
+                  df$x1 == 2 ~ "South", 
+                  df$x1 == 3 ~ "East",
+                  TRUE ~ "West")
> df
   x1 Region
1   1  North
2   3   East
3   3   East
4   3   East
5   4   West
6   3   East
7   1  North
...
20  2  South
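
To map several values to one region, as in the question, each condition can test membership with `%in%` instead of `==`. With `==`, R recycles the shorter vector and can warn that the "longer object length is not a multiple of shorter object length". The following is a minimal sketch, assuming the data frame is called `crime` with a numeric `Community.Area` column as in the question; the five sample rows are taken from the desired output shown there.

```r
library(dplyr)

# Sample rows from the question's desired output (assumed numeric column)
crime <- data.frame(Community.Area = c(25, 67, 39, 40, 25))

area <- crime$Community.Area

# Each condition is a vectorized membership test; the first match wins
crime$Region <- case_when(
  area %in% c(8, 32, 33)        ~ "Central",
  area %in% c(5, 6, 7, 21, 22)  ~ "North",
  area %in% c(1:4, 9:14, 76, 77) ~ "Far North Side",
  area %in% c(15:20)            ~ "Northwest Side",
  area %in% c(23:31)            ~ "West",
  area %in% c(34:43, 60, 69)    ~ "South",
  area %in% c(56:59, 61:68)     ~ "Southwest Side",
  area %in% c(44:55)            ~ "Far Southeast Side",
  area %in% c(70:75)            ~ "Far Southwest Side",
  TRUE                          ~ "Other"  # fallback for unmatched values
)

crime
```

If `Community.Area` is stored as a factor or as text, converting it first (for example with `as.numeric(as.character(...))`) is probably needed before the comparison.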
JdeMello
  • `case_when` seems less time-consuming, but I am getting 16 warnings, and the resulting column has many NA values. Warning: `1: In is.na(e1) | is.na(e2) : longer object length is not a multiple of shorter object length` – Prashanth Cp Mar 08 '18 at 23:24
  • I am not sure what the `e1` object is. Next time, please show an actual example of what you are trying to achieve. It seems that you are trying to run `case_when` on mismatched data (in this case, vectors of different lengths). – JdeMello Mar 09 '18 at 13:20

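One solution, in line with the OP's idea, is to modify the `region` function so that it takes a single value, tests membership with `%in%` rather than `==`, and returns the region instead of assigning to `crime$Region`.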

# Take one value at a time and return its Region
region <- function(x) {
  if (x %in% c(8, 32, 33)) {"Central"}
  else if (x %in% c(5, 6, 7, 21, 22)) {"North"}
  else if (x %in% c(1:4, 9:14, 76, 77)) {"Far North Side"}
  else if (x %in% c(15:20)) {"Northwest Side"}
  else if (x %in% c(23:31)) {"West"}
  else if (x %in% c(34:43, 60, 69)) {"South"}
  else if (x %in% c(56:59, 61:68)) {"Southwest Side"}
  else if (x %in% c(44:55)) {"Far Southeast Side"}
  else if (x %in% c(70:75)) {"Far Southwest Side"}
  else {"Other"}
}

# Use mapply to pass each value of `Community_Area` to `region`, one at a time
df$Region <- mapply(region, df$Community_Area)

df
#  Community_Area         Region
#1             25           West
#2             67 Southwest Side
#3             39          South
#4             40          South
#5             25           West
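
The change that matters relative to the question's function is `%in%` in place of `==`. As a minimal sketch of the difference (the variable names here are only for illustration):

```r
x <- 25

# `==` compares x against every element and returns one result per element,
# which is not a usable condition for a single `if`
eq <- x == c(23:31)
length(eq)

# `%in%` asks one question: is x a member of the set? It returns a single value
member <- x %in% c(23:31)
member
```

Because `%in%` yields exactly one `TRUE` or `FALSE` for a single input value, it is safe to use directly inside `if`.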

Data

df <- data.frame(Community_Area = c(25, 67, 39, 40, 25))
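
As an alternative to calling a function once per row, the same mapping can be written as a lookup vector and applied to the whole column in one step. This is only a sketch using base R; the names `groups` and `lookup` are made up for illustration, and the groupings are copied from the `region` function above.

```r
# Region groupings copied from the region() function (names are illustrative)
groups <- list(
  "Central"            = c(8, 32, 33),
  "North"              = c(5, 6, 7, 21, 22),
  "Far North Side"     = c(1:4, 9:14, 76, 77),
  "Northwest Side"     = 15:20,
  "West"               = 23:31,
  "South"              = c(34:43, 60, 69),
  "Southwest Side"     = c(56:59, 61:68),
  "Far Southeast Side" = 44:55,
  "Far Southwest Side" = 70:75
)

# Named character vector: names are area numbers, values are region labels
lookup <- setNames(
  rep(names(groups), lengths(groups)),
  unlist(groups, use.names = FALSE)
)

df <- data.frame(Community_Area = c(25, 67, 39, 40, 25))

# Index the lookup by area number (as text); unmatched areas become NA
df$Region <- unname(lookup[as.character(df$Community_Area)])
df$Region[is.na(df$Region)] <- "Other"

df
```

Keeping the groupings in one list makes them easier to check and edit than a long chain of `else if` branches.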
MKR