Loops and If statements to populate home locations in R

Question

I am looking for help sorting spatial locations in R so that I can assign what amounts to a house and address in a new column. I have tried many combinations of for, ifelse, if/else, foreach and while loops. I have several CSV files, which I have appended and then extended with some extra columns. Specifically, I added a TRUE/FALSE (1/0) flag for Age == Senior to mark the start of a new home location. The goal is to assign a new house number whenever a 1 appears and keep that house number until the next 1. Once all house numbers are assigned, I want to compute the median of all lat/long values at each building location and assign that median lat/long in new columns. The median smooths out the GPS jumps at each building location. I'm stuck at just counting the homes. First table:

Latitude	Longitude	Age	Spatial_id	District
5.582719	-0.1596583	Senior	1	tc01
5.582721	-0.1596585	Adult	0	tc01
5.588345	-0.1656207	Senior	1	tc01
5.588341	-0.1656206	Adult	0	tc01
5.588342	-0.1656202	Adult	0	tc01
5.588348	-0.1656203	Child	0	tc01
5.588219	-0.1653842	Senior	1	tc01
5.588219	-0.1653842	Adult	0	tc01
5.588225	-0.1653841	Child	0	tc01
5.588226	-0.1653841	Child	0	tc01

spatial.loc <- c()
spatial.bldg <- c()
house.id = 100000

for (i in 1:nrow(merge.tc01)) {
  house.id = house.id + 1
  if(merge.tc01$spatial_id[i] == 1){
    spatial.loc <- append(spatial.loc, house.id)
    spatial.bldg <- paste0(merge.tc01$district,"_", spatial.loc)
  }
}

This is the output I am trying to obtain:

Latitude	Longitude	Age	Spatial_id	District	spatial_bldg	spatial_x	spatial_y
5.582719	-0.1596583	Senior	1	tc01	tc01_100001	5.582720	-0.1596583
5.582721	-0.1596585	Adult	0	tc01	tc01_100001	5.582720	-0.1596583
5.588345	-0.1656207	Senior	1	tc01	tc01_100002	5.588344	-0.1656204
5.588341	-0.1656206	Adult	0	tc01	tc01_100002	5.588344	-0.1656204
5.588342	-0.1656202	Adult	0	tc01	tc01_100002	5.588344	-0.1656204
5.588348	-0.1656203	Child	0	tc01	tc01_100002	5.588344	-0.1656204
5.588219	-0.1653842	Senior	1	tc01	tc01_100003	5.588222	-0.1653841
5.588219	-0.1653842	Adult	0	tc01	tc01_100003	5.588222	-0.1653841
5.588225	-0.1653841	Child	0	tc01	tc01_100003	5.588222	-0.1653841
5.588226	-0.1653841	Child	0	tc01	tc01_100003	5.588222	-0.1653841

Thanks for any help you guys have.

This is getting a little closer. I can count the correct number of 1s, but it doesn't fill down; instead it repeats the index.

spatial.loc <- c()
spatial.bldg <- c()
house.id = 100000

for (i in 1:nrow(merge.tc01)) {
  house.id = house.id + 1
  if(merge.tc01$spatial_id[i] == 1){
    spatial.loc <- append(spatial.loc, house.id)
    spatial.bldg <- paste0(merge.tc01$district,"_", spatial.loc)
    while (merge.tc01$spatial_id == 0) {
      spatial.loc <- append(spatial.loc)
      spatial.bldg <- paste0(merge.tc01$district,"_", spatial.loc)
    }
  }
}

spatial.loc
spatial.bldg

Please add a tag to your question for the language you are using. — Bohemian, Feb 25 '21 at 22:19
Can you make your dataset reproducible? https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example — william3031, Feb 26 '21 at 03:55
https://github.com/tnewton2/Rproject_tnewton2/issues/1#issue-817391346 — Travis, Feb 26 '21 at 14:16

jay.sf · Accepted Answer · 2021-02-26T21:22:49.603

You can find the rows where Spatial_id rises by 1 relative to the previous row, take the cumsum of those, and pass the result to sprintf.

d <- transform(d, spatial_bldg=sprintf("%s_1%05d", District, 
                                       cumsum(diff(c(0, Spatial_id)) == 1)))
d
#    Latitude  Longitude    Age Spatial_id District spatial_bldg
# 1  5.582719 -0.1596583 Senior          1     tc01  tc01_100001
# 2  5.582721 -0.1596585  Adult          0     tc01  tc01_100001
# 3  5.588345 -0.1656207 Senior          1     tc01  tc01_100002
# 4  5.588341 -0.1656206  Adult          0     tc01  tc01_100002
# 5  5.588342 -0.1656202  Adult          0     tc01  tc01_100002
# 6  5.588348 -0.1656203  Child          0     tc01  tc01_100002
# 7  5.588219 -0.1653842 Senior          1     tc01  tc01_100003
# 8  5.588219 -0.1653842  Adult          0     tc01  tc01_100003
# 9  5.588225 -0.1653841  Child          0     tc01  tc01_100003
# 10 5.588226 -0.1653841  Child          0     tc01  tc01_100003
# 11 5.587270 -0.1743943 Senior          1     tc01  tc01_100004
# 12 5.587271 -0.1743942  Adult          0     tc01  tc01_100004
# 13 5.587270 -0.1743947  Child          0     tc01  tc01_100004
# 14 5.587282 -0.1743944  Child          0     tc01  tc01_100004
# 15 5.587273 -0.1743942  Adult          0     tc01  tc01_100004
# 16 5.587273 -0.1743941  Child          0     tc01  tc01_100004
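
To see what each piece of that one-liner contributes, here is a step-by-step sketch on a small made-up flag vector (the name `id` and its values are illustrative, not the full data set):

```r
# Illustrative flag vector: 1 marks the first row of a new house
id <- c(1, 0, 1, 0, 0, 0, 1, 0, 0, 0)

# diff() returns one element fewer than its input, so a leading 0 is
# prepended to keep the result the same length as id
step <- diff(c(0, id))
step
# [1]  1 -1  1 -1  0  0  1 -1  0  0

# TRUE wherever the flag rises from 0 to 1
is_start <- step == 1

# cumsum() counts TRUE as 1, giving a running house counter
house_no <- cumsum(is_start)
house_no
# [1] 1 1 2 2 2 2 3 3 3 3
```

Because the flag only holds 0 and 1, `cumsum(id)` gives the same counter on this vector. The two would differ only if two 1s occurred in consecutive rows, where the `diff` version would not start a new house.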

sprintf takes a format string (a character vector) followed by an arbitrary number of further objects, strings or numbers. In the format string you place each of the further objects, one after the other, and mark their positions using %d for integers and %s for strings. Further types are available, which you can look up on the help page ?sprintf.
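
As a small illustration of the format string used above (the input values here are made up for the example):

```r
# %s takes the string; %05d pads the integer with zeros to width 5;
# the literal "_1" in between supplies the leading 1 of the house number
one <- sprintf("%s_1%05d", "tc01", 3L)
one
# [1] "tc01_100003"

# sprintf is vectorised, so whole columns can be passed at once
two <- sprintf("%s_1%05d", c("tc01", "tc02"), c(1L, 12L))
two
# [1] "tc01_100001" "tc02_100012"
```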

You probably have several districts; in that case, look into ave.
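
A minimal sketch of what that could look like, assuming the house counter should restart in every district. The data frame `d2` and its values are made up for illustration:

```r
# Made-up data with two districts (values are illustrative only)
d2 <- data.frame(
  District   = c("tc01", "tc01", "tc01", "tc02", "tc02", "tc02", "tc02"),
  Spatial_id = c(1, 0, 1, 1, 0, 0, 1)
)

# ave() applies FUN within each District and returns a vector in the
# original row order, so the counter restarts at 1 per district
house_no <- ave(d2$Spatial_id, d2$District,
                FUN = function(x) cumsum(diff(c(0, x)) == 1))

# as.integer() makes the type match the %d placeholder explicitly
d2$spatial_bldg <- sprintf("%s_1%05d", d2$District, as.integer(house_no))
d2
```

If the numbering should instead continue across districts, the ungrouped version from the answer already does that.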

Data:

d <- read.csv("https://github.com/tnewton2/Rproject_tnewton2/files/6050112/merge_tc01.txt")
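
The question also asks for a median lat/long per building, which the code above does not cover. One possible sketch, again with ave, uses the first six rows of the question's table with spatial_bldg already filled in. The data frame name `d3` is an assumption; the column names spatial_x and spatial_y follow the question's desired output:

```r
# First six rows of the question's table, with spatial_bldg already assigned
d3 <- data.frame(
  Latitude  = c(5.582719, 5.582721, 5.588345, 5.588341, 5.588342, 5.588348),
  Longitude = c(-0.1596583, -0.1596585, -0.1656207,
                -0.1656206, -0.1656202, -0.1656203),
  spatial_bldg = c("tc01_100001", "tc01_100001", "tc01_100002",
                   "tc01_100002", "tc01_100002", "tc01_100002")
)

# ave() computes the median within each building and repeats it on
# every row of that building
d3$spatial_x <- ave(d3$Latitude,  d3$spatial_bldg, FUN = median)
d3$spatial_y <- ave(d3$Longitude, d3$spatial_bldg, FUN = median)
d3
```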

edited Feb 26 '21 at 21:22

answered Feb 26 '21 at 18:45

jay.sf


This is excellent stuff @jay.sf , but I'm a little confused about it. I looked up all the tools and saw they're all base R tools. First, I can see the 05d is a way to begin a count with 5 “00000”, but I'm not sure what the d represents (digits or logic variable according to help doc). The second part is understanding the difference function. I don’t understand what the “x” is doing in the difference section. However, once the difference between 0,1 is discovered, cumsum finds it equal to 1, and then sprintf wraps it all together, putting it into a new column. – Travis Feb 26 '21 at 21:08
The problem is I throw an unknown error because x has no value. `> tc01_new <- transform(tc01, spatial_bldg = sprintf("%s_1%05d", District, cumsum(diff(c(0,x)) == 1))) Error in diff(c(0, x)) : object 'x' not found` – Travis Feb 26 '21 at 21:09
If I write out an x using a difference, the diff produces a negative value and does not write all 16 objects. It stops at 15. Trying to use that throws an error. – Travis Feb 26 '21 at 21:10
`x <- diff(tc01$Spatial_id, differences = 1)` \n `tc01_new <- transform(tc01, spatial_bldg = sprintf("%s_1%05d", District, cumsum(diff(c(0,x)) == 1)))` – Travis Feb 26 '21 at 21:11
`Error in sprintf("%s_1%05d", District, cumsum(diff(c(0, x)) == 1)) : arguments cannot be recycled to the same length` – Travis Feb 26 '21 at 21:12
@Travis Sorry, I subsetted Spatial_id with x for testing and didn't clean my workspace before posting, edited! Unfortunately it no longer fits in one line now:) – jay.sf Feb 26 '21 at 21:15
@Travis Also added some further explanation. – jay.sf Feb 26 '21 at 21:23
Sorry for the late reply, I thought I had posted a comment to say thank you @jay.sf . That did the trick :) – Travis Mar 04 '21 at 15:00
You're welcome @Travis . I actually suspected that you were satisfied and therefore lapsed into contented silence:) – jay.sf Mar 05 '21 at 08:52
