Update: the solution here works for my sample dataframe, but when I run it on my full set of 69 records with no null values in property_number_a and property_number_b, I get the same error cited in that solution.

Error in seq.default(from = df$property_number_a, to = df$property_number_b) : 
  'from' must be of length 1
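
The message suggests that whole columns are reaching `seq()`, which accepts only a single `from` and a single `to` per call. A minimal base R sketch of one way around that, assuming the range endpoints are stored as character strings as in the sample data below (the names `a`, `b`, and `ranges` are illustrative only):

```r
# Range endpoints as they appear in the sample data (character strings)
a <- c("280", "1003", "970")
b <- c("282", "1019", "987")

# seq(from = a, to = b) fails because 'from' has length 3.
# Map() calls seq() once per pair of values instead,
# returning one sequence per row in a list.
ranges <- Map(seq, as.numeric(a), as.numeric(b))

ranges[[1]]
# [1] 280 281 282
```

Whether this is what goes wrong on the full 69 records depends on how `seq()` is being called there, so treat it as a starting point rather than a diagnosis.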

I am trying to match addresses across two datasets using fastLink, and the matching process itself is in a good place. However, I received each address as a single column, and some addresses cover a range of numbers (property_number_a is the start of the range and property_number_b is the end). Because I'm matching on individual addresses, I need one row per number. After separating the numbers, the address data looks something like this:

df <- data.frame (property_name  = c("BLUE APARTMENTS", "RED HOUSES", "CHERRY BLOSSOM RESIDENCES"),
                  property_address = c("280-282 WATER ST", "1003-1019 MILLER AVE", "970-987 SPRITE RD"),
                  property_number_a = c("280", "1003", "970"),
                  property_number_b = c("282", "1019", "987"),
                  property_street = c("WATER ST", "MILLER AVE", "SPRITE RD"))

              property_name     property_address property_number_a property_number_b property_street
1           BLUE APARTMENTS     280-282 WATER ST               280               282        WATER ST
2                RED HOUSES 1003-1019 MILLER AVE              1003              1019      MILLER AVE
3 CHERRY BLOSSOM RESIDENCES    970-987 SPRITE RD               970               987       SPRITE RD

And I want it to look something like this, such that there is a row for every possible number between property_number_a and property_number_b for each property name.

df2 <- data.frame (property_name  = c("BLUE APARTMENTS", "BLUE APARTMENTS", "BLUE APARTMENTS"),
                   property_address = c("280-282 WATER ST", "280-282 WATER ST", "280-282 WATER ST"),
                   property_number = c("280", "281", "282"),
                   property_street = c("WATER ST", "WATER ST", "WATER ST"))

    property_name property_address property_number property_street
1 BLUE APARTMENTS 280-282 WATER ST             280        WATER ST
2 BLUE APARTMENTS 280-282 WATER ST             281        WATER ST
3 BLUE APARTMENTS 280-282 WATER ST             282        WATER ST

I attempted a loop, but I'm not familiar enough with what I need to do to troubleshoot it effectively. My next step is to give up and manually insert rows for the 69 records in Excel, so help would be appreciated!

Here is what I tried:


for(i in 1:df$property_number_b-df$property_number_b {                                   # Head of for-loop
  new <- c(df[i,]$property_name,
           df[i,]$property_address,
           df[i,]$property_number,
           df[i,]$property_number_a+1),
           df[i,]$property_number_b),
           df[i,]$property_street,
           
                                  # Create new row
  df[nrow(df) + 1, ] <- new                   # Append new row
}

I get an error saying I need to append with a list, and then I get errors pertaining to the data types. I don't think this does what I want anyway, however.
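
For comparison, here is a hedged base R sketch that produces one row per number without a loop. It assumes both endpoint columns convert cleanly to numbers, and the helper names (`from`, `to`, `numbers`, `row_index`, `expanded`) are made up for illustration:

```r
df <- data.frame(
  property_name     = c("BLUE APARTMENTS", "RED HOUSES", "CHERRY BLOSSOM RESIDENCES"),
  property_address  = c("280-282 WATER ST", "1003-1019 MILLER AVE", "970-987 SPRITE RD"),
  property_number_a = c("280", "1003", "970"),
  property_number_b = c("282", "1019", "987"),
  property_street   = c("WATER ST", "MILLER AVE", "SPRITE RD")
)

# Convert the range endpoints from character to numeric
from <- as.numeric(df$property_number_a)
to   <- as.numeric(df$property_number_b)

# One sequence per row; if an end is smaller than its start,
# seq() counts down rather than failing
numbers <- Map(seq, from, to)

# Repeat each original row once per number in its range
row_index <- rep(seq_len(nrow(df)), lengths(numbers))
expanded  <- df[row_index, c("property_name", "property_address", "property_street")]

# Store the number as character, matching the desired output above
expanded$property_number <- as.character(unlist(numbers))
expanded <- expanded[, c("property_name", "property_address",
                         "property_number", "property_street")]
rownames(expanded) <- NULL

nrow(expanded)
# [1] 38
```

The sample ranges contain 3, 17, and 18 numbers respectively, which is where the 38 rows come from. Building the row index with `rep()` avoids growing the data frame inside a loop, which is where the list and data type errors tend to appear.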

Erin K
```
df %>%
  pivot_longer(-c(property_name, property_address, property_street),
               values_to = "property_number",
               names_to = NULL,
               values_transform = as.numeric) %>%
  group_by(across(c(-property_number))) %>%
  reframe(property_number = full_seq(property_number, period = 1))
```
– M-- May 04 '23 at 19:17
