# Input
data <- data.frame(a=c("0125-510","0125-511","0125-512","4000-000","5000-000"), b=c("100 Fake Street", "100 Fake Street", "100 Fake Street", "200 Fake Street", "300 Fake Street"))

# Desired output
expected <- data.frame(a=c("0125-510","0125-510","0125-510","4000-000","5000-000"), b=c("100 Fake Street", "100 Fake Street", "100 Fake Street", "200 Fake Street", "300 Fake Street"))

Input: 
a          b
0125-510   100 Fake Street
0125-511   100 Fake Street
0125-512   100 Fake Street
4000-000   200 Fake Street
5000-000   300 Fake Street

Output: 
a          b
0125-510   100 Fake Street
0125-510   100 Fake Street
0125-510   100 Fake Street
4000-000   200 Fake Street
5000-000   300 Fake Street

I have a data frame of addresses. Each address has an associated consecutive ID, stored as a character string to preserve leading zeros. For every address that appears more than once (column b), I want to replace all of that address's values in column a with the lowest ID in the group. In the example above, the lowest ID also happens to be the first one that appears.

The IDs are not necessarily in order, so the input could also look like this:

Input: 
a          b
0125-512   100 Fake Street
0125-511   100 Fake Street
0125-510   100 Fake Street
4000-000   200 Fake Street
5000-000   300 Fake Street

Output: 
a          b
0125-510   100 Fake Street
0125-510   100 Fake Street
0125-510   100 Fake Street
4000-000   200 Fake Street
5000-000   300 Fake Street

I'm trying to implement this using dplyr.
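One possible approach, sketched under a couple of assumptions: group by the address column and overwrite each ID with the group minimum. This assumes every ID has the same zero-padded `NNNN-NNN` layout, so that the character ordering used by `min()` matches the numeric ordering, and that column `a` is character rather than factor (hence the explicit `stringsAsFactors = FALSE`, which is an addition to the original data definition). The name `result` is only illustrative.

```r
library(dplyr)

# Unordered input from the second example; IDs kept as character strings
data <- data.frame(
  a = c("0125-512", "0125-511", "0125-510", "4000-000", "5000-000"),
  b = c("100 Fake Street", "100 Fake Street", "100 Fake Street",
        "200 Fake Street", "300 Fake Street"),
  stringsAsFactors = FALSE
)

result <- data %>%
  group_by(b) %>%          # one group per address
  mutate(a = min(a)) %>%   # lowest ID in the group (string comparison)
  ungroup()                # drop the grouping so later steps are unaffected
```

If the IDs could vary in width or format, string comparison would no longer be a safe stand-in for numeric order, and the IDs would need to be parsed before taking the minimum.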
