
Example

Value

15   
15   
15   
4   
37   
37   
37  

There are three distinct values but 7 rows. I want to anonymize my data, and the output I want is shown below. I keep getting the error "replacement has 3 rows, data has 7".

This is the code I'm using

final_df$Value <- paste("Value",seq(1:length(unique(final_df$Value))))

Value

Value 1
Value 1   
Value 1   
Value 2   
Value 3   
Value 3   
Value 3  

1 Answer

Create a function that does the job (it numbers consecutive runs of equal values using rle):

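anon <- function(x) {
    # lengths of the consecutive runs of equal values
    rl <- rle(x)$lengths
    # number each run, repeating that number once per row in the run
    ans <- paste("Value", rep(seq_along(rl), rl))
    return(ans)
}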

call function:

anon(final_df$Value)

result:

# [1] "Value 1" "Value 1" "Value 1" "Value 2" "Value 3" "Value 3" "Value 3"
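
To see why this works, here is a small sketch of the intermediate values. It assumes final_df$Value holds exactly the seven values from the question; the names x and run_id are only for illustration.

```r
# Assumed contents of final_df$Value, taken from the question
x <- c(15, 15, 15, 4, 37, 37, 37)

# rle() collapses consecutive equal values into runs; here the run lengths are 3, 1, 3
rl <- rle(x)$lengths

# One id per run, repeated once for every row in that run: 1 1 1 2 3 3 3
run_id <- rep(seq_along(rl), rl)

# Same length as x, so it can be assigned back to a 7-row column
paste("Value", run_id)
```

Because the result has one element per row, assigning it back to the column should not trigger the "replacement has 3 rows, data has 7" error, which came from building only one label per distinct value.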

generalization:

df1 <- mtcars
df1[] <- lapply(df1, anon)
names(df1)    <- paste0("V", seq_along(names(df1)))
rownames(df1) <- NULL

df1
Andre Elrico
  • Thank you for the reply. This works great for columns that have values that are repeating in order. But how would this be done for a column where the values are not directly next to each other, and spread out? – Robbie Janezic Nov 20 '18 at 18:47
  • That's much easier. Just convert to factor and then to integer. – Andre Elrico Nov 20 '18 at 22:54
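
The factor-then-integer idea from the comment above might look like the following sketch. The function name anon_any_order and the sample vector are hypothetical, not from the thread. One assumption worth flagging: plain factor(x) numbers values by sorted order, so levels = unique(x) is passed here to number them by first appearance instead, which matches the output asked for in the question.

```r
# Hypothetical helper: labels equal values identically even when they are not adjacent
anon_any_order <- function(x) {
  # levels = unique(x) numbers values by first appearance;
  # plain factor(x) would number them by sorted order instead
  ids <- as.integer(factor(x, levels = unique(x)))
  paste("Value", ids)
}

# Made-up example where equal values are spread out
x <- c(15, 37, 4, 15, 37, 4, 15)
anon_any_order(x)
# [1] "Value 1" "Value 2" "Value 3" "Value 1" "Value 2" "Value 3" "Value 1"
```

Unlike the rle-based version, this gives every occurrence of the same value the same label regardless of where it sits in the column.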