
I want to manipulate numeric values with different data corruption methods.

Let's say I have the value 30.1. How do I perform insertion (any digit may be inserted at any position, e.g., 230.1, 304.1, etc.), deletion (e.g., 3.1, 30, etc.), substitution (randomly replace a digit with a different one, e.g., 35.1), and transposition (randomly swap two adjacent digits, e.g., 31.0)?

I'm not sure whether I searched in completely the wrong direction, because I couldn't find any relevant answer to my question.

  • Do you just want to make some random changes, or should they be systematic in some way? Either way, the easiest approach would be to convert the numeric value to character, change, insert, or remove one element, and then convert it back to numeric. To make this easier, you can split the character value into a vector of individual characters. – Martin Wettstein Jul 01 '21 at 08:35
  • Oh, I'll give it a try first. Thanks! (I want to alter the value in a systematic way) – user10301352 Jul 01 '21 at 08:37
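
The round trip suggested in the comment above can be sketched in base R as follows. This is only an illustration under the assumption that "systematic" means trying every position in turn; the variable names `chars` and `deletions` are made up for the example.

```r
# Round trip: numeric -> vector of single characters -> numeric.
x <- 30.1
chars <- strsplit(as.character(x), "")[[1]]
chars
#> [1] "3" "0" "." "1"

# Systematic deletion: drop each position in turn and convert back.
# The decimal point counts as a position, so 301 is one of the results.
deletions <- vapply(
  seq_along(chars),
  function(i) as.numeric(paste(chars[-i], collapse = "")),
  numeric(1)
)
deletions
```

The same loop over `seq_along(chars)` could drive insertion, substitution, or transposition if every variant is wanted rather than one random draw.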

1 Answer

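library(tidyverse)

# Split a number into its individual characters, e.g. 30.1 -> "3" "0" "." "1".
# The decimal point is treated as a character like any other.
split_number <- function(number){
  number %>% as.character() %>% str_split("") %>% .[[1]]
}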

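# Insert a random digit (0-9) at a random position.
# Ordering trick from https://stackoverflow.com/questions/1493969/how-to-insert-elements-into-a-vector
insertion <- function(number){
  sn <- split_number(number)
  # 0 means "before the first character"
  after_index <- sample(0:length(sn), 1)
  # The new digit gets a fractional index so it sorts right after after_index
  indices <- c(1:length(sn), after_index + .5)
  str_c(c(sn, sample(0:9, 1))[order(indices)], collapse = "") %>% as.numeric()
}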

# Delete one randomly chosen character.
# The decimal point can be dropped too (30.1 -> 301).
deletion <- function(number){
  sn <- split_number(number)
  drop_index <- sample(1:length(sn), 1)
  str_c(sn[-drop_index], collapse = "") %>% as.numeric()
}

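# Replace one randomly chosen character with a random digit (0-9).
# The replacement may equal the original digit, and the decimal point
# may be replaced as well.
substitution <- function(number){
  sn <- split_number(number)
  replace_index <- sample(1:length(sn), 1)
  sn[replace_index] <- sample(0:9, 1)
  str_c(sn, collapse = "") %>% as.numeric()
}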

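# Swap one randomly chosen character with the character before it.
# The decimal point can take part in the swap (30.1 -> 3.01).
transposition <- function(number){
  sn <- split_number(number)
  stopifnot(length(sn) > 1)
  transpose_index <- sample(2:length(sn), 1)
  indices <- 1:length(sn)
  # Subtracting 1.5 makes this character sort just before its left neighbour
  indices[transpose_index] <- transpose_index - 1.5
  str_c(sn[order(indices)], collapse = "") %>% as.numeric()
}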

x <- 30.1
set.seed(12345)
x %>% insertion()
#> [1] 309.1
x %>% insertion()
#> [1] 30.71
x %>% insertion()
#> [1] 370.1

x %>% deletion()
#> [1] 3.1
x %>% deletion()
#> [1] 3.1
x %>% deletion()
#> [1] 301

x %>% substitution()
#> [1] 36.1
x %>% substitution()
#> [1] 39.1
x %>% substitution()
#> [1] 70.1

x %>% transposition()
#> [1] 301
x %>% transposition()
#> [1] 3.01
x %>% transposition()
#> [1] 3.01

Created on 2021-07-01 by the reprex package (v2.0.0)
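
The question asks for substitution with a different digit, while `substitution()` above can draw the same digit again or overwrite the decimal point. A possible variant is sketched below in base R. The name `substitute_digit` is hypothetical and not part of the answer's code, and the rule "only digits are eligible" is an assumption about what is wanted.

```r
# Hypothetical variant: always changes exactly one digit to a different digit
# and never touches non-digit characters such as ".".
substitute_digit <- function(number) {
  chars <- strsplit(as.character(number), "")[[1]]
  digit_pos <- which(chars %in% as.character(0:9))
  stopifnot(length(digit_pos) > 0)
  # Index via sample.int(): sample(v, 1) on a length-one numeric v
  # would sample from 1:v instead of returning v.
  i <- digit_pos[sample.int(length(digit_pos), 1)]
  # Candidate replacements exclude the digit currently at position i
  candidates <- setdiff(as.character(0:9), chars[i])
  chars[i] <- candidates[sample.int(length(candidates), 1)]
  as.numeric(paste(chars, collapse = ""))
}

set.seed(1)
substitute_digit(30.1)
```

Because it draws random numbers differently, this variant would not reproduce the seeded outputs shown above.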

Andy Eggers