
I would like to try out a normalisation method a friend recommended, in which every other column of a df is subtracted from one reference column: first from the first column, then from the second, and so on for each column of that df.

eg:

df <- data.frame(replicate(9,1:4))

x_df_1 <- df[,1] - df[2:ncol(df)]
x_df_2 <- df[,2] - df[c(1, 3:ncol(df))]
x_df_3 <- df[,3] - df[c(1:2, 4:ncol(df))]
...
x_df_9 <- df[,ncol(df)] - df[1:(ncol(df)-1)]

As the df has 90 columns, doing this by hand would be terrible (and very bad coding). I am sure there must be an elegant way to do this and end up with a list containing all the dfs, but I am totally stuck on how to get there. I would appreciate a dplyr method (for familiarity), but any working solution would be fine.

Thanks a lot for your help!

Sebastian

Sebastian Hesse

2 Answers


I may have found a solution, which I am sharing here. Please correct me if I'm wrong.

This is a task of choosing pairs of columns without replacement. The original df has 90 columns.

Let's first check how many combinations are possible (function from: https://davetang.org/muse/2013/09/09/combinations-and-permutations-in-r/):

comb_with_replacement <- function(n, r){
  return( factorial(n + r - 1) / (factorial(r) * factorial(n - 1)) )
}


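comb_with_replacement(90,2) # 4095 combinations with replacement (includes each column paired with itself)
# without replacement, choose(90, 2) gives 4005 pairs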
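The different ways of counting column pairs are easy to mix up, so here is a small sketch using only base R's `choose()`. The variable names are made up for illustration; which count applies depends on whether a column may be paired with itself and whether A - B and B - A count as separate results.

```r
n <- 90  # number of columns in the df

# Unordered pairs of two distinct columns (combinations without replacement)
pairs_distinct <- choose(n, 2)

# Unordered pairs where a column may also be paired with itself
# (combinations with replacement, same value as comb_with_replacement(90, 2))
pairs_with_self <- choose(n + 2 - 1, 2)

# Ordered pairs of distinct columns (A - B and B - A both counted)
pairs_ordered <- n * (n - 1)

c(pairs_distinct, pairs_with_self, pairs_ordered)
# 4005 4095 8010
```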

Now using a modified answer from here: https://stackoverflow.com/a/16921442/10342689

(df has 90 columns; I don't know how to create a proper example df of that size here.)

cc_90 <- combn(colnames(df), 2) # all pairs of column names, one pair per column of cc_90
result <- apply(cc_90, 2, function(x) df[[x[1]]]-df[[x[2]]])

dim(result) # nrow(df) rows, 4005 columns (one per pair)

That should work.
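For anyone who wants to try this, here is a minimal sketch on a made-up 90-column df. The random data, the 4 rows, and the `V1`..`V90` column names are assumptions for illustration only. It takes pairs of column names with `combn(..., 2)` and labels each difference so it is clear which two columns it came from.

```r
set.seed(1)
# Hypothetical stand-in data: 4 rows, 90 columns named V1..V90
df <- as.data.frame(matrix(rnorm(4 * 90), nrow = 4))

# Every unordered pair of column names (a 2 x 4005 character matrix)
col_pairs <- combn(colnames(df), 2)

# One difference per pair: first column of the pair minus the second
result <- apply(col_pairs, 2, function(p) df[[p[1]]] - df[[p[2]]])

# Label each result column, e.g. "V1-V2"
colnames(result) <- apply(col_pairs, 2, paste, collapse = "-")

dim(result)
# 4 4005
```

Note that this only gives each pair once (V1 - V2 but not V2 - V1).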

Sebastian Hesse

In R one can index using negative indices to represent "all except this index".
So we can rewrite the first of your normalisation lines:

x_df_1 <- df[,1] - df[2:ncol(df)]
# rewrite as:
x_df_1 <- df[,1] - df[,-1]

From this, it's a pretty easy next step to write a loop to generate the 90 new dataframes that you generated 'by hand':

list_of_dfs <- lapply(seq_len(ncol(df)), function(x) df[, x] - df[, -x, drop = FALSE])
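As a quick check, here is a sketch of that call run on the 9-column example df from the question. Naming the list elements after the reference column and adding `drop = FALSE` (so the result stays a data frame even if only one other column remains) are my own additions, not something the question asked for.

```r
# Example df from the question: 9 identical columns X1..X9 holding 1:4
df <- data.frame(replicate(9, 1:4))

# For each column i: column i minus every other column
list_of_dfs <- lapply(seq_len(ncol(df)), function(i) {
  df[, i] - df[, -i, drop = FALSE]
})

# Name each element after its reference column
names(list_of_dfs) <- colnames(df)

length(list_of_dfs)    # 9 data frames, one per reference column
dim(list_of_dfs$X1)    # 4 rows, 8 columns (X2..X9)
```

Because every column of this example df is identical, all the differences are 0; real data will of course give non-trivial values.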

This seems to be somewhat different to what you're proposing in your own answer to your question, though...
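To make the difference concrete, here is a small sketch on a made-up 5-column df (random data and the `V1`..`V5` names are assumptions for illustration). The per-column approach produces every ordered difference, while taking pairs with `combn(..., 2)` produces each pair only once.

```r
set.seed(42)
# Hypothetical stand-in data: 3 rows, 5 columns named V1..V5
df <- as.data.frame(matrix(rnorm(3 * 5), nrow = 3))

# Per-column approach: each column minus every other column
per_column <- lapply(seq_len(ncol(df)), function(i) {
  df[, i] - df[, -i, drop = FALSE]
})
n_ordered <- sum(vapply(per_column, ncol, integer(1)))  # 5 * 4 differences

# Pairwise approach: each unordered pair of columns once
col_pairs <- combn(colnames(df), 2)
pairwise <- apply(col_pairs, 2, function(p) df[[p[1]]] - df[[p[2]]])
n_unordered <- ncol(pairwise)  # choose(5, 2) differences

# V1 - V2 appears in both results
same_sign <- isTRUE(all.equal(per_column[[1]][["V2"]], pairwise[, 1]))
# V2 - V1 appears only in the per-column list, as the negation of V1 - V2
negated <- isTRUE(all.equal(per_column[[2]][["V1"]], -pairwise[, 1]))
```

So the list of dfs holds twice as many differences as the pairwise matrix, with each pair appearing once in each direction.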