
I have a dataset with the following format:

descriptors
1: D112_M|D40_M|D70_M|D107_M|D152_M|D116_M|D190_M|D62_M|D71_M|D182_M|
2: D17_P|D21_P|D23_P|D25_P|D30_P|D22_P|D37_P|D39_P|D44_P
3: D17_P|D21_P|D23_P|D25_P|D30_P|D22_P|D37_P|D39_P|
4: D17_P|D21_P|D23_P|D25_P|D30_P|D22_P
5: D112_M|D40_M|D70_M|D107_M|D152_M|D116_M|
6: D112_M|D40_M|D70_M|D107_M|D152_M|D116_M
hit_descriptors
1: 1|0|1|1|1|1|0|0|0|0
2: 0|0|0|0|0|0|1|1|2
3: 0|1|1|0|0|1|3|0
4: 1|2|1|1|2|2
5: 0|1|1|0|0|0
6: 0|1|1|1|0|1

I need to turn the entries of the descriptors column into separate columns, and use the entries of hit_descriptors as the values of those columns. How can I do this transformation?

I tried a function:

t2_hit <- str_split(t2$descriptores_hit,
                       pattern = "|",
                       n = str_count(t2$hit_descriptors, pattern = ""),
                       simplify = T)
t2_hit <- as.data.table(t2_hit)

But it gives an error.

  • Your data structure is ambiguous, as there are often different structures in R that can print out the same way. Can you please edit your question to include the output of running `dput(head(t2))`? – Jon Spring May 16 '23 at 03:07
  • You mention `t2$descriptors_hit` but your example does not include a variable with that name. – Jon Spring May 16 '23 at 03:08
  • Without a sample from the dataset it's just a guess, but this could be a data import issue, or something that could be resolved by reviewing the data import (hint: we have full control over the delimiters used). If the source is a csv / tsv file, perhaps include a few lines from there too. – margusl May 16 '23 at 06:53

1 Answer

I'd try building a data frame:

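    # Assumes `descriptors` and `hit_descriptors` are data frames that hold
    # one already-split token per column, with six rows each.
    df <- as.data.frame(hit_descriptors[1, ])
    colnames(df) <- as.character(unlist(descriptors[1, ]))
    for (i in 2:6) {
        first_new <- ncol(df) + 1  # position of the first column added below
        df <- cbind(df, hit_descriptors[i, ])
        colnames(df)[first_new:ncol(df)] <- as.character(unlist(descriptors[i, ]))
    }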

Something like that should work.
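Since the question's data structure is ambiguous, here is only a sketch under one possible reading: it assumes `t2` is a data frame with two character columns, `descriptors` and `hit_descriptors`, each holding pipe-delimited strings exactly as printed in the question. The names `keys`, `vals`, `all_keys` and `wide` are made up for illustration. Note that `|` is the "or" operator in a regular expression, so it has to be matched literally (here with `fixed = TRUE` in base R's `strsplit()`), which may be one reason the `str_split()` attempt fails.

```r
# Assumption: the question's data, rebuilt as two character columns.
t2 <- data.frame(
  descriptors = c(
    "D112_M|D40_M|D70_M|D107_M|D152_M|D116_M|D190_M|D62_M|D71_M|D182_M|",
    "D17_P|D21_P|D23_P|D25_P|D30_P|D22_P|D37_P|D39_P|D44_P",
    "D17_P|D21_P|D23_P|D25_P|D30_P|D22_P|D37_P|D39_P|",
    "D17_P|D21_P|D23_P|D25_P|D30_P|D22_P",
    "D112_M|D40_M|D70_M|D107_M|D152_M|D116_M|",
    "D112_M|D40_M|D70_M|D107_M|D152_M|D116_M"
  ),
  hit_descriptors = c(
    "1|0|1|1|1|1|0|0|0|0",
    "0|0|0|0|0|0|1|1|2",
    "0|1|1|0|0|1|3|0",
    "1|2|1|1|2|2",
    "0|1|1|0|0|0",
    "0|1|1|1|0|1"
  ),
  stringsAsFactors = FALSE
)

# fixed = TRUE treats "|" as a literal separator instead of regex "or".
# strsplit() does not return an empty piece for a trailing "|".
keys <- strsplit(t2$descriptors, "|", fixed = TRUE)
vals <- lapply(strsplit(t2$hit_descriptors, "|", fixed = TRUE), as.integer)

# Every row must have as many hit values as descriptor names.
stopifnot(lengths(keys) == lengths(vals))

# One output column per distinct descriptor; NA where a row lacks it.
all_keys <- unique(unlist(keys))
wide <- t(mapply(function(k, v) v[match(all_keys, k)], keys, vals))
wide <- as.data.frame(wide)
names(wide) <- all_keys
```

With this layout, `wide` has one row per original row and one column per descriptor name, so `wide[3, "D37_P"]` looks up the hit count of `D37_P` in row 3. If the real `t2` stores the data differently (for example as factors or list columns), the splitting step would need adjusting.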

Candy
  • Welcome to SO bush rat. Your answer may very well solve the OP's question, but good answers provide an output of results as proof. Currently it's not possible to ensure a solution is correct as the OP's sample data are ambiguous and therefore the question is not a [minimal, reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). That's why others have commented on the OP requesting a sample of the original datasets. Thanks – L Tyrone May 16 '23 at 10:30
  • Thanks Leroy, I'm new to this forum, thanks for the tip. – Candy May 28 '23 at 11:55