In R, I have several files ending in `_tcount.tsv`, which are output from a genomics analysis. I am following a written procedure and running the script below to read in the files.

```
ls = list.files(pattern = "_tcount.tsv")
df = do.call(cbind, Map("cbind", lapply(ls, read.delim, skip=2,header=T),sample=gsub("//..*","",ls))) 

```

The problem arises in the next step: when I try to reshape `df` using select(), it complains: "Can't bind data because some arguments have the same name".

When I look at summary(df), I get a separate summary for each file. dim(df) shows the dimensions of just one file, not of all the files combined. The next piece of code is:

dfsprd <- df %>% select(sample, Name, CoverageOnTs, ConversionOnTs) %>%
   gather(variable, value, CoverageOnTs, ConversionOnTs) %>% 
   unite(var, variable, sample) %>% 
   group_by(var) %>% mutate (id=1:n()) %>% 
   spread(var, value)

The code I need to run is shown above, but I get the error quoted above at the first select() command. What am I missing?


I found the mistake in my code: it should have been rbind instead of cbind, which is what caused the error message. Thanks, everyone, for your responses.
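
For anyone hitting the same error, here is a minimal sketch of the difference between the two calls. It uses two tiny made-up data frames in place of the real `_tcount.tsv` files. The values and the sample labels `s1` and `s2` are hypothetical; only the column names come from the code above.

```r
# Two small stand-ins for the *_tcount.tsv tables (hypothetical values).
a <- data.frame(Name = c("g1", "g2"), CoverageOnTs = c(10, 20), ConversionOnTs = c(1, 2))
b <- data.frame(Name = c("g1", "g2"), CoverageOnTs = c(30, 40), ConversionOnTs = c(3, 4))
tables  <- list(a, b)
samples <- c("s1", "s2")  # hypothetical sample labels

# Tag each table with its sample label, as the Map("cbind", ...) step does.
tagged <- Map(cbind, tables, sample = samples)

# cbind puts the tables side by side: the row count stays that of one table
# and every column name is repeated, which is what select() objects to.
wide <- do.call(cbind, tagged)
dim(wide)
anyDuplicated(names(wide)) > 0

# rbind stacks the tables: row counts add up and each column name appears once.
long <- do.call(rbind, tagged)
dim(long)
```

With `rbind`, two 30k-row files should give a 60k-row data frame with a single `sample` column, which is the long shape the select()/gather() pipeline expects.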

YoungP
  • Generally reliance upon code that is throwing errors will not be an effective method to convey your intent, unless the reader can see inside your mind. It sounds from the error message as though you have managed to make an object that has duplicate column names. – IRTFM Aug 13 '19 at 02:50
  • Make sure to use `dplyr::select` – rg255 Aug 13 '19 at 02:50
  • What do you intend with `>variable`? Why the `>`? – camille Aug 13 '19 at 03:44
  • @42- Thank you for the edit. I do understand that there is a duplicate name. What I don't understand is how do.call() combines the files into df. I have two files of 30k rows by 6 columns with the same column names. Could you explain what the resulting df would look like if you had two of these files? Shouldn't you get a 60k x 6 data.frame? I only get 30k x 6. – YoungP Aug 14 '19 at 10:47
  • @camille corrected the typo. – YoungP Aug 14 '19 at 10:47
  • Perhaps if you offered the output of dput(lapply(df,str)) there could be informed advice. – IRTFM Aug 15 '19 at 00:47
