I have a list of thousands of dataframes, where each dataframe has a column called "x" and a column called "go". It looks like this:
> lapply(head(genes2Vecs), head)
$A1BG
x go
GO:0005576 0.13793103 GO:0005576
GO:0005615 0.05172414 GO:0005615
GO:0031982 0.24137931 GO:0031982
GO:0043227 0.25862069 GO:0043227
GO:0043230 0.05172414 GO:0043230
GO:1903561 0.03448276 GO:1903561
$A1CF
x go
GO:0005488 0.11111111 GO:0005488
GO:0097159 0.06944444 GO:0097159
GO:1901363 0.06944444 GO:1901363
GO:0003676 0.05555556 GO:0003676
GO:0003723 0.04166667 GO:0003723
GO:0006139 0.13888889 GO:0006139
$AACS
x go
GO:0008152 0.12500000 GO:0008152
GO:0044238 0.02173913 GO:0044238
GO:0071704 0.07065217 GO:0071704
GO:0003824 0.03804348 GO:0003824
GO:0016405 0.01630435 GO:0016405
GO:0016874 0.03260870 GO:0016874
$AARS2
x go
GO:0000166 0.06930693 GO:0000166
GO:0005488 0.27722772 GO:0005488
GO:0008144 0.01980198 GO:0008144
GO:0017076 0.04950495 GO:0017076
GO:0030554 0.02970297 GO:0030554
GO:0032553 0.03960396 GO:0032553
$AATK
x go
GO:0000166 0.10769231 GO:0000166
GO:0005488 0.27692308 GO:0005488
GO:0008144 0.03076923 GO:0008144
GO:0017076 0.07692308 GO:0017076
GO:0030554 0.04615385 GO:0030554
GO:0032553 0.06153846 GO:0032553
$ABAT
x go
GO:0005488 0.054644809 GO:0005488
GO:0008144 0.008196721 GO:0008144
GO:0019842 0.008196721 GO:0019842
GO:0036094 0.010928962 GO:0036094
GO:0043167 0.013661202 GO:0043167
GO:0043168 0.005464481 GO:0043168
I'd like to merge them into a single dataframe, with one row per GO term and one column per gene, something like this:
go A1BG A1CF AACS AARS2 AATK ABAT
1 GO:0000003 NA 0.06944444 NA NA NA 0.01639344
2 GO:0000049 NA NA NA 0.00990099 NA NA
3 GO:0000166 NA NA 0.03804348 0.06930693 0.1076923 NA
4 GO:0000959 NA NA NA 0.02970297 NA NA
5 GO:0001101 NA NA 0.01630435 NA NA NA
6 GO:0001505 NA NA NA NA NA 0.01912568
The list I have (genes2Vecs) has a length of 2584, so I'm expecting the resulting dataframe to have that many columns. I've tried two different techniques to merge this list, but both of them give me:
- An incorrect number of columns
- Columns that aren't the names of the elements of the list
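A few quick checks might help narrow down where columns go missing. This is only a sketch: `toy` is a hypothetical stand-in for genes2Vecs, with a duplicated name and an empty element planted on purpose, and none of these checks is confirmed to be the actual cause.

```r
# Hypothetical toy list standing in for genes2Vecs. The duplicated name and
# the empty element are planted deliberately to show what each check reports.
toy <- list(
  A1BG = data.frame(x = c(0.13793103, 0.05172414), go = c("GO:0005576", "GO:0005615")),
  A1CF = data.frame(x = c(0.11111111, 0.06944444), go = c("GO:0005488", "GO:0097159")),
  A1CF = data.frame(x = 0.06944444, go = "GO:1901363"),
  AACS = data.frame(x = numeric(0), go = character(0))
)

# 1. Element names that occur more than once (they would collide as column names).
dup_names <- unique(names(toy)[duplicated(names(toy))])

# 2. Elements that are not data frames or that have no rows.
is_usable <- vapply(toy, function(d) is.data.frame(d) && nrow(d) > 0, logical(1))
empty <- names(toy)[!is_usable]

# 3. Elements where the same GO id appears on more than one row
#    (these multiply rows in a join).
has_dup_go <- vapply(toy, function(d) anyDuplicated(d$go) > 0, logical(1))
dup_go <- names(toy)[has_dup_go]

dup_names
empty
dup_go
```

Running the same three checks on the real list (replacing `toy` with genes2Vecs) would show whether any of these conditions applies.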
I've tried
genes <- Reduce(function(dtf1, dtf2) merge(dtf1, dtf2, by = "go", all = TRUE),
genes2Vecs)
and
library(dplyr); library(purrr)  # full_join() and %>% from dplyr, reduce() from purrr
genes <- genes2Vecs %>% reduce(full_join, by = "go")
Both give me columns named "x...." and a result with 2436 columns instead of 2584.
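The naming symptom can be reproduced on a small scale. The sketch below assumes a toy list `toy` (a hypothetical name) built from the first two rows of four of the genes printed above. Because every dataframe carries a value column called "x", merge() has to disambiguate with suffixes; renaming "x" to the list element's name before merging is one possible way around that. It is not confirmed to explain the missing columns in the full list.

```r
# Toy list: first two rows of four genes from the output above.
toy <- list(
  A1BG  = data.frame(x = c(0.13793103, 0.05172414), go = c("GO:0005576", "GO:0005615")),
  A1CF  = data.frame(x = c(0.11111111, 0.06944444), go = c("GO:0005488", "GO:0097159")),
  AARS2 = data.frame(x = c(0.06930693, 0.27722772), go = c("GO:0000166", "GO:0005488")),
  AATK  = data.frame(x = c(0.10769231, 0.27692308), go = c("GO:0000166", "GO:0005488"))
)

# Merging as-is: the value columns all start out as "x", so the result only
# contains suffixed "x..." names (merge() may warn about duplicated names).
naive <- suppressWarnings(
  Reduce(function(a, b) merge(a, b, by = "go", all = TRUE), toy)
)
names(naive)

# Rename "x" to the gene name first, so every value column is unique.
renamed <- Map(function(df, gene) {
  names(df)[names(df) == "x"] <- gene
  df
}, toy, names(toy))

wide <- Reduce(function(a, b) merge(a, b, by = "go", all = TRUE), renamed)
names(wide)
```

With the rename in place, the merged column names are "go" followed by the list names, which is the shape of the desired output.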
I'm not sure why this is happening. The merge works fine for small subsets of the list. But for the 2584 dataframes, it seems to mess up somewhere. Can someone suggest a better way to do this, or give me an idea about why it's not merging the correct number of columns?
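One alternative that avoids thousands of successive merges is to stack everything into a long table and cross-tabulate it once. The following is a minimal base-R sketch, not a tested solution for the full list: `toy` is a hypothetical stand-in for genes2Vecs, and it assumes the list names are unique and that each GO id appears at most once per gene.

```r
# Toy list: first two rows of four genes from the output above.
toy <- list(
  A1BG  = data.frame(x = c(0.13793103, 0.05172414), go = c("GO:0005576", "GO:0005615")),
  A1CF  = data.frame(x = c(0.11111111, 0.06944444), go = c("GO:0005488", "GO:0097159")),
  AARS2 = data.frame(x = c(0.06930693, 0.27722772), go = c("GO:0000166", "GO:0005488")),
  AATK  = data.frame(x = c(0.10769231, 0.27692308), go = c("GO:0000166", "GO:0005488"))
)

# Stack into one long data frame, tagging each row with its gene (list name).
long <- do.call(rbind, Map(function(df, gene) {
  data.frame(gene = rep(gene, nrow(df)),
             go   = as.character(df$go),
             x    = df$x,
             stringsAsFactors = FALSE)
}, toy, names(toy)))

# Cross-tabulate: one row per GO id, one column per gene, NA where a gene
# has no value for that GO id. Fixing the factor levels keeps one column
# per list name, in list order.
gene_f <- factor(long$gene, levels = names(toy))
mat <- tapply(long$x, list(long$go, gene_f), function(v) v[1])

wide <- data.frame(go = rownames(mat), mat, row.names = NULL, check.names = FALSE)
wide
```

Since the column set comes from the factor levels rather than from join suffixes, `ncol(wide)` should equal `length(toy) + 1` under the assumptions above.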
Thanks