Merge three datasets with different sizes, column names

Question

I'm trying to merge three quite different datasets. I have tried the merge() function without success, because the samples differ in size and the column names differ as well.

Any suggestions how to tackle my problem?

Welcome to SO, James Andersson! See https://stackoverflow.com/q/1299871/3358272, https://stackoverflow.com/q/5706437/3358272 for good discussions about merging/joining data. Realize that we don't know what you have, what you have tried, nor what you need. If those are not clear, you will need to make this question *reproducible*, please read https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. — r2evans, Sep 22 '21 at 16:10
Please provide enough code so others can better understand or reproduce the problem. — Community, Sep 24 '21 at 09:11

ivan866 · Accepted Answer · 2021-09-22T16:26:55.387

-2

What you are looking for is a join operation, but you need to specify the columns to merge on in order to resolve ambiguities.

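library(data.table)
# Three example tables whose column names only partly overlap
dt1 <- data.table(col1 = 1:10, col2 = 5:14)
dt2 <- data.table(col1 = 11:20, col22 = LETTERS[1:10])
dt3 <- data.table(col2 = 5:14, col3 = 101:110)
# Inner call: full outer join of dt1 and dt2 on their shared column (col1)
# Outer call: inner join of that result with dt3 on col2
as.data.frame(merge(merge(dt1, dt2, all = TRUE), dt3, by = "col2"))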
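
If the key column is named differently in each table, as the question suggests, something along these lines may be closer to what is needed. This is only a sketch in base R: the data and the column names (`id`, `sample_id`, `subject`, `height`, `weight`, `age`) are made up for illustration, since the question does not show the actual datasets.

```r
# Hypothetical data: three tables of different sizes whose key columns
# have different names (id, sample_id, subject).
df1 <- data.frame(id = c(1, 2, 3, 4, 5), height = c(170, 165, 180, 175, 160))
df2 <- data.frame(sample_id = c(2, 3, 4, 6), weight = c(60, 72, 81, 55))
df3 <- data.frame(subject = c(1, 3, 6, 7, 8, 9), age = c(30, 41, 25, 52, 38, 47))

# by.x / by.y name the key column in the first and second table.
# all = TRUE keeps unmatched rows from both sides (full outer join)
# and fills the missing values with NA.
m12 <- merge(df1, df2, by.x = "id", by.y = "sample_id", all = TRUE)
merged <- merge(m12, df3, by.x = "id", by.y = "subject", all = TRUE)

print(merged)
```

The key column in the result takes its name from `by.x`, so it is called `id` after both merges. Use `all = FALSE` (the default) to keep only rows present in every table, or `all.x = TRUE` to keep every row of the first table.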

edited Sep 22 '21 at 16:26

answered Sep 22 '21 at 16:23


I can't imagine this will help the asker much. There's no explanation of what is happening and no reason given for using data.table. And although the question is too vague for the problem to be understood, this misses one of the few details that are provided: that the column names vary between the data sets. – Gregor Thomas Sep 22 '21 at 16:26
I'd strongly recommend encouraging OP to clarify the question before jumping in. – Gregor Thomas Sep 22 '21 at 16:27
@GregorThomas No, it doesn't miss the point about varying column names; look closer. Also, there is no point in ignoring the question just because it is unclear or incomplete; 'if we were aware of what we were doing right from the start, that wouldn't be called research' – ivan866 Sep 22 '21 at 16:28
I've looked closely and I see the name `col1` in common between one pair of data frames and `col2` in common between another. In your example, contrary to your text, specifying column names does nothing: you will get the same result whether or not you use `by = 'col2'`, because each merge has exactly one unambiguous column name in common. – Gregor Thomas Sep 22 '21 at 16:54
Appreciate the comments. Ivan, could you please explain the code you provided me with in more detail? – James Andersson Sep 22 '21 at 17:50
@JamesAndersson No need to explain anything, it just works; just adapt it to your task and use it. – ivan866 Sep 22 '21 at 20:14
