I have found two posts here that each partially solve my problem. The first is here and the second is here.

My situation is a little different. I have a list of data frames of different lengths that I want to join into one data frame by row name. If a row name is missing from a data frame, that data frame's column should contain NaN for that row.

For example, I have the following three data frames:

mylist[[1]] -> df1:

    num
a   1
b   1

mylist[[2]] -> df2:

    num
a   1
b   2
c   3
d   1

mylist[[3]] -> df3:

    num
c   1
d   1

What I want is the following data frame:

   num1   num2  num3
a  1      1     NaN
b  1      2     NaN
c  NaN    3     1
d  NaN    1     1

That is, the NaN values sit in the correct rows rather than at the bottom of the column, as they do in the first linked example. The data frames all have different lengths, unlike in the second linked example.

Guforu
  • Some of this is a little hard to follow, so a reproducible example with sample data that we can copy/paste to R would be helpful. – Hack-R Jul 04 '16 at 13:09
  • @zx8754, thank you very much, it works. I really didn't realize that the merge function adds a new column; I thought it added new rows to the existing DF. Thank you very much – Guforu Jul 04 '16 at 13:19

3 Answers

I would do it in two steps:

1) Add an id column containing the row names:

mylist <- lapply(mylist, function(x) transform(x, id = row.names(x)))

2) Merge all the data frames by the id column:

Reduce(function(...) merge(..., by = "id", all=TRUE), mylist)
#  id num.x num.y num
#1  a     1     1  NA
#2  b     1     2  NA
#3  c    NA     3   1
#4  d    NA     1   1

This approach is useful if you have many items in mylist, since you don't have to type each merge command manually.
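
If the num1, num2, num3 column names from the question are wanted instead of the num.x, num.y, num suffixes (which can collide once more than three data frames share a column name), one option is to rename each column before merging. This is a minimal sketch, assuming every data frame in mylist has a single column. The sample data and the names renamed and result are illustrative and not from the question:

```r
# Sample data mirroring the question (an assumption; the original mylist was not posted)
mylist <- list(
  data.frame(num = c(1, 1),       row.names = c("a", "b")),
  data.frame(num = c(1, 2, 3, 1), row.names = c("a", "b", "c", "d")),
  data.frame(num = c(1, 1),       row.names = c("c", "d"))
)

# Give each column a unique name (num1, num2, ...) and add the id column
renamed <- lapply(seq_along(mylist), function(i) {
  x <- mylist[[i]]
  names(x) <- paste0(names(x), i)
  x$id <- row.names(x)
  x
})

# Full outer join on id, then move id back into the row names
result <- Reduce(function(x, y) merge(x, y, by = "id", all = TRUE), renamed)
row.names(result) <- result$id
result$id <- NULL
result
```

Because the columns are renamed before the merge, merge() has no shared non-key columns and adds no suffixes.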

talat

We can merge by row.names with all = TRUE:

# dummy data
df1 <- read.table(text = "
  num
a   1
b   1")

df2 <- read.table(text = "
  num
a   1
b   2
c   3
d   1")

df3 <- read.table(text = "
  num
c   1
d   1")


merge(
  merge(df1, df2, by = "row.names", all = TRUE),
  df3, by.x = "Row.names", by.y = "row.names", all = TRUE)

#output
#   Row.names num.x num.y num
# 1         a     1     1  NA
# 2         b     1     2  NA
# 3         c    NA     3   1
# 4         d    NA     1   1
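
For a longer list, the nested call can be wrapped in a small helper and passed to Reduce. This is a rough sketch under the assumption that each data frame has a single column. The helper merge_by_rownames, the list dfs and the target names num1 to num3 are hypothetical and not part of the answer above:

```r
# Same sample values as above, built with data.frame() so the block runs on its own
df1 <- data.frame(num = c(1, 1),       row.names = c("a", "b"))
df2 <- data.frame(num = c(1, 2, 3, 1), row.names = c("a", "b", "c", "d"))
df3 <- data.frame(num = c(1, 1),       row.names = c("c", "d"))
dfs <- list(num1 = df1, num2 = df2, num3 = df3)

# Rename each single column to its list name so merge() adds no .x/.y suffixes
dfs <- Map(function(d, nm) setNames(d, nm), dfs, names(dfs))

# Merge two data frames on their row names and restore the row names afterwards
merge_by_rownames <- function(x, y) {
  m <- merge(x, y, by = "row.names", all = TRUE)
  row.names(m) <- as.character(m$Row.names)
  m$Row.names <- NULL
  m
}

wide <- Reduce(merge_by_rownames, dfs)
wide
```

Restoring the row names inside the helper keeps by = "row.names" valid for every later step, so no by.x / by.y bookkeeping is needed.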
zx8754

You can use merge to do this.

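# Sample data: the row names are stored in an explicit "label" column
df1 <- data.frame(label = c("a", "b"), num1 = c(1, 1))
df2 <- data.frame(label = c("a", "b", "c", "d"), num2 = c(1, 2, 3, 1))
df3 <- data.frame(label = c("c", "d"), num3 = c(1, 1))

# Full outer join of all three data frames on "label"
result <- merge(merge(df1, df2, by = "label", all = TRUE), df3, by = "label", all = TRUE)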

> result
  label num1 num2 num3
1     a    1    1   NA
2     b    1    2   NA
3     c   NA    3    1
4     d   NA    1    1
Anton