
I need to populate an empty data frame with values, matched on the first column (or, alternatively, on row names; either works for me in this case). Here are three objects:

set.seed(11)

empty_df=data.frame(cities=c("New York","London","Rome","Vienna","Amsterdam"),
                      col.a=rep(NA,5),
                      col.b=rep(NA,5),
                      col.c=rep(NA,5))

values=rnorm(4,0,1)
to_fill=data.frame(cities=c("New York","London","Vienna","Amsterdam"),
                      col.a=values)

desired_output=data.frame(cities=c("New York","London","Rome","Vienna","Amsterdam"),
                          col.a=c(values[1],values[2],NA,values[3],values[4]),
                          col.b=rep(NA,5),
                          col.c=rep(NA,5))

The first column (it could be converted to row names; a solution using either row names or the first column of city names is fine) contains some cities I would like to visit, and the other columns hold some unspecified values. The first object is the data frame I want to fill, and it prints as:

     cities col.a col.b col.c
1  New York    NA    NA    NA
2    London    NA    NA    NA
3      Rome    NA    NA    NA
4    Vienna    NA    NA    NA
5 Amsterdam    NA    NA    NA

The second is the object I want to put INTO the empty data frame; as you can see, it is missing one row (the one for "Rome"):

     cities       col.a
1  New York  0.55213218
2    London  0.98907729
3    Vienna  1.11703741
4 Amsterdam -0.04616725

So now I want to put this into the empty data frame, leaving NA in the row that does not match:

     cities       col.a col.b col.c
1  New York -0.62731870    NA    NA
2    London -1.80206612    NA    NA
3      Rome          NA    NA    NA
4    Vienna -1.73446286    NA    NA
5 Amsterdam -0.05709419    NA    NA

I tried the simplest merge solution, merge(empty_df,to_fill, by="cities"), which gives:

     cities col.a.x col.b col.c     col.a.y
1 Amsterdam      NA    NA    NA -0.05709419
2    London      NA    NA    NA -1.80206612
3  New York      NA    NA    NA -0.62731870
4    Vienna      NA    NA    NA -1.73446286

And when I tried desired_output$col.a=merge(empty_df,to_fill, by="cities"), an error occurred (replacement has 4 rows, data has 5). Is there a simple way to do this that can be put in a for loop or apply?

Alexandros

1 Answer

We can use match:

empty_df$col.a <- to_fill$col.a[match(empty_df$cities, to_fill$cities)]
empty_df
#     cities      col.a col.b col.c
#1  New York  1.5567564    NA    NA
#2    London -0.6969401    NA    NA
#3      Rome         NA    NA    NA
#4    Vienna  1.3336636    NA    NA
#5 Amsterdam  0.7329989    NA    NA

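This fills col.a of empty_df with the col.a values from to_fill by matching the cities in empty_df against the cities in to_fill. For each city in empty_df, match returns its position in to_fill$cities, or NA when the city is absent (here "Rome"), and indexing with NA yields NA, so unmatched rows stay NA.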
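
Since the question asks for something that fits in a for loop, here is a minimal sketch of the same match idea applied to every column the two data frames share. It assumes "cities" is the key column in both; the names idx and shared are only illustrative, and the seed value is arbitrary.

```r
# Rebuild the objects from the question (seed value is arbitrary).
set.seed(11)
cities <- c("New York", "London", "Rome", "Vienna", "Amsterdam")
empty_df <- data.frame(cities = cities, col.a = NA, col.b = NA, col.c = NA)
to_fill <- data.frame(cities = c("New York", "London", "Vienna", "Amsterdam"),
                      col.a = rnorm(4, 0, 1))

# Position in to_fill of each city in empty_df; NA where there is no match.
idx <- match(empty_df$cities, to_fill$cities)

# Columns present in both data frames, excluding the key column.
shared <- setdiff(intersect(names(empty_df), names(to_fill)), "cities")

# Fill each shared column; rows without a match stay NA.
for (col in shared) {
  empty_df[[col]] <- to_fill[[col]][idx]
}
empty_df
```

Computing idx once outside the loop keeps the row alignment identical for every column. If to_fill later gains col.b or col.c, the loop picks them up without changes.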

Maurits Evers