2

I have tried to find a solution via similar topics, but haven't found anything suitable. This may be due to the search terms I have used. If I have missed something, please accept my apologies

I have two dataframes un_1 and ets_1. They have already the same structure in terms of the columns. The difference between them is that they have different year ranges (un_1 = 1990:2016; ets_1 = 2005:2017) and some countries are also different.

What I want to do is to create a merged dataset energy where the data from both datasets will be filled in. This has to be understood as filling in the contents of un_1$UNemissions into energy and same with ets_1$ETSemissions. The column structure in energy will be the same as in the other two dataframes.

Here is an excerpt of the data:

un_1

 country iso2 year     sector  UNemissions ETSemissions
 Austria   AT 1990 1 - Energy  14025035.91          NaN
 Austria   AT 1991 1 - Energy  14791166.44          NaN
 Austria   AT 1992 1 - Energy  11581927.50          NaN
 Austria   AT 1993 1 - Energy  11623249.52          NaN
 Austria   AT 1994 1 - Energy  11915320.45          NaN
 Austria   AT 1995 1 - Energy  13044941.45          NaN
 Austria   AT 1996 1 - Energy  14048250.53          NaN
 Austria   AT 1997 1 - Energy  14077003.80          NaN
 Austria   AT 1998 1 - Energy  13106015.03          NaN
 Austria   AT 1999 1 - Energy  12548768.26          NaN
 Austria   AT 2000 1 - Energy  12263794.33          NaN
 Austria   AT 2001 1 - Energy  13770416.74          NaN
 Austria   AT 2002 1 - Energy  13380097.16          NaN
 Austria   AT 2003 1 - Energy  15965271.93          NaN
 Austria   AT 2004 1 - Energy  15899196.33          NaN
 Austria   AT 2005 1 - Energy  16194772.33          NaN
 Austria   AT 2006 1 - Energy  15039192.77          NaN
 Austria   AT 2007 1 - Energy  13757091.05          NaN
 Austria   AT 2008 1 - Energy  13582006.99          NaN
 Austria   AT 2009 1 - Energy  12526267.29          NaN
 Austria   AT 2010 1 - Energy  13852187.50          NaN
 Austria   AT 2011 1 - Energy  13666544.68          NaN
 Austria   AT 2012 1 - Energy  12256272.25          NaN
 Austria   AT 2013 1 - Energy  11224625.46          NaN
 Austria   AT 2014 1 - Energy   9499544.19          NaN
 Austria   AT 2015 1 - Energy  10623550.19          NaN
 Austria   AT 2016 1 - Energy  10448925.88          NaN
 Belgium   BE 1990 1 - Energy  29859360.87          NaN
 Belgium   BE 1991 1 - Energy  30491531.89          NaN
 Belgium   BE 1992 1 - Energy  29289874.38          NaN
 Belgium   BE 1993 1 - Energy  28769050.88          NaN
 Belgium   BE 1994 1 - Energy  29867955.59          NaN
 Belgium   BE 1995 1 - Energy  29386218.06          NaN
 Belgium   BE 1996 1 - Energy  28658131.35          NaN
 Belgium   BE 1997 1 - Energy  27609157.78          NaN
 Belgium   BE 1998 1 - Energy  30340887.77          NaN
 Belgium   BE 1999 1 - Energy  26555203.53          NaN
 Belgium   BE 2000 1 - Energy  28425730.95          NaN
 Belgium   BE 2001 1 - Energy  26382223.52          NaN
 Belgium   BE 2002 1 - Energy  27819402.95          NaN
 Belgium   BE 2003 1 - Energy  28954615.63          NaN
 Belgium   BE 2004 1 - Energy  29442709.72          NaN
 Belgium   BE 2005 1 - Energy  29246990.16          NaN
 Belgium   BE 2006 1 - Energy  28136794.10          NaN
 Belgium   BE 2007 1 - Energy  27435553.32          NaN
 Belgium   BE 2008 1 - Energy  25344134.83          NaN
 Belgium   BE 2009 1 - Energy  25744709.35          NaN
 Belgium   BE 2010 1 - Energy  26341043.76          NaN
 Belgium   BE 2011 1 - Energy  22921875.41          NaN
 Belgium   BE 2012 1 - Energy  22809482.09          NaN
 Belgium   BE 2013 1 - Energy  21242431.53          NaN
 Belgium   BE 2014 1 - Energy  20375966.00          NaN
 Belgium   BE 2015 1 - Energy  21091059.19          NaN
 Belgium   BE 2016 1 - Energy  19792162.61          NaN 

ets_1

 country iso2 year     sector UNemissions ETSemissions
 Austria   AT 2005 1 - Energy         NaN     16539659
 Austria   AT 2006 1 - Energy         NaN     15275065
 Austria   AT 2007 1 - Energy         NaN     14124646
 Austria   AT 2008 1 - Energy         NaN     14572511
 Austria   AT 2009 1 - Energy         NaN     12767555
 Austria   AT 2010 1 - Energy         NaN     15506112
 Austria   AT 2011 1 - Energy         NaN     15131551
 Austria   AT 2012 1 - Energy         NaN     13121434
 Austria   AT 2013 1 - Energy         NaN      8074514
 Austria   AT 2014 1 - Energy         NaN      6426135
 Austria   AT 2015 1 - Energy         NaN      7514263
 Austria   AT 2016 1 - Energy         NaN      7142937
 Austria   AT 2017 1 - Energy         NaN      7795277
 Belgium   BE 2005 1 - Energy         NaN     25460856
 Belgium   BE 2006 1 - Energy         NaN     24099282
 Belgium   BE 2007 1 - Energy         NaN     23706084
 Belgium   BE 2008 1 - Energy         NaN     23166180
 Belgium   BE 2009 1 - Energy         NaN     21185552
 Belgium   BE 2010 1 - Energy         NaN     22073616
 Belgium   BE 2011 1 - Energy         NaN     18950876
 Belgium   BE 2012 1 - Energy         NaN     17463388
 Belgium   BE 2013 1 - Energy         NaN     16728267
 Belgium   BE 2014 1 - Energy         NaN     15230243
 Belgium   BE 2015 1 - Energy         NaN     16053800
 Belgium   BE 2016 1 - Energy         NaN     15027777
 Belgium   BE 2017 1 - Energy         NaN     15093036

I have tried energy <- merge(un_1, ets_1), but this only creates a new dateframe with the 6 columns and zero observations.

I've also tried rbind, but this only adds the data from one dataframe to the bottom of the other one.

both emissions columns in both un_1$UNemissions and ets_1$ETSemissions are numeric.

How energy should look like (an example for one country should do)

 country iso2 year     sector  UNemissions ETSemissions
 Austria   AT 1990 1 - Energy  14025035.91          NaN
 Austria   AT 1991 1 - Energy  14791166.44          NaN
 Austria   AT 1992 1 - Energy  11581927.50          NaN
 Austria   AT 1993 1 - Energy  11623249.52          NaN
 Austria   AT 1994 1 - Energy  11915320.45          NaN
 Austria   AT 1995 1 - Energy  13044941.45          NaN
 Austria   AT 1996 1 - Energy  14048250.53          NaN
 Austria   AT 1997 1 - Energy  14077003.80          NaN
 Austria   AT 1998 1 - Energy  13106015.03          NaN
 Austria   AT 1999 1 - Energy  12548768.26          NaN
 Austria   AT 2000 1 - Energy  12263794.33          NaN
 Austria   AT 2001 1 - Energy  13770416.74          NaN
 Austria   AT 2002 1 - Energy  13380097.16          NaN
 Austria   AT 2003 1 - Energy  15965271.93          NaN
 Austria   AT 2004 1 - Energy  15899196.33          NaN
 Austria   AT 2005 1 - Energy  16194772.33          16539659
 Austria   AT 2006 1 - Energy  15039192.77          15275065
 Austria   AT 2007 1 - Energy  13757091.05          14124646
 Austria   AT 2008 1 - Energy  13582006.99          14572511
 Austria   AT 2009 1 - Energy  12526267.29          12767555
 Austria   AT 2010 1 - Energy  13852187.50          15506112
 Austria   AT 2011 1 - Energy  13666544.68          15131551
 Austria   AT 2012 1 - Energy  12256272.25          13121434
 Austria   AT 2013 1 - Energy  11224625.46          8074514
 Austria   AT 2014 1 - Energy   9499544.19          6426135
 Austria   AT 2015 1 - Energy  10623550.19          7514263
 Austria   AT 2016 1 - Energy  10448925.88          7142937
 Austria   AT 2017 1 - Energy         NaN           7795277  

Thank you very much for you help!!

Best,

Constantin

Constantin
  • 23
  • 1
  • 4
  • Possible duplicate of [How to join (merge) data frames (inner, outer, left, right)?](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) – Wimpel Dec 19 '18 at 17:56

2 Answers2

0

I would use the dplyr package for this:

library(dplyr)

energy <- un_1 %>% 
  select(-ETSemissions) %>% 
  full_join(ets_1 %>% 
              select(-UNemissions))

In the above, we take un_1 and remove the empty column ETSemissions with select. Next, we use full_join to combine with ets_1, but before we do, we remove UNemissions with select

Dave Gruenewald
  • 5,329
  • 1
  • 23
  • 35
  • Thank you for this. I like the approach especially due to the short code, but unfortunately it does not add the 2017 values which are only in `ets_1` – Constantin Dec 19 '18 at 18:35
0

Consider merging with a full join by indicator fields using by and all arguments of merge. Then, fill in missings for the emissions columns. All can be handled in the within() context:

# MERGE ON INDICATORS (FULL OUTER JOIN)
merge_df <- merge(un_1, ets_1, by=c("country", "iso2", "year", "sector"), 
                  all=TRUE, suffixes=c("", "_"))

final_df <- within(merge_df, {
                   # FILL IN MISSINGS WITH UNDERSCORE COLS
                   UNemissions <- ifelse(is.na(UNemissions), UNemissions_, UNemissions)
                   ETSemissions <- ifelse(is.na(ETSemissions), ETSemissions_, ETSemissions)

                   # REMOVE FILL-IN UNDERSCORE COLS
                   UNemissions_ <- NULL
                   ETSemissions_ <- NULL
                 })

Rextester Demo

Parfait
  • 104,375
  • 17
  • 94
  • 125