
I have a dictionary of pandas dataframes. Each frame contains timestamps and the market caps corresponding to those timestamps. The dictionary keys are:

coins = ['dashcoin','litecoin','dogecoin','nxt']

I would like to add a new key, 'merged', to the dictionary and use pd.merge to merge the 4 existing dataframes on their timestamp. I only want complete rows, so an 'inner' join is appropriate.

Sample of one of the data frames:

data2['nxt'].head()
Out[214]:
    timestamp   nxt_cap
0  2013-12-04  15091900
1  2013-12-05  14936300
2  2013-12-06  11237100
3  2013-12-07   7031430
4  2013-12-08   6292640

I'm currently getting a result using this code:

data2['merged'] = data2['dogecoin']

for coin in coins:
    data2['merged'] = pd.merge(left=data2['merged'],right=data2[coin], left_on='timestamp', right_on='timestamp')

but this repeats 'dogecoin' in 'merged', because the loop merges data2['dogecoin'] with itself. However, if data2['merged'] is not first set to data2['dogecoin'] (or some similar data), the merge fails because the 'merged' key does not exist yet.

EDIT: my desired result is a single merged dataframe, stored as a new element of the dictionary 'data2' (data2['merged']), that contains the merged dataframes from the other elements of data2.

David Hancock

2 Answers


Try replacing the general pd.merge() call with the dataframe's own merge method. You still have to seed 'merged' with a first dataframe, and then skip that one in the loop:

data2['merged'] = data2['dashcoin']

# Skip the first element: it already seeds 'merged'
for coin in coins[1:]:
    data2['merged'] = data2['merged'].merge(data2[coin], on='timestamp')
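
For reference, here is a minimal self-contained sketch of the same loop. The toy frames and their values are invented for illustration (only the column naming follows the question's nxt_cap sample), and how='inner' is spelled out even though it is already the default for merge:

```python
import pandas as pd

coins = ['dashcoin', 'litecoin', 'dogecoin', 'nxt']

# Hypothetical stand-in data: 'dogecoin' deliberately has one row fewer,
# so the inner join has an incomplete timestamp to drop.
dates = pd.to_datetime(['2013-12-04', '2013-12-05', '2013-12-06'])
rows_per_coin = {'dashcoin': 3, 'litecoin': 3, 'dogecoin': 2, 'nxt': 3}
data2 = {
    coin: pd.DataFrame({
        'timestamp': dates[:rows_per_coin[coin]],
        coin + '_cap': list(range(rows_per_coin[coin])),
    })
    for coin in coins
}

# Seed with the first frame, then merge in every remaining frame.
merged = data2[coins[0]]
for coin in coins[1:]:
    merged = merged.merge(data2[coin], on='timestamp', how='inner')

data2['merged'] = merged
print(data2['merged'])
```

Only the timestamps present in every frame survive, and each coin's cap column appears exactly once.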
Parfait

Since you've already made coins a list, why not just do something like this:

data2['merged'] = data2[coins[0]]
for coin in coins[1:]:
    data2['merged'] = pd.merge(....

Unless I'm misunderstanding, this question isn't specific to dataframes, it's just about how to write a loop when the first element has to be treated differently to the rest.
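
If you would rather not special-case the first element by hand, one option is functools.reduce, which uses the first item as the starting value when no initial value is passed. This is only a sketch: the toy frames and the helper name merge_on_timestamp are made up for illustration.

```python
from functools import reduce

import pandas as pd

coins = ['dashcoin', 'litecoin', 'dogecoin', 'nxt']

# Hypothetical stand-in data with the same shape as the question's frames.
dates = pd.to_datetime(['2013-12-04', '2013-12-05', '2013-12-06'])
data2 = {
    coin: pd.DataFrame({'timestamp': dates, coin + '_cap': [1, 2, 3]})
    for coin in coins
}


def merge_on_timestamp(left, right):
    """Inner-join two frames on their shared 'timestamp' column."""
    return pd.merge(left, right, on='timestamp', how='inner')


# reduce folds the frames pairwise, left to right, starting from the first one.
data2['merged'] = reduce(merge_on_timestamp, (data2[coin] for coin in coins))
print(data2['merged'].columns.tolist())
```

Because the generator iterates over the separate coins list rather than the dictionary keys, adding 'merged' to data2 afterwards does not interfere with the fold.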

user2428107