0

In the following table I have a list of dates. I am trying to index the second dataframe and extract the corresponding value in the first dataframe. I want to create column D.

I have referenced Pandas Merge returns NaN as well, and both columns are indeed the same type(str)

print(data)

A          B
1/1/17     15
1/2/17     16
1/3/17     14
1/4/17     15
1/5/17     16
1/6/17     14
1/7/17     15
1/8/17     16
1/9/17     14

print(lookup)

C         D
1/4/17    15
1/7/17    15
1/8/17    16

So basically I have the 'data' pandas dataframe, but I dont know how to create a column D in the 'lookup' dataframe to populate the two 15 and the 16 values. This is a smaller example of a bigger problem, Im trying to utilize a similar reference table to create a column in a much larger dataframe that pulls off the reference table. The dates in column C clearly already exist in the lookup dataframe as they are what im trying to use as a reference.

Thanks for your help !

Frank Drin
  • 1,613
  • 2
  • 13
  • 18
  • just for clarity: do you want find all the 'B' value that match 1/4/15 and 1/7/15? – enneppi Mar 01 '18 at 21:03
  • Do you want the year (I assume these are dates) to be ignored when matching? – rahlf23 Mar 01 '18 at 21:33
  • enneppi - Yes. I would like to create column D, which is filled with the corresponding date values from column B. There are also a number of other columns on each of these dataframes, and I need to preserve all of the columns in 'lookup'. And Rahlf23 - no, I need to match the exact dates, I wrote the wrong dates and have now corrected. Thanks – Frank Drin Mar 01 '18 at 21:33
  • Thanks for the clarification, answer below. – rahlf23 Mar 01 '18 at 21:42
  • @FrankDrin I understood the matter, see answer below – enneppi Mar 02 '18 at 11:30

3 Answers3

1

I think that what you need is just a simple merge of two dataframe:

lookup.merge(data, left_on='C', right_on='A', how='left')

That's all

enneppi
  • 1,029
  • 2
  • 15
  • 33
  • Thanks for the help enneppi, but when I do this, my Lookup frame just has a blank column added. Something is breaking with the 'merge', and im not sure what as im pretty sure my 'A' and 'C' columns are the same format. Any ideas ? Thanks – Frank Drin Mar 07 '18 at 19:31
0

This should do what you want:

lookup['D'] = [data.B[data.A[data.A==i].index[0]] for i in lookup.C.tolist()]

Explanation:

Creates a new column in lookup called 'D' that is the value of column 'B' in data that corresponds to when the value in column 'A' in data is equal to the current element being iterated on in the list of values from column 'C' in lookup

rahlf23
  • 8,869
  • 4
  • 24
  • 54
  • I tried this as well as jpp's answer below. For this answer I received the error 'IndexError: index 0 is out of bounds for axis 0 with size 0'. Thanks for all the help – Frank Drin Mar 07 '18 at 19:21
0

This is one straightforward method:

lookup['D'] = lookup['C'].map(data.set_index('A')['B'])

You can also reach this result via pd.merge, but the above is likely more efficient.

Explanation

  • data.set_index('A')['B'] creates a pd.Series mapping 'A' to 'B'. This works because the indices are unique.
  • pd.Series.map can take a series as an input.
jpp
  • 159,742
  • 34
  • 281
  • 339
  • Thanks so much for your help. Unfortunately this produces a column of NaN so the lookup is failing somehow. When I run type on a value for each, they are both so im not sure why its failing... – Frank Drin Mar 07 '18 at 19:14