New column for DataFrame based on another DataFrame

Question

I want to add the "text" column from the second DataFrame to the first DataFrame, matching each A value to the closest B value that is <= A. The two DataFrames have different lengths.

Example:

import random
import string
import numpy as np
import pandas as pd

a = np.array(range(10, 35, 5))
b = np.array(range(0, 30, 5)) + 2
b_text = [random.choice(string.ascii_letters) for i in range(len(b))]
df1 = pd.DataFrame(a, columns=['A'])
df2 = pd.DataFrame(list(zip(b, b_text)), columns=['B', 'text'])

score 0 · Accepted Answer · answered Jul 23 '18 at 09:07


I think you need merge_asof, which by default matches each A to the last B that is less than or equal to it:

# if the key columns have different dtypes, cast them first
#df1['A'] = df1['A'].astype(np.int64)
#df2['B'] = df2['B'].astype(np.int64)

df = pd.merge_asof(df1, df2, left_on='A', right_on='B')
print(df)
    A   B text
0  10   7    R
1  15  12    y
2  20  17    i
3  25  22    a
4  30  27    G
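
To make the matching rule easier to check, here is a minimal sketch with fixed letters instead of random ones (the `text` values are made up for illustration). It assumes both frames are sorted by their key columns, which `merge_asof` requires, and compares the default `direction='backward'` with `direction='nearest'`:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': np.arange(10, 35, 5)})        # 10, 15, 20, 25, 30
df2 = pd.DataFrame({'B': np.arange(0, 30, 5) + 2,      # 2, 7, 12, 17, 22, 27
                    'text': list('abcdef')})           # illustrative values

# merge_asof needs both key columns sorted
df1 = df1.sort_values('A')
df2 = df2.sort_values('B')

# default: last B that is <= A
backward = pd.merge_asof(df1, df2, left_on='A', right_on='B')

# alternative: B with the smallest absolute distance to A
nearest = pd.merge_asof(df1, df2, left_on='A', right_on='B',
                        direction='nearest')

print(backward)
print(nearest)
```

With `direction='backward'` the matched B values are 7, 12, 17, 22, 27, the same as in the output above. With `direction='nearest'` they become 12, 17, 22, 27, 27, because here the next B above each A is closer than the one below it.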

answered Jul 23 '18 at 09:07

jezrael


Your solution did exactly what I asked, but my question was wrong. I need to use more than one column with different conditions, and I want to find a more generic approach. Can you advise where I should look in the documentation? – typae Jul 23 '18 at 09:28
@typae - It depends on your functions. I think [`reindex`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.reindex.html) with the `method` parameter should help, or another option is to create a helper Series and use `map`, something like [this](https://stackoverflow.com/a/51415070/2901002) solution. – jezrael Jul 23 '18 at 09:33
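
The `reindex` idea from the comment above could look roughly like this. It is only a sketch for the single-column case from the question, with made-up `text` values and a hypothetical helper name `lookup`. It assumes the B values are unique and sorted, since `reindex` with a `method` needs a monotonic index:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': np.arange(10, 35, 5)})
df2 = pd.DataFrame({'B': np.arange(0, 30, 5) + 2,
                    'text': list('abcdef')})  # illustrative values

# helper Series: B values as the index, text as the values
lookup = df2.set_index('B')['text']

# method='ffill' takes, for each A, the entry at the last B label <= A
df1['text'] = lookup.reindex(df1['A'], method='ffill').to_numpy()

print(df1)
```

For this data it gives the same matches as the `merge_asof` call in the answer. Changing `method` to `'bfill'` or `'nearest'` changes the matching rule.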
