Create dataframe with np.column_stack when df of only 1 row

Question

I'm following on this thread and the accepted answer: Best way to join / merge by range in pandas

I'm applying the same logic dynamically on several dataframes to join 2 df based on some criterias and range values between the 2 df

A = pd.DataFrame(dict(
        A_id=range(10),
        A_value=range(5, 105, 10)
    ))
B = pd.DataFrame(dict(
        B_id=range(5),
        B_low=[0, 30, 30, 46, 84],
        B_high=[10, 40, 50, 54, 84]
    ))

A

   A_id  A_value
0     0        5
1     1       15
2     2       25
3     3       35
4     4       45
5     5       55
6     6       65
7     7       75
8     8       85
9     9       95

B

   B_high  B_id  B_low
0      10     0      0
1      40     1     30
2      50     2     30
3      54     3     46
4      84     4     84

a = A.A_value.values
bh = B.B_high.values
bl = B.B_low.values

i, j = np.where((a[:, None] >= bl) & (a[:, None] <= bh))

pd.DataFrame(
    np.column_stack([A.values[i], B.values[j]]),
    columns=A.columns.append(B.columns)

Now, sometimes, my A df only has one row, and I need to add to the valid rows of my B df the row in A df. I have no criteria to merge them because my A df has all its row (i.e 1) that matches between the 2 df.

I don't manage to create my output df with the same logic as in the thread.

How could I do to merge a df of 1 line with only the lines where the criteria in my np.where of another df are met ?

A simplified example, I have:

A = pd.DataFrame(dict(
        A_id=[0],
        A_value=123
    ))
B = pd.DataFrame(dict(
        B_id=range(5),
        B_low=[0, 30, 30, 46, 84],
        B_high=[10, 40, 50, 54, 84]
    ))

A

   A_id  A_value
0     0        123

B

   B_high  B_id  B_low
0      10     0      0
1      40     1     30
2      50     2     30
3      54     3     46
4      84     4     84

i = np.where(B['B_low'].values == 30)

i will give me an array with the rows in B where B_low equals to 30

Now that I have the valid lines, I'm trying to do this following the same logic but to no avail:

pd.DataFrame(np.column_stack([A.values[i], B.values),
        columns=A.columns.append(B.columns))

How could I do so ?

but how can I do it so that df with single line is broadcasted to all lines of my B df ? I'm struggling here with merge/join/concat — yeye, Jun 23 '19 at 15:35

score 0 · Answer 1 · answered Jun 23 '19 at 15:52

0

I've done it another way, adding a column A_value in my B df and putting the value @ A.at[0;A_Value]

Interested to understand how it could be done following the same logic as in the mentionned topic though

answered Jun 23 '19 at 15:52

yeye

503
4
19

Create dataframe with np.column_stack when df of only 1 row

1 Answers1