ValueError: operands could not be broadcast together with shapes (6,) (0,) when trying to select rows in dataframe based on condition

Question

I have a question that follows up on one of my earlier questions: Keep upper n rows of a pandas dataframe based on condition. Using the second suggested answer, my code looks like this:

import pandas as pd
x=2
y=2
df2 = pd.DataFrame({'x':[1,2,3,4], 'y':[2,2,2,2],'id':[1,2,3,4]})
df2_cut=df2[df2.index <= (df2[(df2.x == x) & (df2.y == y)]).index.tolist()]

This also works fine. It is the same code, but with a second dataframe added:

import pandas as pd
x=2
y=2
df1 = pd.DataFrame({'x':[2,2,2,2], 'y':[1,2,3,4],'id':[1,2,3,4]})
df2 = pd.DataFrame({'x':[1,2,3,4], 'y':[2,2,2,2],'id':[1,2,3,4]})
df1_cut=df1.iloc[((df1['x'] == x) & (df1['y'] == y)).values.argmax():]
df2_cut=df2[df2.index <= (df2[(df1.x == x) & (df2.y == y)]).index.tolist()]

Now what is weird and I do not understand is this:

import pandas as pd
x=2
y=2
df1 = pd.DataFrame({'x':[1,2,2,2], 'y':[1,2,3,4],'id':[1,2,3,4]})  # made a change here: first x value is now 1
df2 = pd.DataFrame({'x':[1,2,3,4], 'y':[2,2,2,2],'id':[1,2,3,4]})
df1_cut=df1.iloc[((df1['x'] == x) & (df1['y'] == y)).values.argmax():]
df2_cut=df2[df2.index <= (df2[(df1.x == x) & (df2.y == y)]).index.tolist()]

It is the same code as before, but I changed the first x value of df1 to 1. This change throws an error for df2, in the last line, where df2 is shortened. How can that be?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-121-72d9d78368b9> in <module>()
4 df1 = pd.DataFrame({'x':[1,1,1,2,6], 'y':[5,6,7,7,9],'id':[1,2,3,4,5]})
5 df2 = pd.DataFrame({'x':[1,1,4,3,1,1], 'y':[1,1,5,9,2,2],'id':[1,2,3,103,22,90]})
----> 6 df2_cut=df2[df2.index <= (df2[(df1.x == x) & (df2.y == y)]).index.tolist()]
7 df1_cut=df1.iloc[((df1['x'] == x) & (df1['y'] == y)).values.argmax():]

~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in _evaluate_compare(self, other)
3844                 else:
3845                     with np.errstate(all='ignore'):
-> 3846                         result = op(self.values, np.asarray(other))
3847 
3848                 # technically we could support bool dtyped Index

ValueError: operands could not be broadcast together with shapes (6,) (0,)

Does anyone have an idea how that happens, or another idea of how to solve my issue? I would be delighted to hear it.

Basically I have two dataframes with x-y coordinates and id as columns. Given a so-called intersection point with an x and a y value, I want to delete from dataframe 1 all rows below the row whose x column value equals x and whose y column value equals y. For dataframe 2 I want the same, but deleting the rows above the matching row. That way I get two dataframes (one ends with the given x-y coordinates, the other starts with them), which I would like to append/concat to each other to get a new dataframe. The dataframes can have duplicates, and there may be jumps, for example from coordinate (1,2) directly to (1,4) without the intermediate step (1,3).

Here is an example of what I would get for the above values:

df1_cut = pd.DataFrame({'x':[2,2], 'y':[1,2],'id':[1,2]})
df2_cut = pd.DataFrame({'x':[2,3,4], 'y':[2,2,2],'id':[2,3,4]})
df_list=[]
df_list.append(df1_cut)
df_list.append(df2_cut)
df_final = pd.concat(df_list, ignore_index=True)

I want it to work with any dataframes.

In `df2.index <= (df2[(df1.x == x) & (df2.y == y)]).index.tolist()` you're comparing `df2.index` to something that comes from a filtered dataframe `df2[(df1.x == x) & (df2.y == y)]`. The surprising part is probably that this _doesn't_ break in your first example dataframes. The former (`df2.index`) will have as many elements as the number of rows in `df2`, while `df2[(df1.x == x) & (df2.y == y)]` will have less than or equal number of rows (but in general, less). — Andras Deak -- Слава Україні, Nov 02 '18 at 18:09
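The length mismatch described in this comment can be checked directly. The snippet below is only a sketch using the data from the failing example in the question; it calls `np.less_equal` explicitly to stand in for the comparison pandas performs internally, because the exact pandas error message may differ between versions. The names `matches` and `error_message` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

x, y = 2, 2
df1 = pd.DataFrame({'x': [1, 2, 2, 2], 'y': [1, 2, 3, 4], 'id': [1, 2, 3, 4]})
df2 = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [2, 2, 2, 2], 'id': [1, 2, 3, 4]})

# Same mask as in the question: it mixes df1.x with df2.y,
# so the number of matching rows depends on both dataframes.
matches = df2[(df1.x == x) & (df2.y == y)].index.tolist()
print(len(df2.index), len(matches))

# An element-wise comparison needs shapes that broadcast:
# equal lengths, or one side of length 1. Here neither holds.
error_message = None
try:
    np.less_equal(df2.index.to_numpy(), np.asarray(matches))
except ValueError as err:
    error_message = str(err)
print(error_message)
```

With exactly one matching row the list has length 1 and broadcasts against the index, which is presumably why the first example did not fail. With zero matches the right-hand side has shape `(0,)`, as in the traceback.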
damn you are absolutely right! uff it is always sth. like that — Mauritius, Nov 02 '18 at 18:11
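For the underlying task in the question (keep df1 up to and including the intersection row, keep df2 from the intersection row onward, then concatenate), one possible approach is to work with positions instead of comparing the index to a list. This is a minimal sketch, not a tested general solution: the helper names `first_match_position` and `join_at_point` are hypothetical, and it assumes that when the point occurs more than once, the first occurrence in each dataframe is the one to cut at.

```python
import pandas as pd

def first_match_position(df, x, y):
    """Return the position of the first row with the given x and y, or None."""
    mask = ((df['x'] == x) & (df['y'] == y)).to_numpy()
    if not mask.any():
        return None  # argmax would return 0 here, which would be misleading
    return int(mask.argmax())

def join_at_point(df1, df2, x, y):
    """Keep df1 through the point and df2 from the point on, then concatenate."""
    p1 = first_match_position(df1, x, y)
    p2 = first_match_position(df2, x, y)
    if p1 is None or p2 is None:
        raise ValueError("point ({}, {}) not found in both dataframes".format(x, y))
    df1_cut = df1.iloc[:p1 + 1]   # rows up to and including the match
    df2_cut = df2.iloc[p2:]       # rows from the match onward
    return pd.concat([df1_cut, df2_cut], ignore_index=True)

df1 = pd.DataFrame({'x': [2, 2, 2, 2], 'y': [1, 2, 3, 4], 'id': [1, 2, 3, 4]})
df2 = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [2, 2, 2, 2], 'id': [1, 2, 3, 4]})
df_final = join_at_point(df1, df2, x=2, y=2)
print(df_final)
```

Because each mask is built from a single dataframe, its length always equals that dataframe's row count, so the shape mismatch cannot occur. The intersection row appears twice in the result (once from each dataframe), which matches the expected output given in the question.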
