I have a question referring to one of my previous questions: Keep upper n rows of a pandas dataframe based on condition. Using the second suggested answer, my code looks like this:
import pandas as pd
x=2
y=2
df2 = pd.DataFrame({'x':[1,2,3,4], 'y':[2,2,2,2],'id':[1,2,3,4]})
df2_cut=df2[df2.index <= (df2[(df2.x == x) & (df2.y == y)]).index.tolist()]
This works fine too; it is the same idea, but with a second dataframe added:
import pandas as pd
x=2
y=2
df1 = pd.DataFrame({'x':[2,2,2,2], 'y':[1,2,3,4],'id':[1,2,3,4]})
df2 = pd.DataFrame({'x':[1,2,3,4], 'y':[2,2,2,2],'id':[1,2,3,4]})
df1_cut=df1.iloc[((df1['x'] == x) & (df1['y'] == y)).values.argmax():]
df2_cut=df2[df2.index <= (df2[(df1.x == x) & (df2.y == y)]).index.tolist()]
Now what is weird, and what I do not understand, is this:
import pandas as pd
x=2
y=2 #made a change here
df1 = pd.DataFrame({'x':[1,2,2,2], 'y':[1,2,3,4],'id':[1,2,3,4]})
df2 = pd.DataFrame({'x':[1,2,3,4], 'y':[2,2,2,2],'id':[1,2,3,4]})
df1_cut=df1.iloc[((df1['x'] == x) & (df1['y'] == y)).values.argmax():]
df2_cut=df2[df2.index <= (df2[(df1.x == x) & (df2.y == y)]).index.tolist()]
It is the same code as before, but I changed the first x value of df1 to 1. This change raises an error for df2, in the last line where df2 is shortened. How can that be?
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-121-72d9d78368b9> in <module>()
4 df1 = pd.DataFrame({'x':[1,1,1,2,6], 'y':[5,6,7,7,9],'id':[1,2,3,4,5]})
5 df2 = pd.DataFrame({'x':[1,1,4,3,1,1], 'y':[1,1,5,9,2,2],'id':[1,2,3,103,22,90]})
----> 6 df2_cut=df2[df2.index <= (df2[(df1.x == x) & (df2.y == y)]).index.tolist()]
7 df1_cut=df1.iloc[((df1['x'] == x) & (df1['y'] == y)).values.argmax():]
~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in _evaluate_compare(self, other)
3844 else:
3845 with np.errstate(all='ignore'):
-> 3846 result = op(self.values, np.asarray(other))
3847
3848 # technically we could support bool dtyped Index
ValueError: operands could not be broadcast together with shapes (6,) (0,)
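A small sketch for narrowing this down: it counts how many index labels the inner selection returns for each variant of the code above. The helper name `matching_labels` and the variable names `df1_all_match` and `df1_changed` are made up for illustration and are not part of pandas or of my original code.

```python
import pandas as pd

x = 2
y = 2
df2 = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [2, 2, 2, 2], 'id': [1, 2, 3, 4]})

def matching_labels(df_for_x, df_for_y, x, y):
    # Hypothetical helper: index labels of df2 selected by the combined mask,
    # i.e. the list that ends up on the right-hand side of `df2.index <= ...`
    mask = (df_for_x.x == x) & (df_for_y.y == y)
    return df2[mask].index.tolist()

# df1 from the snippet that works (every x equals 2)
df1_all_match = pd.DataFrame({'x': [2, 2, 2, 2], 'y': [1, 2, 3, 4], 'id': [1, 2, 3, 4]})
# df1 from the snippet that fails (first x changed to 1)
df1_changed = pd.DataFrame({'x': [1, 2, 2, 2], 'y': [1, 2, 3, 4], 'id': [1, 2, 3, 4]})

labels_single = matching_labels(df2, df2, x, y)             # first snippet, df2 only
labels_all = matching_labels(df1_all_match, df2, x, y)      # second snippet
labels_changed = matching_labels(df1_changed, df2, x, y)    # third snippet

print(len(df2.index), labels_single, labels_all, labels_changed)
# prints: 4 [1] [0, 1, 2, 3] [1, 2, 3]
```

If these lists are what gets compared against `df2.index`, then only the lists of length 1 and length 4 line up with the four index entries, which might explain why the changed df1 fails. The shapes (6,) and (0,) in the traceback would then correspond to an empty list, i.e. no matching row at all.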
Does anyone have an idea how that happens, or another way to solve my issue? Basically, I have two dataframes with x and y coordinates and an id as columns. Given a so-called intersection point with an x and a y value, I want to delete from dataframe 1 all rows below the row where the x column equals x and the y column equals y. For dataframe 2 it is the same, except that the rows above the matching row should be deleted. That way I get two dataframes (one ends with the given x-y coordinates, the other starts with them), which I would like to append/concat to get a new dataframe. The dataframes can contain duplicates, and there may be jumps, for example from coordinate (1,2) directly to (1,4) without the intermediate step (1,3).
Here is an example of what I would expect for the values above:
df1_cut = pd.DataFrame({'x':[2,2], 'y':[1,2],'id':[1,2]})
df2_cut = pd.DataFrame({'x':[2,3,4], 'y':[2,2,2],'id':[2,3,4]})
df_list=[]
df_list.append(df1_cut)
df_list.append(df2_cut)
df_final = pd.concat(df_list, ignore_index=True)
I want it to work with any dataframes.
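To make the goal concrete, here is a minimal sketch of the behaviour described above, not a tested solution for every case. The function name `cut_and_join` is made up. It assumes that the first matching row is the cut point when the intersection point occurs more than once, and that raising an error is acceptable when the point is missing from either dataframe. It works on row positions rather than index labels, so it should not depend on how the dataframes are indexed.

```python
import pandas as pd

def cut_and_join(df1, df2, x, y):
    """Hypothetical helper: keep df1 up to and including its first (x, y) row,
    keep df2 from its first (x, y) row onwards, then concatenate both parts."""
    hit1 = ((df1['x'] == x) & (df1['y'] == y)).values
    hit2 = ((df2['x'] == x) & (df2['y'] == y)).values
    # Assumption: a missing intersection point is treated as an error
    if not hit1.any() or not hit2.any():
        raise ValueError('intersection point not found in both dataframes')
    pos1 = hit1.argmax()  # position of the first match in df1
    pos2 = hit2.argmax()  # position of the first match in df2
    df1_cut = df1.iloc[:pos1 + 1]  # rows above the match, plus the match
    df2_cut = df2.iloc[pos2:]      # the match, plus the rows below it
    return pd.concat([df1_cut, df2_cut], ignore_index=True)

df1 = pd.DataFrame({'x': [2, 2, 2, 2], 'y': [1, 2, 3, 4], 'id': [1, 2, 3, 4]})
df2 = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [2, 2, 2, 2], 'id': [1, 2, 3, 4]})
df_final = cut_and_join(df1, df2, x=2, y=2)
print(df_final)
```

For these inputs the result has five rows: the first two rows of df1 followed by the last three rows of df2, which matches the df1_cut and df2_cut shown in the example.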