I have a DataFrame containing two columns, x and y, that represent coordinates in a Cartesian system. I want to split the points into groups with an even (or almost even) number of points in each. I was thinking about using pd.qcut(), but as far as I can tell it can be applied to only one column at a time.
For example, I would like to divide the whole set of points into 4 intervals in x and 4 intervals in y (numbers might not be equal) so that each group holds a roughly even number of points. I expect to see 16 groups in total (4x4).
I tried a very direct approach, which obviously didn't produce the right result (compare the counts 51 and 99, for example). Here is the code:
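import pandas as pd

# df is the DataFrame with the x and y columns described above
df['x_bin'] = pd.qcut(df.x, 4)  # quartile bins of x over all points
df['y_bin'] = pd.qcut(df.y, 4)  # quartile bins of y over all points
grouped = df.groupby([df.x_bin, df.y_bin]).count()
print(grouped)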
The output:
x_bin                       y_bin
(7.976999999999999, 7.984]  (-219.17600000000002, -219.17]  51  51
                            (-219.17, -219.167]             60  60
                            (-219.167, -219.16]             64  64
                            (-219.16, -219.154]             99  99
(7.984, 7.986]              (-219.17600000000002, -219.17]  76  76
                            (-219.17, -219.167]             81  81
                            (-219.167, -219.16]             63  63
                            (-219.16, -219.154]             53  53
(7.986, 7.989]              (-219.17600000000002, -219.17]  78  78
                            (-219.17, -219.167]             77  77
                            (-219.167, -219.16]             68  68
                            (-219.16, -219.154]             51  51
(7.989, 7.993]              (-219.17600000000002, -219.17]  70  70
                            (-219.17, -219.167]             55  55
                            (-219.167, -219.16]             77  77
                            (-219.16, -219.154]             71  71
Am I mistaken in thinking this can be done with pandas alone, or am I missing something else?
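
One direction that might work, shown here only as a minimal sketch: bin x with pd.qcut() first, then compute the y quantiles separately inside each x bin instead of over the whole data set. Quantiles taken independently on each axis only balance each axis on its own, so the 16 combined groups can still be uneven when x and y are related. The data below is synthetic and correlated on purpose (an assumption, not the real coordinates), and labels=False is used so the bins come back as integer codes.

```python
import numpy as np
import pandas as pd

# Synthetic, correlated stand-in for the real coordinates (an assumption).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
df = pd.DataFrame({"x": x, "y": 0.8 * x + rng.normal(scale=0.5, size=1000)})

# Step 1: quartile bins of x over all points (integer codes 0..3).
df["x_bin"] = pd.qcut(df["x"], 4, labels=False)

# Step 2: quartile bins of y computed separately inside each x bin,
# so the y edges differ from one x bin to the next.
df["y_bin"] = (
    df.groupby("x_bin")["y"]
    .transform(lambda s: pd.qcut(s, 4, labels=False))
    .astype(int)
)

# Number of points in each of the 4x4 groups.
counts = df.groupby(["x_bin", "y_bin"]).size()
print(counts)
```

The trade-off is that the y intervals no longer line up across x bins, so the result is a set of 16 groups of nearly equal size rather than a regular rectangular grid. Heavily tied values could also make pd.qcut() fail on duplicate bin edges, which this sketch does not handle.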