Suppose I have a dataframe, df, with a class label S (one of two classes), a pair of coordinates X and Y for each observation, and a value V that was measured at those coordinates.

For example, the dataframe looks like this:

S X Y V
0 3 3 1
0 4 3 2
1 6 0 1
1 3 3 8
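
For reproducibility, the example can be built as follows. This is a minimal sketch that assumes pandas is available and imported as pd; the integer column types are also an assumption.

```python
import pandas as pd

# Example data from the table above: class S, coordinates X and Y, measured value V.
df = pd.DataFrame(
    {
        "S": [0, 0, 1, 1],
        "X": [3, 4, 6, 3],
        "Y": [3, 3, 0, 3],
        "V": [1, 2, 1, 8],
    }
)
print(df)
```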

I would like to know which commands let me regroup the X and Y coordinates into a coarser binning, separately for each class S. In the new binning, the value of V should be the sum of all values that fall into the same bin and belong to the same class S.

For example, suppose X and Y each run from 0 to 10 in the original binning. I would like to rebin each axis into two bins, labelled 0 and 1. Each axis is binned independently, which means:

  • Values with 0 <= X <= 5 (and likewise 0 <= Y <= 5) in the old binning take the new value 0;
  • Values with 5 < X <= 10 (and likewise 5 < Y <= 10) in the old binning take the new value 1.

Edit:

As a further example, consider the dataframe df:

  1. Row 1 has X = 3 and Y = 3. Since 0 <= X <= 5 and 0 <= Y <= 5, it falls into bin (0,0).
  2. Row 2 has X = 4 and Y = 3. Since 0 <= X <= 5 and 0 <= Y <= 5, it also falls into bin (0,0).
  3. Since Rows 1 and 2 fall into the same bin and have the same class S, their values of V are added. This gives a combined row with X = 0, Y = 0, V = 1 + 2 = 3.
  4. Row 3 has X = 6 and Y = 0. Since 5 < X <= 10 and 0 <= Y <= 5, it falls into bin (1,0).
  5. Row 4 has X = 3 and Y = 3. Since 0 <= X <= 5 and 0 <= Y <= 5, it falls into bin (0,0). However, since it is of class S = 1, it is not added to Rows 1 and 2, because values are only summed within the same class.

The output should then be:

S X Y V
0 0 0 3
1 1 0 1
1 0 0 8

What commands must I use to achieve this?

Jack Rolph

1 Answer

This should do the trick:

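# Map each coordinate to its new bin label: values <= 5 go to bin 0, values > 5 go to bin 1.
# df is the dataframe from the question; its X and Y columns are overwritten in place.
df.loc[df['X'] <= 5, 'X'] = 0
df.loc[df['X'] > 5, 'X'] = 1
df.loc[df['Y'] <= 5, 'Y'] = 0
df.loc[df['Y'] > 5, 'Y'] = 1

# Sum V over rows that share the same class S and the same new bin (X, Y).
df = df.groupby(['S', 'X', 'Y']).sum().reset_index()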
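
If more than two bins are needed, the same idea can be written with pd.cut. This is only a sketch under assumptions not stated in the question: the bin edges are taken to be 0, 5 and 10, the lower edge 0 is included in the first bin, and the names edges, labels, binned and result are illustrative.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame(
    {"S": [0, 0, 1, 1], "X": [3, 4, 6, 3], "Y": [3, 3, 0, 3], "V": [1, 2, 1, 8]}
)

edges = [0, 5, 10]  # assumed bin edges: [0, 5] -> 0 and (5, 10] -> 1
labels = [0, 1]     # one label per bin

binned = df.copy()  # leave the original dataframe untouched
for col in ["X", "Y"]:
    # include_lowest=True puts the value 0 into the first bin.
    # Casting back to int avoids grouping on a categorical column.
    binned[col] = pd.cut(
        binned[col], bins=edges, labels=labels, include_lowest=True
    ).astype(int)

# Sum V within each (class, new bin) combination.
result = binned.groupby(["S", "X", "Y"], as_index=False)["V"].sum()
print(result)
```

For the example data this should give the same three rows as the approach above. Adding more entries to edges and labels extends it to finer binnings.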

For your example the output is:

   S  X  Y  V
0  0  0  0  3
1  1  0  0  8
2  1  1  0  1
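
If the original coordinates should be kept, a non-mutating variant is possible. The sketch below assumes the same two-bin split at 5 and uses a boolean comparison cast to int as the bin label; result is an illustrative name.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame(
    {"S": [0, 0, 1, 1], "X": [3, 4, 6, 3], "Y": [3, 3, 0, 3], "V": [1, 2, 1, 8]}
)

result = (
    # (value > 5) is False/True, which becomes bin label 0/1 after the cast.
    df.assign(
        X=(df["X"] > 5).astype(int),
        Y=(df["Y"] > 5).astype(int),
    )
    # Sum V within each (class, new bin) combination.
    .groupby(["S", "X", "Y"], as_index=False)["V"]
    .sum()
)
print(result)
```

Because assign returns a new dataframe, df still holds the original X and Y values afterwards.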
