0

Given approximate Dataframe

df = pd.DataFrame({
    'Name': ['Max', 'Stefan', 'John', 'Kate', 'Walter', 'Karin', 'Julia', 'Ben', 'Spencer'],
    'Balance': [0.12, 0.03, 0.12, 0.12, 0.12, 0.03, 0.03, 0.06, 0.03]})

      Name  Balance
0      Max     0.12
1   Stefan     0.03
2     John     0.12
3     Kate     0.12
4   Walter     0.12
5    Karin     0.03
6    Julia     0.03
7      Ben     0.06
8  Spencer     0.03

The condition of the sorting is to create mini-groups by 3 rows. These rows would be: first(max value of the column), second(the value which is less than max val and bigger than min val), third(min value of the column). But the nuance is that this value(second) can be equal to max value or equal to min value or just max > average > min. Also, it's important to note that all values can be equal. So there are four different outcomes are possible.

I need to sort this Dataframe in such a way:

      Name  Balance
0      Max     0.12
1      Ben     0.06
2    Julia     0.03
3     Kate     0.12
4   Walter     0.12
5    Karin     0.03
6     John     0.12
7   Stefan     0.03
8  Spencer     0.03
Anton
  • 3
  • 2
  • That's not really sorting if you're changing the data. Or is it an mistake that the output has one of the 0.03's replaced with a 0.12? You can [edit] to fix it or clarify. – wjandrea Mar 29 '23 at 00:28
  • BTW, welcome to Stack Overflow! Check out the [tour], and [How to ask a good question](/help/how-to-ask) if you want tips. You might also want to read [How to make good reproducible pandas examples](/q/20109391/4518341). – wjandrea Mar 29 '23 at 00:28
  • Oh, I just noticed, the code also doesn't match the output – wjandrea Mar 29 '23 at 00:33
  • 1
    @wjandrea sorry, my bad. Now it's correct – Anton Mar 29 '23 at 00:34
  • Why did John, Kate, and Walter change positions like that? – wjandrea Mar 29 '23 at 00:41

4 Answers4

1

You can sort by balance first, make an additional column for the order and use it to reorder the dataframe:

df.sort_values('Balance', ascending=False, inplace=True)

df['Order'] = sum((list(range(i, len(df), 3)) for i in range(3)), [])

df.sort_values('Order', inplace=True)

del df['Order']

Result:

      Name  Balance
0      Max     0.12
4   Walter     0.12
5    Karin     0.03
2     John     0.12
7      Ben     0.06
6    Julia     0.03
3     Kate     0.12
1   Stefan     0.03
8  Spencer     0.03
Michael Butscher
  • 10,028
  • 4
  • 24
  • 25
0

There's too much nuance in your question to answer it properly, but this may give you something similar:

max_mask = df.Balance.eq(df.Balance.max())
min_mask = df.Balance.eq(df.Balance.min())
max_vals = df[max_mask]
min_vals = df[min_mask]
other = df[~max_mask & ~min_mask].sort_values(by='Balance', ascending=False)

output = pd.concat([
    out.reset_index(drop=True) for out in (max_vals, other, min_vals)
]).sort_index()

print(output)

Output:

      Name  Balance
0      Max     0.12
0      Ben     0.06
0   Stefan     0.03
1     John     0.12
1    Karin     0.03
2     Kate     0.12
2    Julia     0.03
3   Walter     0.12
3  Spencer     0.03
BeRT2me
  • 12,699
  • 2
  • 13
  • 31
0

This algorithm consider the first 3-rows group ([0, 1, 2]) then sorts the data inside, thus the second 3-rows group ([3, 4, 5]) and sorts the data, and so on... .

Ascending order is set to False.

Generalization is possible with N parameter.

Proposed script

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Max', 'Stefan', 'John', 'Kate', 'Walter', 'Karin', 'Julia', 'Ben', 'Spencer'],
    'Balance': [0.12, 0.03, 0.12, 0.12, 0.12, 0.03, 0.03, 0.06, 0.03]})

N=3
r = (df.groupby(np.arange(len(df.index))//N, axis=0)
       .apply(lambda g: g.sort_values(by='Balance', ascending=False))
       .reset_index(drop=True)
       )

print(r)

Result

      Name  Balance
0      Max     0.12
1     John     0.12
2   Stefan     0.03
3     Kate     0.12
4   Walter     0.12
5    Karin     0.03
6      Ben     0.06
7    Julia     0.03
8  Spencer     0.03
Laurent B.
  • 1,653
  • 1
  • 7
  • 16
0

try this:

df = pd.DataFrame({
    'Name': ['Max', 'Stefan', 'John', 'Kate', 'Walter', 'Karin', 'Julia', 'Ben', 'Spencer'],
    'Balance': [0.12, 0.03, 0.12, 0.12, 0.12, 0.03, 0.03, 0.06, 0.03]})

groups = []
while not df.empty:
    max_idx, min_idx = df['Balance'].idxmax(), df['Balance'].idxmin()
    if (between_min_max := df['Balance'].between(
        *df['Balance'].loc[min_idx, max_idx], inclusive='neither')).any():

        mid_idx = between_min_max.idxmax()

    else:
        mid_idx = df['Balance'].drop(index=[max_idx, min_idx]).idxmax()
    
    group = df.loc[[max_idx, mid_idx, min_idx]]    
    data.append(group)    
    df.drop(index=group.index, inplace=True)

result = pd.concat(groups)
print(result)
>>>
      Name  Balance
0      Max     0.12
7      Ben     0.06
1   Stefan     0.03
2     John     0.12
3     Kate     0.12
5    Karin     0.03
4   Walter     0.12
8  Spencer     0.03
6    Julia     0.03
ziying35
  • 1,190
  • 3
  • 6