The questions suggested below don't solve my problem, because I want to add an ordering based on rules, and the suggested question doesn't address that. This question is not a duplicate.

I have a DataFrame and I need to add a new column with the order number of each value. I was able to do that, but I wonder:

1- Is there a more correct/elegant way to do this?

Also, is it possible:

2- to give equal values the same order number? For example, in my case the second and third rows have the same value; is it possible to assign 2 to both of them?

3- to set a rule for defining the order? For example, if the difference between consecutive rows is less than 0.5, they should be assigned the same order number; if it is more, the order number should increase.

Thank you in advance!

import numpy as np
import pandas as pd

np.random.seed(42)
df2 = pd.DataFrame(np.random.randint(1, 10, 10), columns=['numbers'])
df2 = df2.sort_values('numbers')
df2['ord'] = 1 + np.arange(len(df2))

Zaur
    Does this answer your question? [How to add a new column to an existing DataFrame?](https://stackoverflow.com/questions/12555323/how-to-add-a-new-column-to-an-existing-dataframe) – Robino Jun 26 '22 at 21:12

3 Answers

If you want to give identical values in "numbers" the same order number, use groupby.ngroup:

df2['ord'] = df2.groupby('numbers').ngroup().add(1)

Output:

   numbers  ord
5        3    1
1        4    2
9        4    2
3        5    3
8        5    3
0        7    4
4        7    4
6        7    4
2        8    5
7        8    5
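
For comparison, a dense rank appears to give the same numbering without sorting or grouping first. This is a minimal sketch and not the method used above; the frame name `df` is a placeholder, and the values are the ten numbers shown in the output above, written out explicitly in their original index order.

```python
import pandas as pd

# The ten values from the output above, in original index order (0..9).
df = pd.DataFrame({"numbers": [7, 4, 8, 5, 7, 3, 7, 8, 5, 4]})

# Dense ranking: equal values share a rank and ranks increase by 1 with no gaps.
# rank() returns floats, so cast to int to get a plain order number.
df["ord"] = df["numbers"].rank(method="dense").astype(int)

print(df.sort_values("numbers"))
```

Because the rank is computed from the values themselves, the rows do not have to be sorted beforehand; sorting is only done here for display.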

Grouping with a threshold (a new group starts whenever the difference from the previous row is greater than 1):

grouper = df2['numbers'].diff().gt(1).cumsum()
df2['ord_threshold'] = df2.groupby(grouper).ngroup().add(1)

Output:

   numbers  ord  ord_threshold
5        3    1              1
1        4    2              1
9        4    2              1
3        5    3              1
8        5    3              1
0        7    4              2
4        7    4              2
6        7    4              2
2        8    5              2
7        8    5              2
mozway
  • Thanks! Could you suggest a way of setting rules? For example, if the values are floats, could I somehow set a rule before ordering? E.g. if the difference is less than 1, the rows should be given the same order number; if more, the number should increase by 1. – Zaur Jun 26 '22 at 21:17
  • If I understand correctly, you can then use `grouper = df2['numbers'].diff().gt(1).cumsum()` as the grouper: `df2.groupby(grouper).ngroup().add(1)` – mozway Jun 26 '22 at 21:19
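
The threshold rule discussed in these comments can be sketched for float values as follows. This is an illustration under stated assumptions, not a definitive implementation: the data and the names `df`, `threshold` and `new_group` are hypothetical, and a gap of exactly 0.5 is assumed to start a new order number (use `.gt` instead of `.ge` if it should not).

```python
import pandas as pd

# Hypothetical float data, chosen only for illustration.
df = pd.DataFrame({"numbers": [1.0, 1.5, 3.0, 1.25, 3.25, 5.0]})

threshold = 0.5  # assumed rule: a gap of at least 0.5 starts a new order number

# The rule compares neighbouring values, so the rows must be sorted first.
df = df.sort_values("numbers")

# diff() is NaN on the first row; NaN >= threshold is False,
# so the first row never starts a new group on its own.
new_group = df["numbers"].diff().ge(threshold)

# Each True bumps the running total, which becomes the order number.
df["ord"] = new_group.cumsum().add(1)

print(df)
```

Note that the comparison is between consecutive rows only, so a chain of small steps can end up in one group even when its first and last values differ by the threshold or more (here 1.0 and 1.5 share order number 1).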

You can also do this by resetting the index:

import numpy as np
import pandas as pd

np.random.seed(42)
df2 = pd.DataFrame(np.random.randint(1, 10, 10), columns=['numbers'])
df2 = df2.sort_values('numbers').reset_index(drop=True)
# move the new 0-based index into a column named 'index'
df2.reset_index(inplace=True)
# store the index value (+1) in the 'ord' column
df2['ord'] = df2['index'] + 1
# drop the temporary 'index' column
df2.drop(columns='index', inplace=True)

print(df2)

Result:

   numbers  ord
0        3    1
1        4    2
2        4    3
3        5    4
4        5    5
5        7    6
6        7    7
7        7    8
8        8    9
9        8   10
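
A sequential numbering like the one in this result can probably also be obtained without touching the index, by ranking with ties broken in order of appearance. The sketch below is one possible variant, not the approach shown above; `df` is a placeholder name and the values are the same ten numbers in their original (unsorted) order.

```python
import pandas as pd

# The same ten values in their original, unsorted order.
df = pd.DataFrame({"numbers": [7, 4, 8, 5, 7, 3, 7, 8, 5, 4]})

# method="first" gives every row a distinct rank; ties are broken by
# the order in which the rows appear in the frame.
df["ord"] = df["numbers"].rank(method="first").astype(int)

# Sorting by the new column reproduces the 1..10 ordering.
df = df.sort_values("ord")

print(df)
```

Unlike the reset-index version, this keeps the original index labels, which can be useful when the rows have to be matched back to other data.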
Renaud

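Let us try factorize, which numbers the distinct values in order of first appearance (so it matches the value order here because df2 is already sorted):

df2['ord'] = df2['numbers'].factorize()[0] + 1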
BENY
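
One caveat with factorize is that its codes follow the order of first appearance, so on unsorted data they do not reflect the value order. The sketch below contrasts the default with `sort=True`; the names `df`, `codes_by_appearance` and `codes_by_value` are placeholders, and the values are the same ten numbers in their original order.

```python
import pandas as pd

# The same ten values in their original, unsorted order.
df = pd.DataFrame({"numbers": [7, 4, 8, 5, 7, 3, 7, 8, 5, 4]})

# Default: codes follow order of first appearance (7 -> 0, 4 -> 1, 8 -> 2, ...).
codes_by_appearance, _ = pd.factorize(df["numbers"])

# sort=True: the unique values are sorted first, so codes follow value order.
codes_by_value, uniques = pd.factorize(df["numbers"], sort=True)
df["ord"] = codes_by_value + 1

print(df.sort_values("numbers"))
```

With `sort=True` the order numbers should not depend on how the rows happen to be arranged, so the prior `sort_values` call becomes optional.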