Dataframe > column > incorrect correlative ID

Question

import pandas as pd

file = "test.xlsx"
df = pd.read_excel(file,sheet_name="books")

for id in df['books_id']:
   print(id)

Current values

books_id
8
4
5
7
1

Expected result

books_id
1
2
3
4
5

Is there a way to do this by using df.sort_values(by='books_id', key=lambda col: ?????) or a new method is needed to do this fix sorting?

The logic is as follows: The book ID will always need to start from 1 to N. and I need to fix the previous values so they are correlative from 1,2,3,4,5,6.

if there was 1,3,5. The list should be fixed as 1,2,3.

I want to update the books_id number but not from scratch.

if the ids are 1,3,4 I need to keep 1 and update 3 to 2 and 4 to 3. making it 1,2,3 as final result.

data = {'books_id': [1, 3, 4, 5], 'Title': ['title one', 'title x', 'Title Y', 'Title Z']}  
  
df = pd.DataFrame(data)  

RESULT
     books_id  Title
    1   title one
    3   titile x
    4   Title Y
    5   Title Z

EXPECTED RESULT:

     books_id  Title
    1   title one
    2   Title Z
    3   titile x
    4   Title Y

`df.sort_values('books_id')` will indeed sort whole df by `books_id`. Is that what you're looking for or something else? — Maria K, Jul 12 '23 at 19:15
This is not sorting. The values change (7/8 and 2/3). What exactly is the logic? — mozway, Jul 12 '23 at 19:19
yeah Its more than sorting, Im trying to find a way to fix the correlative numbering. I added more context and the logic. — sys73r, Jul 12 '23 at 19:28
so do you want to re-index the books_id column, or sort the column. The difference being that sorting simply reorders the rows based on value in books_id colu7mn, while re-indexing would renumber the rows as they stand? — itprorh66, Jul 12 '23 at 19:32
`df['books_id'] = range(len(df))`? Is that what you're looking for? Maybe you want to also sort first? Please make a [mre] including some other columns and show how you want to rearrange them (or not). For specifics see [How to make good reproducible pandas examples](/q/20109391/4518341). Also please write a better title; see [ask]. — wjandrea, Jul 12 '23 at 19:42
Forgot to mention, with the output you're showing here, it looks like the column header got into the data. — wjandrea, Jul 12 '23 at 19:43

score 0 · Answer 1 · answered Jul 12 '23 at 19:40

0

If you are only trying to rename the Ids given by the excel sheet (without sorting), then I think this is what you're trying to do

df = df.reset_index()
df = df.rename(columns={"index":"books_id"})
df['books_id'] = df.index + 1

answered Jul 12 '23 at 19:40

David Sidarous

1,202
1
10
25

I edited the question adding another panda for testing. I want to update the books_id number but not from scratch. if the ids are 1,3,4 I need to keep 1 and update 3 to 2 and 4 to 3. making it 1,2,3 as final result. – sys73r Jul 12 '23 at 19:56
Doesn't the provided code fix your problem? – David Sidarous Jul 12 '23 at 20:19
I tested it and it didn't do any changes. – sys73r Jul 13 '23 at 23:49

Dataframe > column > incorrect correlative ID

1 Answers1