
I am working on a DataFrame with multiple columns. One of the columns has more than 1000 rows that contain string values. Please see the table below for details:

enter image description here

In the image above, I want to replace the string values in the Group_Number column with numbers, by taking the value from the first column (MasterGroup) and appending a number that increments by one (01), so that the values look like this:

enter image description here

I also need duplicates to be handled: if a string repeats, it should be replaced with the number already assigned to it rather than a new one. For example, in the image above ANAYSIM appears more than once, and every occurrence should get the same number instead of a new sequence number.

I have checked several related questions, but they focus on replacing values with ones supplied by the user:

Pandas DataFrame: replace all values in a column, based on condition
Change one value based on another value in pandas
Conditional Replace Pandas

Any help with achieving the desired outcome is highly appreciated.

Baig
  • Please do not share information as images unless absolutely necessary. See: https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors. – AMC Feb 04 '20 at 02:13
  • Here are the relevant resources I mentioned below: [research effort](https://meta.stackoverflow.com/questions/261592/how-much-research-effort-is-expected-of-stack-overflow-users), [ask]. – AMC Feb 04 '20 at 02:53

1 Answer


We could use cumcount with groupby:

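import pandas as pd
s=(df.groupby('MasterGroup').cumcount()+1).mul(10).astype(str)  # running count per MasterGroup, times 10, as str
t=pd.to_numeric(df.Group_number, errors='coerce')  # NaN wherever the value is not numeric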

Then we assign

df.loc[t.isnull(), 'Group_number']=df.MasterGroup.astype(str)+s
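
As a rough sketch of the behaviour asked for in the question (this is not the code from the answer above, just one possible approach): number only the distinct strings within each MasterGroup, zero-pad the counter to two digits, and map that number back onto every occurrence of the string. The function name `number_string_groups` and the sample rows are hypothetical; only ANAYSIM comes from the question. The sketch assumes the column is called `Group_Number` as in the question, that numbering starts at 01 within each MasterGroup, and that a collision with an existing number in the same group does not need to be checked.

```python
import pandas as pd


def number_string_groups(df, master_col="MasterGroup", group_col="Group_Number"):
    """Return a copy of df where non-numeric values in group_col are replaced
    by <master value><2-digit sequence>; repeated strings share one number."""
    out = df.copy()

    # Rows to replace: present, but not parseable as a number.
    numeric = pd.to_numeric(out[group_col], errors="coerce")
    is_text = numeric.isna() & out[group_col].notna()
    text_rows = out.loc[is_text, [master_col, group_col]]

    # Number each distinct (master, string) pair once, in order of first
    # appearance within its master group: 1, 2, 3, ...
    uniq = text_rows.drop_duplicates()
    uniq = uniq.assign(seq=uniq.groupby(master_col).cumcount() + 1)

    # A left merge keeps the row order of text_rows, so the sequence numbers
    # line up positionally with the rows being replaced.
    merged = text_rows.merge(uniq, on=[master_col, group_col], how="left")
    prefix = text_rows[master_col].astype(str).reset_index(drop=True)
    new_values = prefix + merged["seq"].astype(str).str.zfill(2)

    # Assign positionally; rows that were already numeric are left untouched.
    out.loc[is_text, group_col] = new_values.to_numpy()
    return out


# Hypothetical sample data (only ANAYSIM appears in the question).
df = pd.DataFrame({
    "MasterGroup": [29, 29, 80, 80, 80, 80],
    "Group_Number": [2901, 2902, "ANAYSIM", "BRAVO", "ANAYSIM", "CHARLIE"],
})
result = number_string_groups(df)
print(result["Group_Number"].tolist())
# [2901, 2902, '8001', '8002', '8001', '8003']
```

The replaced values come out as strings here; if the whole column should be numeric afterwards, something like `result["Group_Number"].astype(int)` could be applied as a final step.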
BENY
  • Hi Yoben, thanks for the quick reply. I am getting an error message (TypeError: can only concatenate str (not "int") to str) on the line: df.loc[t.isnull(), 'Group_number']=df.MasterGroup*100+s – Baig Feb 04 '20 at 00:06
  • Yoben, This is changing the existing numeric values as well, which I do not want to change. – Baig Feb 04 '20 at 00:38
  • @Baig It should not change your original numbers to NaN; to_numeric only returns NaN when the input cannot be converted to a number. – BENY Feb 04 '20 at 00:39
  • Yoben, I am getting the output below in the Group_Number column: 2910 2920 9910 9920 8010 8020 8030 8040 8050 8060. In the first four (04) rows the output has changed from the values in my original image (2901 became 2910, and so on), and it is also incrementing by ten (10), not one (01). – Baig Feb 04 '20 at 00:52
  • Why do you use the `.`/dot/attribute style for column access? – AMC Feb 04 '20 at 01:21