-1

I have DF as:

code_range   CCS    CCS_Label   icode
'0112-0115'  232    Anesthesia  0112, 0115
'0118-0120'  232    Anesthesia  0118, 0120 

I need to generate the numbers between two values of icode(column) and append those values to df keeping other column-values same. And delete the old rows. Somewhat like expanding rows. My excepted solution:

code_range   CCS    CCS_Label   icode
'0112-0115'  232    Anesthesia  0112
'0112-0115'  232    Anesthesia  0113
'0112-0115'  232    Anesthesia  0114
'0112-0115'  232    Anesthesia  0115
'0118-0120'  232    Anesthesia  0118
'0118-0120'  232    Anesthesia  0119
'0118-0120'  232    Anesthesia  0120 

On spending good amount of time also I am not able to find the right approach to the problem. Please help to resolve the problem.

Thanks in Advance..

  • Do you have an attempt? Moreover, if this is some sort of healthcare data, make sure your codes in mean what you think they mean. – ifly6 Aug 08 '19 at 18:41
  • I have tried with Series.between(value, value). range(value, value). None is giving the right approach to generate values and even after generating values how to append. Yes, you are right, its healthcare data. I have further more steps on getting solution...Thank you:) – Raghavendra S Aug 08 '19 at 18:46
  • 1
    can you print `df.head().to_dict()` Its that column the string `'0112, 0115'`? – ALollz Aug 08 '19 at 18:58
  • Yes, it is string as you said. Actually, icode(column) in my .csv file had string values('0112-0115' and '0118-0120'..so on). I extracted string values as (0112, 0115 and 0118, 0120..so on ). I am updating the same in the post. sorry I missed '0' infront of the values And I am updating the same.....Thank you. – Raghavendra S Aug 09 '19 at 05:55

1 Answers1

0

Not particularly performant, but split, create the list, then explode. Assumes all your codes are ints, and that you don't want the leading 0s, like in your output:

(df.set_index(['CCS', 'CCS_Label'])
   .icode.str.split(', ')
   .apply(lambda x: range(int(x[0]), int(x[1])+1))
   .explode()
   .reset_index())

   CCS   CCS_Label icode
0  232  Anesthesia   112
1  232  Anesthesia   113
2  232  Anesthesia   114
3  232  Anesthesia   115
4  232  Anesthesia   118
5  232  Anesthesia   119
6  232  Anesthesia   120

Starting data

import pandas as pd
df = pd.DataFrame({'CCS': [232, 232], 'CCS_Label': ['Anesthesia']*2,
                   'icode': ['0112, 0115', '0118, 0120']})
ALollz
  • 57,915
  • 7
  • 66
  • 89
  • Its almost there. But it is throwing error message as: ```'Series' object has no attribute 'explode'``` – Raghavendra S Aug 09 '19 at 07:31
  • 1
    @RaghavendraS what version of `pandas` do you have? That's new as of `0.25` Upgrade if you can, otherwise see https://stackoverflow.com/questions/53218931/how-to-unnest-explode-a-column-in-a-pandas-dataframe for a way to do it manually – ALollz Aug 09 '19 at 14:34
  • I am using 0.23.4. And I got my results. I used code: ```df1 = df.icode.str.split(', ').apply(pd.Series).stack().rename('expanded').to_frame().reset_index(1, drop = True) df = pd.merge(df, df1, how = 'left', left_index=True, right_index=True) df['expanded'] = df.expanded.astype('str').str.zfill(5)``` . Thank you so much your idea of splitting the list in the cell values got my answer. Thank you once again – Raghavendra S Aug 09 '19 at 18:07