How to generate numbers missing between two numbers(having in a single cell) and append the generated values as new rows to pandas DataFrame

Question

I have DF as:

code_range   CCS    CCS_Label   icode
'0112-0115'  232    Anesthesia  0112, 0115
'0118-0120'  232    Anesthesia  0118, 0120

I need to generate the numbers between two values of icode(column) and append those values to df keeping other column-values same. And delete the old rows. Somewhat like expanding rows. My excepted solution:

code_range   CCS    CCS_Label   icode
'0112-0115'  232    Anesthesia  0112
'0112-0115'  232    Anesthesia  0113
'0112-0115'  232    Anesthesia  0114
'0112-0115'  232    Anesthesia  0115
'0118-0120'  232    Anesthesia  0118
'0118-0120'  232    Anesthesia  0119
'0118-0120'  232    Anesthesia  0120

On spending good amount of time also I am not able to find the right approach to the problem. Please help to resolve the problem.

Thanks in Advance..

Do you have an attempt? Moreover, if this is some sort of healthcare data, make sure your codes in mean what you think they mean. — ifly6, Aug 08 '19 at 18:41
I have tried with Series.between(value, value). range(value, value). None is giving the right approach to generate values and even after generating values how to append. Yes, you are right, its healthcare data. I have further more steps on getting solution...Thank you:) — Raghavendra S, Aug 08 '19 at 18:46
can you print `df.head().to_dict()` Its that column the string `'0112, 0115'`? — ALollz, Aug 08 '19 at 18:58
Yes, it is string as you said. Actually, icode(column) in my .csv file had string values('0112-0115' and '0118-0120'..so on). I extracted string values as (0112, 0115 and 0118, 0120..so on ). I am updating the same in the post. sorry I missed '0' infront of the values And I am updating the same.....Thank you. — Raghavendra S, Aug 09 '19 at 05:55

score 0 · Answer 1 · answered Aug 08 '19 at 19:07

0

Not particularly performant, but split, create the list, then explode. Assumes all your codes are ints, and that you don't want the leading 0s, like in your output:

(df.set_index(['CCS', 'CCS_Label'])
   .icode.str.split(', ')
   .apply(lambda x: range(int(x[0]), int(x[1])+1))
   .explode()
   .reset_index())

   CCS   CCS_Label icode
0  232  Anesthesia   112
1  232  Anesthesia   113
2  232  Anesthesia   114
3  232  Anesthesia   115
4  232  Anesthesia   118
5  232  Anesthesia   119
6  232  Anesthesia   120

Starting data

import pandas as pd
df = pd.DataFrame({'CCS': [232, 232], 'CCS_Label': ['Anesthesia']*2,
                   'icode': ['0112, 0115', '0118, 0120']})

answered Aug 08 '19 at 19:07

ALollz

57,915
7
66
89

Its almost there. But it is throwing error message as: ```'Series' object has no attribute 'explode'``` – Raghavendra S Aug 09 '19 at 07:31
1

@RaghavendraS what version of `pandas` do you have? That's new as of `0.25` Upgrade if you can, otherwise see https://stackoverflow.com/questions/53218931/how-to-unnest-explode-a-column-in-a-pandas-dataframe for a way to do it manually – ALollz Aug 09 '19 at 14:34
I am using 0.23.4. And I got my results. I used code: ```df1 = df.icode.str.split(', ').apply(pd.Series).stack().rename('expanded').to_frame().reset_index(1, drop = True) df = pd.merge(df, df1, how = 'left', left_index=True, right_index=True) df['expanded'] = df.expanded.astype('str').str.zfill(5)``` . Thank you so much your idea of splitting the list in the cell values got my answer. Thank you once again – Raghavendra S Aug 09 '19 at 18:07

How to generate numbers missing between two numbers(having in a single cell) and append the generated values as new rows to pandas DataFrame

1 Answers1