I have the following pandas DataFrame:
       epi_week state loc_type    disease  cases  incidence
21835    200011    WY    STATE      MUMPS      2       0.40
21836    197501    WY    STATE      POLIO      3       0.76
21837    199607    WY    STATE  HEPATITIS      3       0.61
21838    197116    WY    STATE      MUMPS      6       1.73
21839    200048    WY    STATE  HEPATITIS      6       1.21
I am trying to replace each disease with a unique integer, for example 'MUMPS'==1, 'POLIO'==2, etc. The final DataFrame should look as follows:
       epi_week state loc_type  disease  cases  incidence
21835    200011    WY    STATE        1      2       0.40
21836    197501    WY    STATE        2      3       0.76
21837    199607    WY    STATE        3      3       0.61
21838    197116    WY    STATE        1      6       1.73
21839    200048    WY    STATE        3      6       1.21
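The numbering in the table above follows the order in which each disease first appears, so one possible way to get it without a hand-written dictionary is `pd.factorize`. This is only a sketch: the DataFrame below is a hypothetical reconstruction of the five rows shown, and the `+ 1` is there because `factorize` starts counting at 0.

```python
import pandas as pd

# Hypothetical sample rebuilt from the five rows shown in the question
data = pd.DataFrame(
    {
        "epi_week": [200011, 197501, 199607, 197116, 200048],
        "state": ["WY"] * 5,
        "loc_type": ["STATE"] * 5,
        "disease": ["MUMPS", "POLIO", "HEPATITIS", "MUMPS", "HEPATITIS"],
        "cases": [2, 3, 3, 6, 6],
        "incidence": [0.40, 0.76, 0.61, 1.73, 1.21],
    },
    index=[21835, 21836, 21837, 21838, 21839],
)

# factorize assigns 0, 1, 2, ... in order of first appearance;
# labels keeps the original names so the codes can be translated back
codes, labels = pd.factorize(data["disease"])
data["disease"] = codes + 1
```

Keeping `labels` around is useful if the integer codes need to be turned back into disease names later (`labels[code - 1]`).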
I am using the following code:
# Map each disease name to an integer code
disease_dic = {'MUMPS': 1, 'POLIO': 2, 'MEASLES': 3, 'RUBELLA': 4,
               'PERTUSSIS': 5, 'HEPATITIS A': 6, 'SMALLPOX': 7,
               'DIPHTHERIA': 8}
data.disease = [disease_dic[item] for item in data.disease]
I am getting the following error:
KeyError                                  Traceback (most recent call last)
<ipython-input-115-52394901c90d> in <module>()
----> 1 cdc.disease = [disease_dic[item2] for item2 in cdc.disease]
KeyError: 1
Can anyone please help with this issue? Thank you.
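A `KeyError: 1` means the integer 1 was looked up in the dictionary, which suggests the column already contained integers at that point, for instance because the same cell had been run once before. The sketch below reproduces that situation on a hypothetical trimmed mapping and a small stand-in for `cdc`, and then shows `Series.map`, which returns NaN for values missing from the dictionary instead of raising. Note also that the dictionary in the question has the key 'HEPATITIS A' while the sample rows contain 'HEPATITIS', so that value would fail the lookup as well.

```python
import pandas as pd

# Hypothetical trimmed mapping and stand-in data, for illustration only
disease_dic = {"MUMPS": 1, "POLIO": 2, "HEPATITIS": 3}
cdc = pd.DataFrame({"disease": ["MUMPS", "POLIO", "HEPATITIS", "MUMPS", "HEPATITIS"]})

# First run: every value is a string that exists as a key, so this succeeds
cdc["disease"] = [disease_dic[item] for item in cdc["disease"]]

# Second run: the column now holds integers, so 1 is looked up as a key
try:
    cdc["disease"] = [disease_dic[item] for item in cdc["disease"]]
    error = None
except KeyError as exc:
    error = exc

# Series.map does not raise on unmatched values; it leaves NaN in their place,
# which makes mismatches such as 'HEPATITIS A' vs 'HEPATITIS' easy to spot
raw = pd.Series(["MUMPS", "HEPATITIS A", "HEPATITIS"])
mapped = raw.map(disease_dic)
unmatched = raw[mapped.isna()].unique()
```

Reloading the original data before re-running the conversion, and checking `unmatched` for names absent from the dictionary, would be one way to narrow down the cause.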