I have a dataframe full_report as shown below. As you can see, the column quotient contains empty cells.

      Year  Week  VehicleOwnerID  GarageDepot       On      Off  quotient
3     2021     8          101550  Fredriksdal      0.0     -1.0 -0.000000
4     2021     9          101550  Fredriksdal      0.0     -1.0 -0.000000
5     2021    10          101550  Fredriksdal      0.0     -1.0 
6     2021    11          101550  Fredriksdal      0.0     -1.0 -0.000000
7     2021    12          101550  Fredriksdal      0.0     -1.0
...    ...   ...             ...          ...      ...      ...       ...
4843  2021     8            6128   Älvsjö/Bro   6475.0   6392.0  1.012985
4844  2021     9            6128   Älvsjö/Bro   9390.0   9258.0  1.014258
4845  2021    10            6128   Älvsjö/Bro   9293.0   9017.0  
4846  2021    11            6128   Älvsjö/Bro  10794.0  10669.0  
4847  2021    12            6128   Älvsjö/Bro  11332.0  11105.0  1.020441

The dtypes are given as

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2891 entries, 3 to 4847
Data columns (total 7 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Year            2891 non-null   int64  
 1   Week            2891 non-null   int64  
 2   VehicleOwnerID  2891 non-null   int64  
 3   GarageDepot     2891 non-null   object 
 4   On              2891 non-null   float64
 5   Off             2891 non-null   float64
 6   quotient        2891 non-null   float64
dtypes: float64(3), int64(3), object(1)
memory usage: 180.7+ KB
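One thing stands out in this output: it reports 2891 non-null float64 values for quotient, and a float64 column cannot hold empty strings, so the printout and the info() output may describe the frame at different moments. A minimal sketch for checking what the column really contains, independent of how it is displayed (the helper name describe_column is made up for illustration):

```python
import numpy as np
import pandas as pd


def describe_column(column: pd.Series) -> dict:
    """Summarise what a column holds, regardless of how it is printed."""
    # Anything that cannot be parsed as a number becomes NaN here.
    numeric = pd.to_numeric(column, errors="coerce")
    return {
        "dtype": str(column.dtype),
        "n_missing": int(column.isna().sum()),
        "n_infinite": int(np.isinf(numeric).sum()),
        # Python type of every stored value, e.g. float vs. str.
        "python_types": column.map(type).value_counts().to_dict(),
    }
```

Calling describe_column(full_report['quotient']) right before the ranking step should show whether the blanks are NaN, infinities, or text.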

Now, what I need is to replace these empty cells with NaN, since in a later step I want to assign a rank to each quotient depending on the interval it falls in:

full_report.loc[(full_report['quotient'] > 1.15), 'Critical_level']                        = 3
full_report.loc[(full_report['quotient'] <= 1.15) & (full_report['quotient'] > 1.10), 'Critical_level'] = 2
full_report.loc[(full_report['quotient'] <= 1.10) & (full_report['quotient'] > 1.05), 'Critical_level'] = 1
full_report.loc[(full_report['quotient'] <= 1.05) & (full_report['quotient'] > 0.95), 'Critical_level'] = 0
full_report.loc[(full_report['quotient'] <= 0.95) & (full_report['quotient'] > 0.90), 'Critical_level'] = 1
full_report.loc[(full_report['quotient'] <= 0.90) & (full_report['quotient'] > 0.85), 'Critical_level'] = 2
full_report.loc[(full_report['quotient'] <= 0.85), 'Critical_level']                                    = 3
full_report.loc[(full_report['quotient'] ==-0.000000), 'Critical_level']                                = "Tömmer ej"
full_report.loc[(full_report['quotient'].isna()), 'Critical_level']                                = "Kommunicerar ej"


full_report['Off'] = full_report['Off'].replace(-1,0)
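For reference, the same interval logic can be sketched more compactly with pd.cut. This is only an illustration under the assumption that quotient is already numeric with NaN for the missing cells; the names BINS, LEVELS and critical_level are hypothetical and not part of my actual code:

```python
import numpy as np
import pandas as pd

# Right-closed intervals: (-inf, 0.85], (0.85, 0.90], ..., (1.15, inf)
BINS = [-np.inf, 0.85, 0.90, 0.95, 1.05, 1.10, 1.15, np.inf]
LEVELS = [3, 2, 1, 0, 1, 2, 3]


def critical_level(quotient: pd.Series) -> pd.Series:
    """Map each quotient to a level; zero and NaN get text labels."""
    # labels=False returns the position of the bin (NaN stays NaN).
    bin_index = pd.cut(quotient, bins=BINS, labels=False)
    level = bin_index.map(dict(enumerate(LEVELS))).astype(object)
    level[quotient == 0] = "Tömmer ej"        # also matches -0.0
    level[quotient.isna()] = "Kommunicerar ej"
    return level
```

It would be used as full_report['Critical_level'] = critical_level(full_report['quotient']). Like the .loc version above, the last label only appears if the empty cells really are NaN.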

The crucial point here is this line:

full_report.loc[(full_report['quotient'].isna()), 'Critical_level']                                = "Kommunicerar ej"

I thought this was a no-brainer and tried:

fillna()

full_report['quotient'] = full_report['quotient'].fillna('')

This did absolutely nothing.

replace(), after converting the column to strings and then back to float

full_report['quotient'].astype(str).replace('', np.nan)
full_report['quotient'].astype(float)
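Two things are worth noting about this attempt, shown here on a small made-up series rather than my real data: astype() and replace() return new objects, so nothing changes unless the result is assigned back, and a missing float does not turn into an empty string when cast to text (in the pandas versions I have used it becomes the text 'nan'), so replacing '' finds nothing to replace.

```python
import numpy as np
import pandas as pd

demo = pd.Series([1.012985, np.nan, -0.0], name="quotient")

# Casting to text never produces '' for a missing value,
# so there is nothing for replace('', np.nan) to match.
as_text = demo.astype(str)
matches_empty = int((as_text == "").sum())

# The result of this chain is discarded, so demo itself is unchanged.
demo.astype(str).replace("", np.nan)
still_float = demo.dtype == "float64"
still_one_nan = int(demo.isna().sum()) == 1
```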

This didn't do the trick either. I even tried

full_report.replace(r'^\s*$', np.nan, regex=True)

which did not help either.
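If the blanks turn out to be empty or whitespace-only strings (an assumption, since the dtypes above say float64), one approach I am aware of is pd.to_numeric with errors="coerce", with the result assigned back to the column. A sketch on made-up data, where blanks_to_nan is a hypothetical helper name:

```python
import pandas as pd


def blanks_to_nan(column: pd.Series) -> pd.Series:
    """Return a float column in which anything that is not a number is NaN."""
    return pd.to_numeric(column, errors="coerce")


# Made-up sample with one empty and one whitespace-only cell.
demo = pd.DataFrame({"quotient": ["1.012985", "", "   ", "-0.000000"]})
demo["quotient"] = blanks_to_nan(demo["quotient"])  # assign the result back
n_missing = int(demo["quotient"].isna().sum())
```

The regex replace above is not assigned back either, so the same caveat about discarded results applies to it.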

So, my question is: What am I missing here? Any suggestions?
