I want to flag the anomalies in the desired columns (desired_D to desired_L). Here, an anomaly is defined as any value <1500 or >400000 in each row.
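To make the rule concrete, here is a minimal sketch of the check on a handful of sample values. The names `values` and `is_anomaly` are only illustrative, and the numbers are picked to cover both sides of the range:

```python
import pandas as pd

# A few sample values: two below 1500, one above 400000, two in range
values = pd.Series([12005, 1021, 15, 449999, 31119])

# A value counts as an anomaly if it is below 1500 OR above 400000
is_anomaly = (values < 1500) | (values > 400000)

print(is_anomaly.tolist())  # [False, True, True, True, False]
```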
See below for the dataset
import pandas as pd
# initialise the data as a dict of lists
data = {
    'A': ['L1', 'L2', 'L3', 'L4', 'L5'],
    'B': [1, 1, 1, 1, 1],
    'C': [1, 2, 3, 5, 9],
    'desired_D': [12005, 18190, 1021, 13301, 31119],
    'desired_E': [11021, 19112, 19021, 15, 24509],
    'desired_F': [10022, 19910, 19113, 449999, 25519],
    'desired_G': [14029, 29100, 39022, 24509, 412271],
    'desired_H': [52119, 32991, 52883, 69359, 57835],
    'desired_J': [41218, 52991, 55121, 69152, 79355],
    'desired_K': [43211, 7672991, 56881, 211, 77342],
    'desired_L': [31211, 42901, 53818, 62158, 69325],
}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
df
Currently, my code also flags columns B and C (I want to exclude them).
The revised code looks like this:
# function to flag the anomaly in each row - this flags columns B and C as well (I want to exclude these columns)
dont_format_cols = ['B', 'C']

def flag_outliers(s, dont_format_cols):
    if s.name in dont_format_cols:
        return ''  # or None, or whatever df.style() needs
    else:
        s = pd.to_numeric(s, errors='coerce')
        indexes = (s < 1500) | (s > 400000)
        return ['background-color: red' if v else '' for v in indexes]

styled = df.style.apply(flag_outliers, axis=1)
styled
The error after edits
Desired output: should exclude B and C; refer to the image below.
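For reference, a minimal sketch of one way this might be achieved. It relies on the `subset` parameter of `Styler.apply`, so the styling function only ever sees the desired columns. With `axis=1` the function receives rows, so `s.name` is the row label rather than a column name, which is why checking it against `['B', 'C']` cannot exclude those columns. The names `desired_cols` and `style_anomalies` are hypothetical, and selecting columns by the `desired_` prefix is an assumption based on the dataset above:

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['L1', 'L2', 'L3', 'L4', 'L5'],
    'B': [1, 1, 1, 1, 1],
    'C': [1, 2, 3, 5, 9],
    'desired_D': [12005, 18190, 1021, 13301, 31119],
    'desired_E': [11021, 19112, 19021, 15, 24509],
    'desired_F': [10022, 19910, 19113, 449999, 25519],
    'desired_G': [14029, 29100, 39022, 24509, 412271],
    'desired_H': [52119, 32991, 52883, 69359, 57835],
    'desired_J': [41218, 52991, 55121, 69152, 79355],
    'desired_K': [43211, 7672991, 56881, 211, 77342],
    'desired_L': [31211, 42901, 53818, 62158, 69325],
})

# Assumption: the columns to check are exactly those named 'desired_*'
desired_cols = [c for c in df.columns if c.startswith('desired_')]

def flag_outliers(row):
    """Return one CSS string per cell of the row: red for anomalies, '' otherwise."""
    values = pd.to_numeric(row, errors='coerce')
    is_anomaly = (values < 1500) | (values > 400000)
    return ['background-color: red' if flagged else '' for flagged in is_anomaly]

def style_anomalies(frame):
    """Hypothetical helper: style row-wise, restricted to the desired columns."""
    # subset limits both what flag_outliers receives and what gets styled,
    # so A, B and C are never passed to the function
    return frame.style.apply(flag_outliers, axis=1, subset=desired_cols)
```

In a notebook, ending a cell with `style_anomalies(df)` should render the table with only the out-of-range cells in the desired columns highlighted. `df.style` needs the optional `jinja2` dependency to be installed.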