
I need to validate that the values in a column fall within a given range. If a value is greater or less than the allowed range, an error should be reported, showing the row number or index where it occurred.

My data is as follows:

Draft_Fore 12 14 87 16 90

It should report an error for the values 87 and 90, because the values in this column must be greater than 5 and less than 20.

The code which I have tried is as follows:

import pandas as pd

def validate_rating(Draft_Fore):
    Draft_Fore = int(Draft_Fore)
    if Draft_Fore > 5 and Draft_Fore <= 20:
        return True
    return False

df = pd.read_csv("/home/anu/Desktop/dr.csv")
for i, Draft_Fore in enumerate(df):
    try:
        validate_rating(Draft_Fore)
    except Exception as e:
        print('Error at index {}: {!r}'.format(i, Draft_Fore))
        print(e)

I want to print the location of the row where the error occurred.

redhima
  • Please check the indentation of your code; it seems to be incorrect. Also, your function does not raise an error, it just returns True or False. You could use that as e.g. `if not validate_rating(Draft_Fore):` and then print the message. – FObersteiner Oct 06 '19 at 10:43
  • I have corrected the indentation of my code, but the for loop does not iterate properly. @MrFuppes – redhima Oct 06 '19 at 11:04

1 Answer


A little explanation to clarify my comment. Assuming your dataframe looks like

import pandas as pd

df = pd.DataFrame({'col1': [12, 14, 87, 16, 90]})

you could do

def check_in_range(v, lower_lim, upper_lim):
    # True if lower_lim < v <= upper_lim, otherwise False
    return lower_lim < v <= upper_lim

lower_lim, upper_lim = 5, 20
for i, v in enumerate(df['col1']):
    if not check_in_range(v, lower_lim, upper_lim):
        print(f"value {v} at index {i} is out of range!")

# --> gives you
value 87 at index 2 is out of range!
value 90 at index 4 is out of range!

So your check function is basically fine. However, if you call enumerate on a DataFrame, it iterates over the column names, not the values. What you need is to enumerate the specific column.
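To make the difference concrete, here is a minimal sketch. The frame is built inline instead of being read from the CSV, and the column name `Draft_Fore` is assumed from the question:

```python
import pandas as pd

# Stand-in for the CSV from the question; the column name is an assumption.
df = pd.DataFrame({"Draft_Fore": [12, 14, 87, 16, 90]})

# Iterating over the DataFrame itself yields the column labels.
from_frame = list(enumerate(df))

# Iterating over a single column (a Series) yields the values in it.
from_column = list(enumerate(df["Draft_Fore"]))

print(from_frame)   # [(0, 'Draft_Fore')]
print(from_column)  # [(0, 12), (1, 14), (2, 87), (3, 16), (4, 90)]
```

This is why the loop in the question only runs once: it passes the string 'Draft_Fore' to int(), which raises a ValueError that the except block then prints.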

Concerning your idea to raise an exception, I'd suggest having a look at raise and assert.

So you could e.g. use raise:

for i, v in enumerate(df['col1']):
    if not check_in_range(v, lower_lim, upper_lim):
        raise ValueError(f"value {v} at index {i} is out of range")

# --> gives you
ValueError: value 87 at index 2 is out of range

or assert:

for i, v in enumerate(df['col1']):
    assert v > lower_lim and v <= upper_lim, f"value {v} at index {i} is out of range"

# --> gives you
AssertionError: value 87 at index 2 is out of range
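Both variants stop at the first offending value, so 90 is never reported. If you want every out-of-range value in one go, one possible approach is to collect them first and raise once at the end. This is only a sketch; `find_out_of_range` is a hypothetical helper name, not something from pandas:

```python
import pandas as pd

df = pd.DataFrame({"col1": [12, 14, 87, 16, 90]})
lower_lim, upper_lim = 5, 20

def find_out_of_range(values, lower, upper):
    """Return (position, value) pairs where not lower < value <= upper."""
    return [(i, v) for i, v in enumerate(values) if not lower < v <= upper]

bad = find_out_of_range(df["col1"], lower_lim, upper_lim)

try:
    if bad:
        details = ", ".join(f"{v} at index {i}" for i, v in bad)
        raise ValueError(f"out-of-range values: {details}")
except ValueError as err:
    # Caught here only so the example keeps running and shows the message.
    message = str(err)
    print(message)  # out-of-range values: 87 at index 2, 90 at index 4
```

Also keep in mind that assert statements are skipped when Python is run with the -O option, so raise is the safer choice if the check must always run.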

Note: If you have a df, why not use its features for convenience? To get the in-range values of the column, you could just do

df[(df['col1'] > lower_lim) & (df['col1'] <= upper_lim)]

# --> gives you
   col1
0    12
1    14
3    16
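Since the question asks for the rows that fail the check, the same mask can be inverted with ~ to get the out-of-range rows together with their index labels. A minimal sketch, using the same example frame and limits as above:

```python
import pandas as pd

df = pd.DataFrame({"col1": [12, 14, 87, 16, 90]})
lower_lim, upper_lim = 5, 20

# Boolean mask: True where the value lies in (lower_lim, upper_lim].
in_range = (df["col1"] > lower_lim) & (df["col1"] <= upper_lim)

# Invert the mask to keep only the rows that fail the check.
out_of_range = df[~in_range]
bad_index = out_of_range.index.tolist()

print(out_of_range)
print(bad_index)  # [2, 4]
```

Unlike enumerate, which counts positions, this keeps the DataFrame's own index labels. The two only coincide when the frame has the default 0, 1, 2, ... index.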
FObersteiner