
I have a dataframe df with two columns, price and max_price. I want to apply these validations:

  1. the values in the max_price column must be >= 0.
  2. the values in the price column must be >= 0, and also <= max_price in the corresponding row.

I have the following code. I only know how to check that the price column is >= 0; I don't know how to check that price does not exceed the value in the max_price column of the same row.

import pandas as pd
from pandas_schema import Column, Schema
from pandas_schema.validation import InRangeValidation


df = pd.DataFrame({'price': [1, 10, 20], 'max_price': [10, 5, 25]})

schema = Schema([
    Column('price', [InRangeValidation(min=0)]),  # how to check with the column `max_price`?
    Column('max_price', [InRangeValidation(min=0)])
])

errors = schema.validate(df)

for error in errors:
    print(error)

Here, the 2nd row in df is invalid, since the price is 10 while the max_price is just 5.

Could you please show me how to make the code correct? Thanks.

aura

1 Answer

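# Walk both columns in parallel and compare each price with the
# max_price in the same row.
for row, (price, max_price) in enumerate(zip(df['price'], df['max_price'])):
    if price > max_price:
        print("error at row:", row)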
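The same row-wise check can also be written without an explicit loop. The following is a minimal sketch in plain pandas, not a pandas_schema solution; the names `negative_max`, `bad_price`, and `invalid_rows` are introduced here for illustration only. It also covers the two `>= 0` rules from the question.

```python
import pandas as pd

df = pd.DataFrame({'price': [1, 10, 20], 'max_price': [10, 5, 25]})

# Rule 1: max_price must be >= 0.
negative_max = df['max_price'] < 0

# Rule 2: price must be >= 0 and <= max_price in the same row.
# Comparing two Series aligns them on the index, so this is row-wise.
bad_price = (df['price'] < 0) | (df['price'] > df['max_price'])

# Index labels of every row that breaks at least one rule.
invalid_rows = df.index[negative_max | bad_price].tolist()

for row in invalid_rows:
    print("error at row:", row)
```

With the sample data, only the row at index 1 (price 10, max_price 5) is reported.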
  • Thanks for your answer. Your solution checks the value in a standalone manner. But do you know how to use pandas_schema to solve it? Because it is more in line with the rest of my code. – aura Sep 11 '20 at 21:05
  • https://stackoverflow.com/questions/27474921/compare-two-columns-using-pandas – Linga Lgm Cse Sep 11 '20 at 21:11