I have a small Excel file that contains prices for our online store, and I am trying to automate how it is processed. However, I don't fully trust the data to be properly qualified, so I wanted to use Pandas to quickly check certain fields. I have managed to achieve everything I need so far, but I am only a beginner and cannot think of the proper way to do the next part.

Basically, I need to validate two columns on the same row. We have a MARGIN column; if its value is > 60, then I need to check that the MARKDOWN column on the same row is populated with YES.

So my question is: how can I code that check?

Below is an example of how I have been doing my other checks. I realise it is quite beginner-ish, but I am only a beginner.

sku2 = df['SKU_2']
comp_at = df['COMPARE AT PRICE']
sales_price = df['SALES PRICE']
dni_act = df['DO NOT IMPORT - action']
dni_fur = df['DO NOT IMPORT - further details']
promo = df['PROMO']
replacement = df['REPLACEMENT']
go_live_date = df['go live date']
markdown = df['markdown']

# sales price not blank check
for item in sales_price:
    if pd.isna(item):
        # the with block closes the file on exit, so no explicit close is needed
        with open('document.csv', 'a', newline="") as fd:
            writer = csv.writer(fd)
            writer.writerow(['It seems there is a blank sales price in here', str(file_name)])
        break  # one log line per file is enough
Sean

1 Answer

Example:

df = pd.DataFrame([
    ['a',1,2],
    ['b',3,4],
    ['a',5,6]],
    columns=['f1','f2','f3'])

# & means AND (| means OR); each condition needs its own parentheses
print(df[(df['f1'] == 'a') & (df['f2'] > 1)])

Output:

  f1  f2  f3
2  a   5   6
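
Applied to the question, the same masking idea could look roughly like the sketch below. The column names MARGIN and MARKDOWN come from the question; the sample rows and the SKU values are made up for illustration, and the exact spelling of the column names in the real file may differ.

```python
import pandas as pd

# Hypothetical sample data: only the column names come from the question.
df = pd.DataFrame({
    'SKU_2': ['A1', 'B2', 'C3', 'D4'],
    'MARGIN': [75, 40, 65, 90],
    'MARKDOWN': ['YES', None, 'NO', None],
})

# Condition 1: margin above 60.
high_margin = df['MARGIN'] > 60

# Condition 2: markdown is anything other than 'YES'.
# Blanks are replaced with '' first so they count as "not YES".
not_marked_down = df['MARKDOWN'].fillna('') != 'YES'

# Rows that break the rule satisfy both conditions at once.
violations = df[high_margin & not_marked_down]

print(violations)
```

If `violations` is empty, every row with a margin above 60 has MARKDOWN set to YES.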
rrrttt
  • Thanks for the reply Gabriel. If I want to log the instances that meet these criteria, say I have 6 records that match, will they be stored as an array if I assign them to a variable? For example, with `test = df.loc[df['price'] > 0]`, would all the matching records go into test? Am I following correctly? Sorry if it's a stupid question. – Sean Feb 14 '20 at 11:43
  • They will be stored as another data frame. If you want to store them as an array, use the notation `test = df.loc[df['price'] > 0].values`. And yes, they would. – rrrttt Feb 14 '20 at 11:48
  • If I have two criteria, what is the best way to store the information so I can use it for logging? Using .values is not possible with two different criteria, is it? – Sean Feb 14 '20 at 12:29
  • .values transforms data frames into arrays. Since you want to explore and analyse your data, you should keep the data frame: it's easy to manipulate, and you don't lose the meaning of your features. – rrrttt Feb 14 '20 at 12:37
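
As a rough illustration of the point made in the comments, a mask built from two criteria still selects an ordinary data frame, which can then be logged row by row. This is only a sketch: the sample data, the log message, and the in-memory buffer standing in for the log file are all assumptions, not part of the original question.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data: only the column names come from the question.
df = pd.DataFrame({
    'SKU_2': ['A1', 'B2', 'C3', 'D4'],
    'MARGIN': [75, 40, 65, 90],
    'MARKDOWN': ['YES', None, 'NO', None],
})

# Two criteria combined into one boolean mask.
mask = (df['MARGIN'] > 60) & (df['MARKDOWN'].fillna('') != 'YES')

# The selection is still a DataFrame, so the column names are preserved.
flagged = df.loc[mask]

# Write one log line per flagged row. io.StringIO stands in for a real
# file opened with open('document.csv', 'a', newline='').
log = io.StringIO()
writer = csv.writer(log)
for row in flagged.itertuples(index=False):
    writer.writerow(['Margin over 60 without markdown', row.SKU_2, row.MARGIN])

print(log.getvalue())
```

Keeping the result as a data frame means each log line can refer to columns by name (`row.SKU_2`, `row.MARGIN`) rather than by position in an array.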