
I have the table below:

import pandas as pd

raw_data = {
    'vendor_id': [1, 2, 3, 4, 5, 6], 
    'name': ['vendor_schmendor', 'parts_r_us', 'vendor_king', 'vendor_diagram', 'venny', 'vendtriloquist'], 
    'contract_sign_date': ['2018-09-01', '2018-09-03', '2018-10-11', '2018-08-21', '2018-08-13', '2018-10-29'],
    'total_spend' :[34324, 23455, 77654, 23334, 94843, 23444]}

df = pd.DataFrame(raw_data, columns = ['vendor_id', 'name', 'contract_sign_date', 'total_spend'])

My task is to drop all rows where contract_sign_date is between "2018-09-01" and "2018-10-13". This is my attempt, which doesn't work:

alter = df.drop((df['contract_sign_date'] == "2018-09-01") & (df['contract_sign_date'] == "2018-10-13"))

This raises: KeyError: '[False, False, False, False, False, False] not found in axis'

Can anyone provide code that produces the result I'm after?

Sunderam Dubey
  • Does this answer your question? [Select DataFrame rows between two dates](https://stackoverflow.com/questions/29370057/select-dataframe-rows-between-two-dates) – Ynjxsjmh May 28 '22 at 07:39
  • @Ynjxsjmh I followed that solution and ended up with this code: `alter = (df['contract_sign_date'] > "2018-09-01") & (df['contract_sign_date'] <= "2018-08-13")` followed by `indexing = df.drop(df.loc[alter],axis='columns')`. It displays only the row index values 0, 1, 2, 3, 4, 5, with no columns. –  May 28 '22 at 07:44
  • Which answer suggests using `drop`? – Ynjxsjmh May 28 '22 at 07:50

2 Answers


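Your condition checks whether the same value equals two different values at once, (a == b) & (a == c), which can never be true. In addition, drop expects index labels rather than a boolean mask, which is why the mask's values show up in the KeyError.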
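As a small illustration of both points, here is a sketch on a reduced, hypothetical three-row frame (not the full table from the question). It assumes only standard pandas behaviour: the combined equality mask is all False, and drop works on index labels.

```python
import pandas as pd

# Hypothetical reduced version of the question's data.
df = pd.DataFrame({
    "vendor_id": [1, 2, 3],
    "contract_sign_date": ["2018-09-01", "2018-10-13", "2018-08-21"],
})

# A single value can never equal both dates, so the combined mask is all False.
mask = (df["contract_sign_date"] == "2018-09-01") & (df["contract_sign_date"] == "2018-10-13")
print(mask.any())  # False

# drop() removes rows by index label (here the labels 0 and 1).
dropped = df.drop([0, 1])
print(dropped["vendor_id"].tolist())  # [3]
```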

Use between and the boolean NOT operator ~:

alter = df[~df['contract_sign_date'].between("2018-09-01", "2018-10-13")]

output:

   vendor_id            name contract_sign_date  total_spend
3          4  vendor_diagram         2018-08-21        23334
4          5           venny         2018-08-13        94843
5          6  vendtriloquist         2018-10-29        23444

NB: plain strings work here because the YYYY-MM-DD format sorts lexicographically in date order, so direct comparison is valid. With a different format, you would need to convert the column to a datetime type first.
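For the non-ISO case, a minimal sketch could look like the following. The DD/MM/YYYY data is made up for illustration (it is not from the question); the column is parsed with pd.to_datetime and an explicit format before applying between.

```python
import pandas as pd

# Hypothetical data in DD/MM/YYYY format, where string comparison
# would not follow date order.
df = pd.DataFrame({
    "vendor_id": [1, 2, 3, 4],
    "contract_sign_date": ["01/09/2018", "11/10/2018", "21/08/2018", "29/10/2018"],
})

# Parse the strings into datetime64 values using an explicit format.
dates = pd.to_datetime(df["contract_sign_date"], format="%d/%m/%Y")

# between() is inclusive on both ends by default; ~ keeps rows outside the range.
in_range = dates.between(pd.Timestamp("2018-09-01"), pd.Timestamp("2018-10-13"))
kept = df[~in_range]
print(kept["vendor_id"].tolist())  # [3, 4]
```

Parsing into a separate Series leaves the original string column untouched; assign the result back to the column if you want to keep the datetime type.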

mozway

If you want to use drop, pass it the index labels of the rows to remove rather than the boolean mask itself:

m = (df['contract_sign_date'] >= "2018-09-01") & (df['contract_sign_date'] <= "2018-10-13")
# or
m = df['contract_sign_date'].between("2018-09-01", "2018-10-13")

# m[m].index holds the labels of the rows that fall inside the range
out = df.drop(m[m].index)
print(out)

   vendor_id            name contract_sign_date  total_spend
3          4  vendor_diagram         2018-08-21        23334
4          5           venny         2018-08-13        94843
5          6  vendtriloquist         2018-10-29        23444
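As a quick sanity check, the sketch below compares label-based drop with plain boolean filtering for the range asked about in the question ("2018-09-01" to "2018-10-13", inclusive). It uses only the vendor_id and contract_sign_date columns from the question; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "vendor_id": [1, 2, 3, 4, 5, 6],
    "contract_sign_date": ["2018-09-01", "2018-09-03", "2018-10-11",
                           "2018-08-21", "2018-08-13", "2018-10-29"],
})

# True for rows whose date falls inside the inclusive range.
in_range = df["contract_sign_date"].between("2018-09-01", "2018-10-13")

via_drop = df.drop(df[in_range].index)  # remove rows by index label
via_mask = df[~in_range]                # keep rows where the mask is False

print(via_drop.equals(via_mask))        # True
print(via_drop["vendor_id"].tolist())   # [4, 5, 6]
```

Both forms give the same frame here. Note that drop works on labels, so on a frame with duplicate index labels it can remove more rows than the mask selects.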
Ynjxsjmh