What I am trying:

import re
new_df = census_df.loc[(census_df['REGION']==1 | census_df['REGION']== 2) & (census_df['CTYNAME'].str.contains('^Washington[a-z]*'))& (census_df['POPESTIMATE2015']>census_df['POPESTIMATE2014'])]
new_df

It returns this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
a_jelly_fish
user574502
  • Welcome to SO. Could you please read https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples and rephrase your question so that others can reproduce it? – Roy2012 Jun 12 '20 at 09:52
  • You are not using the re module, so you might not need to import it. Also, please provide a sample of the census_df dataframe content. – Gustav Rasmussen Jun 12 '20 at 09:56

1 Answer

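You need to put parentheses around each comparison in the REGION condition. In Python, | binds more tightly than ==, so census_df['REGION']==1 | census_df['REGION']== 2 is parsed as the chained comparison census_df['REGION'] == (1 | census_df['REGION']) == 2. Evaluating that chain forces a whole Series to be treated as a single True/False value, which raises the ValueError. With parentheses, each comparison produces a boolean Series first, and | then combines them element-wise:

filt_1 = (census_df['REGION'] == 1) | (census_df['REGION'] == 2)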
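
To see the precedence issue in isolation, here is a minimal sketch. The two-value Series named region is a hypothetical stand-in for census_df['REGION'], not data from the question:

```python
import pandas as pd

# Hypothetical stand-in for census_df['REGION'].
region = pd.Series([1, 3])

# Without parentheses, `|` is applied before `==`, so Python reads this as
# region == (1 | region) == 2. A chained comparison is joined with `and`,
# which asks for the truth value of a Series and fails.
try:
    mask = region == 1 | region == 2
    raised = False
except ValueError:
    raised = True

# With parentheses, each comparison yields a boolean Series,
# and `|` combines the two element-wise.
mask = (region == 1) | (region == 2)

print(raised)
print(mask.tolist())
```

The same rule applies to & and ~: wrap every comparison in its own parentheses before combining masks.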

The census_df data below is semi-fictitious and only serves to demonstrate the approach. Everything from the filt_1 assignment onwards should work unchanged on your full census_df dataframe. This is the full program:

import pandas as pd

cols = ['REGION', 'CTYNAME', 'POPESTIMATE2014', 'POPESTIMATE2015']
data = [[1, "Washington", 4846411, 4858979],
        [3, "Autauga County", 55290, 55347]]

census_df = pd.DataFrame(data, columns=cols)

filt_1 = (census_df['REGION'] == 1) | (census_df['REGION'] == 2)
filt_2 = census_df['CTYNAME'].str.contains("^Washington[a-z]*")
filt_3 = census_df['POPESTIMATE2015'] > census_df['POPESTIMATE2014']

filt = filt_1 & filt_2 & filt_3

new_df = census_df.loc[filt]

print(new_df)

Returns:

   REGION     CTYNAME  POPESTIMATE2014  POPESTIMATE2015
0       1  Washington          4846411          4858979
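
As a possible alternative, the same filter can be written without the | operator at all, which sidesteps the precedence problem. This is only a sketch on the same semi-fictitious data. It assumes CTYNAME contains no missing values, and the names in_region, is_washington, grew and alt_df are made up for illustration:

```python
import pandas as pd

cols = ['REGION', 'CTYNAME', 'POPESTIMATE2014', 'POPESTIMATE2015']
data = [[1, "Washington", 4846411, 4858979],
        [3, "Autauga County", 55290, 55347]]

census_df = pd.DataFrame(data, columns=cols)

# isin() replaces the two REGION comparisons joined by `|`.
in_region = census_df['REGION'].isin([1, 2])

# The pattern "^Washington[a-z]*" matches any name that starts with
# "Washington", so a plain prefix test selects the same rows.
is_washington = census_df['CTYNAME'].str.startswith('Washington')

grew = census_df['POPESTIMATE2015'] > census_df['POPESTIMATE2014']

alt_df = census_df.loc[in_region & is_washington & grew]

print(alt_df)
```

On this sample it selects the same single Washington row as the program above.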
Gustav Rasmussen