
When my dataset contains all the variables, I can create a subset by selecting just the ones I need. But if one is missing, I get nothing. How can I handle such an error?

df = df_ori[['FINAL', 'DUE', 'ID', 'NAME', 'BUSINESS 1', 'TAX 2', 'COUNT']]

This works, and df exists, when all the variables are present in df_ori. But I want to go further and handle the potential error when one variable is missing.

if df.empty: print("Mandatory field(s) missing")

This doesn't work if, for instance, the field 'ID' is missing from the df_ori dataframe: execution never enters this "if".

There is no error, but no dataframe df is generated and "Mandatory field(s) missing" never appears.

Alex Andre
  • Welcome to stack overflow! Unfortunately your question is not entirely clear. Please take a look at [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and edit your question to include a [mcve], including sample input, sample output, and code for what you've tried so far. This will help us to help you better – G. Anderson Aug 26 '19 at 16:42

1 Answer

IIUC, use reindex with axis=1:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 10, (5, 5)), columns=[*'ABCDE'])

where,

df[['A','B','C','Z']]

generates KeyError: "['Z'] not in index"
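
Because the selection itself raises, a later `if df.empty:` check would never be reached. A minimal sketch of catching the exception instead, assuming a pandas version in which list-based selection with a missing label raises KeyError (the `subset` name and the message text are only illustrative):

```python
import numpy as np
import pandas as pd

# Toy frame with columns A..E, as in the example above
df = pd.DataFrame(np.random.randint(0, 10, (5, 5)), columns=[*'ABCDE'])

try:
    # 'Z' does not exist, so this selection raises KeyError
    subset = df[['A', 'B', 'C', 'Z']]
except KeyError as err:
    # Handle the failure here; no DataFrame was created by the selection
    subset = None
    print(f"Mandatory field(s) missing: {err}")
```

The exact wording of the exception message may differ between pandas versions, so it is safer not to parse it.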

Use,

df.reindex(['A','B','C','Z'], axis=1)

Output:

   A  B  C   Z
0  9  9  8 NaN
1  2  6  7 NaN
2  6  6  6 NaN
3  3  7  9 NaN
4  7  2  2 NaN
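
Note that reindex fills the absent column with NaN silently, so it does not report which fields were missing. If a message is wanted as well, one possible sketch is to compare the required names against the existing columns first. The names `df_ori`, `required`, `missing` and `subset` are hypothetical here, and the toy columns stand in for the real ones:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the original frame
df_ori = pd.DataFrame(np.random.randint(0, 10, (5, 5)), columns=[*'ABCDE'])

required = ['A', 'B', 'C', 'Z']  # hypothetical list of mandatory columns

# Required names that are absent from the original frame
missing = [col for col in required if col not in df_ori.columns]

if missing:
    print(f"Mandatory field(s) missing: {missing}")

# reindex keeps the requested order and fills absent columns with NaN
subset = df_ori.reindex(required, axis=1)
```

Whether to continue with NaN-filled columns or to stop when `missing` is non-empty depends on how strictly the fields are required.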
Scott Boston