I have a dataframe with some columns that I want to delete. I already know how to delete the columns whose names contain a given piece of text:

df.drop(columns=[col for col in df.columns if 'text.' in str(col)], inplace=True)

I would also like to delete the columns whose names contain more than one pattern somewhere in the full name, for example:

"text.Corolary.sub.ramdon.sta", "text.paint.ss1b.docto.not.sta"

I want to delete all the columns whose names contain both "text." and ".sta". How can I combine the two conditions in a single command, regardless of the rest of the name?

srgam
    I think there are better ways to drop columns with specified text than what you have. It would be helpful, though, if you provided a sample dataframe with your expected output. Kindly use this as a guide: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – sammywemmy Feb 28 '20 at 12:23

1 Answer

Build one boolean mask per pattern, chain the masks with & (element-wise AND), then invert the combined mask with ~ and use boolean indexing with DataFrame.loc to keep the remaining columns:

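# True for columns whose name contains "text"
m1 = df.columns.str.contains('text')
# True for columns whose name contains a literal ".sta" (raw string, escaped dot)
m2 = df.columns.str.contains(r'\.sta')
# alternative: literal match without regex
# m2 = df.columns.str.contains('.sta', regex=False)

# columns that match both conditions
mask = m1 & m2
# keep every column that is not in the mask
df = df.loc[:, ~mask]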
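
For a quick check, here is a minimal self-contained sketch of the same idea. The sample DataFrame is hypothetical: two column names are taken from the question, and "text.keep.me" and "other.sta" are invented here to show columns that should survive. It uses literal matching (regex=False) for both patterns, and also shows the same condition written in the list-comprehension style from the question.

```python
import pandas as pd

# Hypothetical sample frame: the first two names come from the question,
# the last two are made up to show columns that must be kept.
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=[
        "text.Corolary.sub.ramdon.sta",
        "text.paint.ss1b.docto.not.sta",
        "text.keep.me",
        "other.sta",
    ],
)

# Literal substring checks, so "." is not treated as a regex wildcard.
has_text = df.columns.str.contains("text.", regex=False)
has_sta = df.columns.str.contains(".sta", regex=False)

# Keep every column that does NOT satisfy both conditions.
result = df.loc[:, ~(has_text & has_sta)]

# The same condition in the list-comprehension style used in the question.
to_drop = [c for c in df.columns if "text." in str(c) and ".sta" in str(c)]
result_drop = df.drop(columns=to_drop)

print(list(result.columns))  # ['text.keep.me', 'other.sta']
```

Both variants drop only the columns that contain "text." and ".sta" at the same time; a column that matches just one of the two patterns is kept.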
jezrael