Drop columns that start with any of a list of strings in pandas

Question

I'm trying to drop all columns from a df that start with any of a list of strings. I needed to copy these columns to their own dfs, and now want to drop them from a copy of the main df to make it easier to analyze.

df.columns = ["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123", "DDD123"...]

I ran some code that gave me these dataframes with these columns:

aaa.columns = ["AAA1234", "AAA5678"]
bbb.columns = ["BBB1234", "BBB5678"]

I did get the final df that I wanted, but my code felt rather clunky:

droplist_cols = [aaa, bbb]
droplist = []
for x in droplist_cols:
    for col in x.columns:
        droplist.append(col)
df1 = df.drop(labels=droplist, axis=1)

Columns of final df:

df1.columns = ["CCC123", "DDD123"...]

Is there a better way to do this?

--Edit for sample data--

df = pd.DataFrame([[1, 2, 3, 4, 5], [1, 3, 4, 2, 1], [4, 6, 9, 8, 3], [1, 3, 4, 2, 1], [3, 2, 5, 7, 1]], columns=["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123"])

Desired result:

Please provide a small set of sample data as text that we can copy and paste. Include the corresponding desired result. Check out the guide on [how to make good reproducible pandas examples](https://stackoverflow.com/a/20159305/3620003). — timgeb, Jun 08 '20 at 20:33
Not sure I understand you. Is `[aaa, bbb]` a list of dataframes? — gosuto, Jun 08 '20 at 20:36

wwnde · Answer 1 · 2020-06-08T21:01:59.507


IIUC

Let's begin with a dataframe like this:

import pandas as pd
df = pd.DataFrame({"A": [0]})

Modify dataframe to include your columns

df2=df.reindex(columns=["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123", "DDD123"], fill_value=0)

Drop all columns whose names start with A:

df3=df2.loc[:,~df2.columns.str.startswith('A')]

If you need to drop columns starting with, say, A or B, I would combine the two masks with `|`:

df3=df2.loc[:,~(df2.columns.str.startswith('A')|df2.columns.str.startswith('B'))]
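The chained mask above needs one more `startswith(...)` term for every prefix. As a minimal sketch of an alternative, Python's built-in `str.startswith` accepts a tuple of prefixes, so the column names can be filtered in a list comprehension. The sample data is taken from the question; the names `prefixes` and `keep` are illustrative and not from the original posts.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [1, 3, 4, 2, 1], [4, 6, 9, 8, 3], [1, 3, 4, 2, 1], [3, 2, 5, 7, 1]],
    columns=["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123"],
)

# Illustrative name. Built-in str.startswith takes a tuple of prefixes, not a list.
prefixes = ("AAA", "BBB")

# Keep only the columns whose names start with none of the prefixes.
keep = [col for col in df.columns if not col.startswith(prefixes)]
df1 = df[keep]

print(list(df1.columns))  # ['CCC123']
```

Recent pandas releases are also reported to accept a tuple in `df.columns.str.startswith(...)`, but this sketch does not rely on that and should behave the same across versions.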

edited Jun 08 '20 at 21:01

answered Jun 08 '20 at 20:46


So the issue that I struggled with articulating is that when I get to the "startswith", I don't know how to drop columns that start with AAA and BBB but not CCC, DDD, etc. I tried something to the effect of: ```...startswith("AAA", "BBB")]``` But that didn't work. – Chris Matsuoka Jun 08 '20 at 20:53
See my edits. Not sure you can pass a list. Maybe... I just don't know how. See my edits for multiple. – wwnde Jun 08 '20 at 21:03
Yeah, I was running into that sort of problem as well. That's why I ended up using the nested for loops. It reads a little cleaner to me than a long list of ```startswith("A"), startswith("B")```, etc. I only used two dfs in my example, but I'm removing columns that I extracted into four different dfs, for a total of around 50 columns, so I wasn't about to manually type each column. haha – Chris Matsuoka Jun 08 '20 at 21:07
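For the case described in these comments, where the columns to drop are exactly the ones already copied into other dataframes, the nested loop from the question can be flattened into a single comprehension. This is a sketch under assumptions: `aaa` and `bbb` are rebuilt here from the question's sample data purely for illustration, and `extracted` is a made-up name.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [1, 3, 4, 2, 1], [4, 6, 9, 8, 3], [1, 3, 4, 2, 1], [3, 2, 5, 7, 1]],
    columns=["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123"],
)

# Assumed stand-ins for the dataframes the columns were copied into.
aaa = df[["AAA1234", "AAA5678"]].copy()
bbb = df[["BBB1234", "BBB5678"]].copy()

extracted = [aaa, bbb]  # illustrative name; add further dataframes here

# Flatten every extracted dataframe's column names into one list.
droplist = [col for part in extracted for col in part.columns]

# Drop those columns from the main dataframe; df itself is left unchanged.
df1 = df.drop(columns=droplist)

print(list(df1.columns))  # ['CCC123']
```

This avoids typing out column names or prefixes, at the cost of depending on the extracted dataframes still having their original column names.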
