I'm trying to drop all columns from a df that start with any of a list of strings. I needed to copy these columns to their own dfs, and now want to drop them from a copy of the main df to make it easier to analyze.

df.columns = ["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123", "DDD123"...]

I ran some code that gave me these dataframes with the following columns:

aaa.columns = ["AAA1234", "AAA5678"]
bbb.columns = ["BBB1234", "BBB5678"]

I did get the final df that I wanted, but my code felt rather clunky:

droplist_cols = [aaa, bbb]
droplist = []
for x in droplist_cols:
    for col in x.columns:
        droplist.append(col)
df1 = df.drop(labels=droplist, axis=1)

Columns of final df:

df1.columns = ["CCC123", "DDD123"...]

Is there a better way to do this?

--Edit for sample data--

import pandas as pd
df = pd.DataFrame([[1, 2, 3, 4, 5], [1, 3, 4, 2, 1], [4, 6, 9, 8, 3], [1, 3, 4, 2, 1], [3, 2, 5, 7, 1]], columns=["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123"])

Desired result:

   CCC123
0       5
1       1
2       3
3       1
4       1
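
As a rough sketch of how the nested loop could be condensed on this sample data: iterating over a DataFrame yields its column labels, so the columns of several frames can be chained into one drop list. The `aaa` and `bbb` frames below are rebuilt with `filter(regex=...)` purely for illustration, since the code that originally extracted them is not shown.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [1, 3, 4, 2, 1], [4, 6, 9, 8, 3], [1, 3, 4, 2, 1], [3, 2, 5, 7, 1]],
    columns=["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123"],
)

# Assumption: stand-ins for the frames that were extracted earlier.
aaa = df.filter(regex="^AAA")
bbb = df.filter(regex="^BBB")

# Iterating over a DataFrame yields its column labels,
# so chain() flattens the labels of every extracted frame.
droplist = list(chain(aaa, bbb))

# Drop the collected labels from a copy of the main frame.
df1 = df.drop(columns=droplist)
print(df1)
```

With four extracted frames, the same pattern would presumably only need the extra frames passed to `chain`.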

  • Please provide a small set of sample data as text that we can copy and paste. Include the corresponding desired result. Check out the guide on [how to make good reproducible pandas examples](https://stackoverflow.com/a/20159305/3620003). – timgeb Jun 08 '20 at 20:33
  • `df1 = df.drop([*aaa, *bbb], axis=1)` – cs95 Jun 08 '20 at 20:35
  • Not sure I understand you. Is `[aaa, bbb]` a list of dataframes? – gosuto Jun 08 '20 at 20:36
  • @jorijnsmit, yes, sorry, I'm having trouble explaining. – Chris Matsuoka Jun 08 '20 at 20:48

1 Answer

If I understand you correctly:

Let's begin with a dataframe like this:

df=pd.DataFrame({"A":[0]})

Modify the dataframe to include your columns:

df2=df.reindex(columns=["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123", "DDD123"], fill_value=0)

Drop all columns starting with A:

df3=df2.loc[:,~df2.columns.str.startswith('A')]

If you need to drop columns starting with, say, A or B, I would use:

df3=df2.loc[:,~(df2.columns.str.startswith('A')|df2.columns.str.startswith('B'))]
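
For a longer list of prefixes, one possible alternative (a sketch, not part of the approach above) is to rely on Python's built-in `str.startswith`, which accepts a tuple of prefixes, and filter the column labels in a list comprehension. The name `prefixes` is made up for this example, and the sketch assumes every column label is a string.

```python
import pandas as pd

# Same column layout as df2 above, built directly for a self-contained example.
df2 = pd.DataFrame(
    0,
    index=[0],
    columns=["AAA1234", "AAA5678", "BBB1234", "BBB5678", "CCC123", "DDD123"],
)

# Hypothetical name: the prefixes whose columns should be dropped.
# Python's built-in str.startswith accepts a tuple of candidates.
prefixes = ("AAA", "BBB")

# Keep only the labels that match none of the prefixes.
keep = [col for col in df2.columns if not col.startswith(prefixes)]
df3 = df2[keep]
print(df3.columns.tolist())
```

This keeps the prefix list in one place, so adding a third or fourth prefix means extending the tuple rather than chaining more `|` conditions.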
wwnde
  • So the issue that I struggled with articulating is that when I get to the "startswith", I don't know how to drop columns that start with AAA and BBB but not CCC, DDD, etc. I tried something to the effect of: `...startswith("AAA", "BBB")]` But that didn't work. – Chris Matsuoka Jun 08 '20 at 20:53
  • See my edits. Not sure you can pass a list. Maybe... I just don't know how. See my edits for multiple. – wwnde Jun 08 '20 at 21:03
  • Yeah, I was running into that sort of problem as well. That's why I ended up using the nested for loops. It reads a little cleaner to me than a long list of `startswith("A"), startswith("B")`, etc. I only used two dfs in my example, but I'm removing columns that I extracted into four different dfs, for a total of around 50 columns, so I wasn't about to manually type each column. haha – Chris Matsuoka Jun 08 '20 at 21:07