
I have many features that I need to vectorize and then apply functions to. Rather than manually copying each DataFrame, filtering it down, and applying my various functions, I'd rather dynamically create new DataFrames based on the values contained within specified columns.

Take the code below as an example. I'd like to split on Column B and create three new DataFrames: df_A, df_B, and df_C.

I've scoured a few dozen posts, but these are the closest I could find: "Create new dataframe in pandas with dynamic names also add new column". I wasn't able to get this to work; it throws an error at this line:

dict_of_df[key_name] = copy.deepcopy(df)
TypeError: unhashable type: 'numpy.ndarray'

https://datascience.stackexchange.com/questions/29825/create-new-data-frames-from-existing-data-frame-based-on-unique-column-values I don't want to print lists; I want actual DataFrames.

Here's some different code I put together, but it throws an error that seems to involve the range function, and I'm not sure why:

import pandas as pd

data = {'Column A': [100,200,300,400,500],
        'Column B': ["A","A","B","B","C"]}
df = pd.DataFrame(data, columns=['Column A','Column B'])

df

for i in range(len(df['Column B'].unique())):
    for item in df['Column B'].unique():
        new_df[i] = df[df['Column B'] == item]
new_df

ValueError: Wrong number of items passed 2, placement implies 1
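A possible reading of this error, offered as an assumption because the snippet never defines `new_df`: if `new_df` was an existing DataFrame, then `new_df[i] = ...` tries to store a two-column frame in a single column, which pandas rejects. The nested loops would also overwrite each slot with the last group. A minimal sketch of the same filtering idea with one loop and a plain dictionary follows; the name `frames` and the `df_` key prefix are hypothetical.

```python
import pandas as pd

data = {'Column A': [100, 200, 300, 400, 500],
        'Column B': ["A", "A", "B", "B", "C"]}
df = pd.DataFrame(data, columns=['Column A', 'Column B'])

# Hypothetical container: one filtered DataFrame per unique value in Column B.
frames = {}
for item in df['Column B'].unique():
    # Boolean mask keeps only the rows whose Column B equals this value.
    frames['df_' + item] = df[df['Column B'] == item]

print(frames['df_A'])
```

Each entry, such as `frames['df_B']`, is an ordinary DataFrame, so any function can be applied to it directly.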

EDIT: The solution in the duplicate post that @jezrael linked in the comments met my need:

for i, x in df.groupby('Column B'):
    globals()['dataframe' + i] = x 
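For comparison, here is a minimal sketch of the dictionary approach that @jezrael suggested in the comments, using the same sample data. It assumes only that `groupby` on a single column name yields `(key, sub-DataFrame)` pairs; the printing loop is illustrative.

```python
import pandas as pd

data = {'Column A': [100, 200, 300, 400, 500],
        'Column B': ["A", "A", "B", "B", "C"]}
df = pd.DataFrame(data, columns=['Column A', 'Column B'])

# groupby yields (key, sub-DataFrame) pairs; dict() turns them into a lookup
# keyed by the unique values of Column B.
d1 = dict(tuple(df.groupby('Column B')))

for key, frame in d1.items():
    # Each value is a real DataFrame, not a printed list.
    print(key, frame.shape)
```

Keeping the frames in a dictionary avoids writing into `globals()`, so the set of groups can be iterated over or passed to a function without knowing the variable names in advance.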
June
  • I think `d1 = dict(tuple(df.groupby('Column B')))` should help here. – jezrael May 15 '18 at 08:41
  • I don't see how these questions relate. The link you attached refers to grouping a dataframe. I want to create new dataframes dynamically as the title and description indicates. – June May 15 '18 at 08:44
  • So sorry, the duplicate target was changed to [this](https://stackoverflow.com/questions/45114945/how-do-i-create-dynamic-variable-names-inside-a-loop-in-pandas), the second solution with `globals`. But the dictionary approach is still better. – jezrael May 15 '18 at 08:47
  • You're a rockstar. You have no idea how much time you have saved me going forward! I implemented your solution in the link you provided. I'll update this post with that solution. – June May 15 '18 at 09:00
  • Glad I could help! :) – jezrael May 15 '18 at 09:01
