
I have a DataFrame df_main with the following columns:

ID  Category  Time   Status  XYZ
1   A         value  value   value
2   B         value  value   value
3   C         value  value   value
4   D         value  value   value
5   E         value  value   value

Using the following code, I have created new DataFrames based on the categories in the table. I built a dictionary of DataFrames named df_A, df_B, df_C, ... and stored in each one the row whose Category matches its name. So df_A holds the row from df_main whose Category value is "A".

Code:

dict_of_df = {} # initialize empty dictionary

i=0
for index, row in df_main.iterrows():
    
    if i<5:
        newname = df_main['Category'].values[i]
        dict_of_df["df_{}".format(newname)] = row
        
    i=i+1

I want to print the dataframes by their dataframe name, and not by iterating the dictionary. It should be like this:

print(df_A)
print(df_B)
print(df_C)
print(df_D)
print(df_E)

How can I achieve this? A solution without using a dictionary would work too. Any solution is fine as long as I am able to store a row of a specific Category in a new Dataframe specific to Category Name and print it using the Dataframe name.

Let me know if more details are required.

Edit:

This link is somewhat similar to my use case: Using a string variable as a variable name (https://stackoverflow.com/questions/11553721/using-a-string-variable-as-a-variable-name)

I wanted to be specific to dataframes, as my end goal was to print the dataframes by their names.

The method in the answers to that link targets plain variables; for DataFrames it would need a different solution built on exec.

The idea behind this code is to include it in Power BI. The Python script data source in Power BI accepts DataFrames as tables, so I have to declare or print each DataFrame in the code.

ParshvaShah
  • Does this answer your question? [Using a string variable as a variable name](https://stackoverflow.com/questions/11553721/using-a-string-variable-as-a-variable-name) – nidabdella Apr 19 '21 at 21:02
  • *Use the dict*? `print(dict_of_df["df_A"])`??? – juanpa.arrivillaga Apr 19 '21 at 21:09
  • What's wrong with using a dict? Also, to shorten your code, try: `dfs = {f"df_{cat}" : grp for cat,grp in df.groupby('Category')}` – Umar.H Apr 19 '21 at 21:10
  • If you don't mind writing something like `print(dict_of_df.df_A)`, this question has plenty of options for you: https://stackoverflow.com/questions/3031219/recursively-access-dict-via-attributes-as-well-as-index-access IMHO, I would use the last answer, which provides a function to convert a Python dict into a helper object called PropertyTree. – wensiso Apr 19 '21 at 21:31
  • @nidabdella no, I think it does not answer it. – ParshvaShah Apr 19 '21 at 21:39
  • @juanpa.arrivillaga yes, that answers it. I had tried this but it did not work then, must have been a small mistake. I wanted to convert the dict.values back to dfs which is easily doable. – ParshvaShah Apr 19 '21 at 21:39
  • @Umar Nothing is wrong with using a dict. Thanks for the groupby suggestion. – ParshvaShah Apr 19 '21 at 21:39
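
The dictionary route suggested in the comments above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the sample df_main is a hypothetical stand-in for the real data, and `types.SimpleNamespace` is just one standard-library way to get attribute access such as `dfs.df_B` (it is not the PropertyTree helper from the linked answer).

```python
from types import SimpleNamespace

import pandas as pd

# Hypothetical stand-in for df_main; the real values come from the data source.
df_main = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Category": ["A", "B", "C", "D", "E"],
    "Time": ["value"] * 5,
    "Status": ["value"] * 5,
    "XYZ": ["value"] * 5,
})

# One DataFrame per category, keyed "df_A", "df_B", ...
dict_of_df = {f"df_{cat}": grp for cat, grp in df_main.groupby("Category")}

# Look a DataFrame up by its key...
print(dict_of_df["df_A"])

# ...or wrap the dictionary so that each key becomes an attribute.
dfs = SimpleNamespace(**dict_of_df)
print(dfs.df_B)
```

Because `groupby` yields DataFrames, each entry stays a DataFrame even when a category has several rows.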

1 Answer


Replace the data source below with your own:

import pandas as pd

df_main = pd.read_excel('main.xlsx') # use data source as per your requirement 

dict_of_df = {}  # initialize empty dictionary

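i = 0
for index, row in df_main.iterrows():
    if i < 5:  # only the first five rows are stored
        print('df_'+row['Category'])  # prints the generated name, not the row
        newname = df_main['Category'].values[i]
        dict_of_df["df_{}".format(newname)] = row  # row is a pandas Series

    i = i + 1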
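
If the names really have to exist as standalone variables, as the Power BI scenario in the question suggests, one possible approach is to copy the dictionary entries into the module namespace with `globals()`. The following is a sketch under stated assumptions rather than the method of this answer: the sample df_main is hypothetical, each row is converted to a one-row DataFrame with `row.to_frame().T` (since `iterrows()` yields Series), the code runs at script level, and every Category value must form a valid Python identifier once prefixed with `df_`.

```python
import pandas as pd

# Hypothetical stand-in for df_main.
df_main = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Category": ["A", "B", "C", "D", "E"],
    "Time": ["value"] * 5,
    "Status": ["value"] * 5,
    "XYZ": ["value"] * 5,
})

dict_of_df = {}
for _, row in df_main.iterrows():
    # iterrows() yields each row as a Series; to_frame().T turns it
    # into a one-row DataFrame with the original column names.
    dict_of_df["df_{}".format(row["Category"])] = row.to_frame().T

# Bind every dictionary key as a module-level variable name.
globals().update(dict_of_df)

print(df_A)
print(df_B)
```

Keeping the dictionary as the single source of truth and binding names only at the end makes it easy to drop the `globals()` step wherever it is not needed.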
Hietsh Kumar
  • Why do you recommend changing it? Plus I cannot change it, as the data source keeps on updating and I will be putting this code in Power BI which might increase the processing time. – ParshvaShah Apr 20 '21 at 11:36
  • I used an Excel file with pd.read_excel('main.xlsx'); in place of Excel you can use any other source your setup requires (a database, an API, etc.). It won't make any difference. – Hietsh Kumar Apr 20 '21 at 11:39
  • my data source is SQL. – ParshvaShah Apr 20 '21 at 12:12
  • df = pd.read_sql_table('TableName', connection). See https://www.geeksforgeeks.org/read-sql-database-table-into-a-pandas-dataframe-using-sqlalchemy/ and https://www.dataquest.io/blog/python-pandas-databases/ – Hietsh Kumar Apr 20 '21 at 12:19
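
For the SQL source discussed in the comments above, loading df_main might look like the sketch below. Everything specific here is an assumption for illustration: an in-memory SQLite database stands in for the real server, the table name `main_table` and its columns are hypothetical, and `pd.read_sql_query` is used because it accepts a plain `sqlite3` connection (`pd.read_sql_table` needs SQLAlchemy).

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the real SQL source (assumption).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE main_table (ID INTEGER, Category TEXT, Status TEXT)")
con.executemany(
    "INSERT INTO main_table VALUES (?, ?, ?)",
    [(1, "A", "value"), (2, "B", "value"), (3, "C", "value")],
)
con.commit()

# Read the whole table into df_main; the per-category split works on it unchanged.
df_main = pd.read_sql_query("SELECT * FROM main_table", con)
con.close()

print(df_main)
```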