0

There are multiple tables in SQL DB based on countries where I need to extract data from. How can I extract multiple data in one script from multiple tables to use in a dataframe to multiple outputs based on the countries. How can I call this list in the dataframe for the output?

I am merging different results at the end to export the file. Currently its based on one country only (US). I'm trying to combine other countries results in one script to export all together.

 country_name = ["US", "Canada"]  #trying to extract data from different 

SQL tables which is saved in a format target_US_category

 query = """SELECT * FROM DB.[Appservices\DBPowerUsers]. 
 [target_""" + country_name +"""_category];""" 

 df= pd.read_sql(query, engine)


 def summary_tables (df):

female = df[df['sex1']=='Female'].shape[0]
male = df[df['sex1']=='Male'].shape[0]

category = pd.DataFrame([
    ['Male', male, (round(((male/(df.shape[0]))*100),2))],
    ['Female', female, (round(((female/(df.shape[0]))*100),2))]],
    columns = ['category', 'Amt', 'Percent'])
return(category)

  def summary_tables (df1):
  female = df1[df1['sex1']=='Female'].shape[0]
  male = df1[df1['sex1']=='Male'].shape[0]

category1 = pd.DataFrame([
    ['Male', male, (round(((male/(df1.shape[0]))*100),2))],
    ['Female', female, (round(((female/(df1.shape[0]))*100),2))]],
    columns = ['category', 'Amt', 'Percent'])
return(category1)

 df_merge = pd.merge(summary_tables (df),summary_tables 
  (df1), on='category')
Josh
  • 57
  • 4

1 Answers1

0

Further to our comments, should look similar as below. Difficult to write correctly without test data.

grade_dict = {
'CL' : '1',
'CL' : '2',
'CL' : '3',
'UNDEF' : 'na',
'DEFAULT' : 'na',
'No known' : 'na'}

def summary_tables (df):
# your fucntion code here

country_name = ["US", "India", "Canada"]  # check your list, countries must be in ""

df_dict = {}  # create empty dictionary and load each table to it during the loop
for item in country_name:
    query = """SELECT *
        FROM CustomerDB.[Appservices\CustomerDBPowerUsers].[target_""" + item + """_population];"""
    df_dict[item] = df = pd.read_sql(query, engine)
    df_dict[item]['pos'] = df_dict[item]['seq'].map(grade_dict)
    df_dict[item]['emp'] = df_dict[item]['seq'].map(grade_dict)
    summary_tables(df_dict[item]) # passing dataframe to your function

# if all tables for each country have the same columns, - all df's from dictionary will be placed into one large one called df_merge
df_merge = pd.concat(df_dict.values(), ignore_index=True)
NoobVB
  • 989
  • 6
  • 10
  • Thank you for the idea, which is awesome. Is it possible to define a function based on this dictionary list? e.g. Currently, I have created a view based on multiple conditions as def create_summary_tables (US): Can I combine this dictionary list here? If yes, can you please share your thoughts how to do it and return multiple functions all together? – Josh Sep 02 '22 at 17:21
  • Sadly, I am not experienced enough and not sure if all transformations are possible directly in the dictionary, - most likely yes (at least the ones I do, do work all). so just test something like `def create_summary_tables (df_dict ['US'])` , not sure about the ending of the function, should it give `return df_dict[df]` or just `return df`. Talking about multiple functions in one go, - you should be more specific, please update your question with some code examples and probably examples of data. – NoobVB Sep 02 '22 at 17:43
  • Another problem I'm running into is - while creating a list country_name = ["US", "India", "Canada"] I'm getting an error saying "can only concatenate str (not "list") to str" ... where each table name is defined in SQL as "target_US_population" – Josh Sep 02 '22 at 18:28
  • Sorry mate, - you must share more code, at least the part which caused the error... At the moment I can only guess row `country_name = ["US", "India", "Canada"]` is not causing it as it is really simple one, but something else does... – NoobVB Sep 02 '22 at 18:41
  • or try the options from here https://stackoverflow.com/questions/902408/how-to-use-variables-in-sql-statement-in-python – NoobVB Sep 02 '22 at 18:45
  • Sorry, I had a mistake in the code logic, - updated my answer, - lets hope it works – NoobVB Sep 02 '22 at 18:51
  • I have updated my question in more detail. Please let me know if you need more clarification – Josh Sep 02 '22 at 19:13
  • check my answer again, - I can only hope it will work from the first time, - but you will get the idea and will be able to adjust. I used concat at the end, presuming all tables for each country have the same columns – NoobVB Sep 02 '22 at 19:48
  • This is great! Thanks for helping out! How can I extract different outputs for US, India & Canada? – Josh Sep 03 '22 at 00:16