0

I am using python and PSQL to extract data but what I would want to do is to retrieve data from a specific table using a list that contains the name of desired columns.

*args could have any number of names inside it so I was wondering how to do it.

Function to read table with a given query:

def read_table_data(table_name, query, conn):
    table_name = pd.read_sql_query(query, conn)
    return table_name

Function to extract the data:

def extract_variable_table_many_columns(table, *args):   
    df = pd.DataFrame()
    col = []
    for c in args:
        print(c)
        col.append(c)
    query_extract_variables = "SELECT {} FROM {};".format(str(col),str(table))
    df1 = read_table_data("{}".format(str(table)), query_extract_variables , conn)
    df = df.append(df1)
    return df
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Mostafa Alishahi
  • 320
  • 5
  • 13

2 Answers2

0

I want to retrieve data from a specific table using a list that contains the names of desired columns.

So you want to dynamically construct your actual SQL statement, not just the usual parameter substitution. Since using string concatenation to build a SQL query is disaster waiting to happen, you should use the psycopg2.sql module. It provides a safe way to use identifiers (i.e. column names) from variables.

For more details see Passing table name as a parameter in psycopg2.

SebDieBln
  • 3,303
  • 1
  • 7
  • 21
  • thanks for your helpful reply, I just had a quick look at the link and the module but as you can see in *args the number of variables is different. I was wondering if this could be done via psycopg2.sql module? – Mostafa Alishahi Dec 31 '21 at 13:24
  • @MostafaAlishahi That should be possible using the [join method of SQL](https://www.psycopg.org/docs/sql.html#psycopg2.sql.SQL.join). The example in the documentation does pretty much what you need: Make identifiers out of strings and join these to give a list of identifiers, that can then be inserted into the actual query. – SebDieBln Dec 31 '21 at 17:12
0

Thanks again @SebDieBln, I was able to solve it as the following:

 query = sql.SQL("select {fields} from {table}").format(fields=sql.SQL(',').join([sql.Identifier(n) for n in columns]),table=sql.Identifier(table))
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Mostafa Alishahi
  • 320
  • 5
  • 13