
I want to pass a local DataFrame to SQL Server as a table so I can inner join against it, like so:

sql = """ 
select top 10000 * 
from Table1 as t
inner join {} as a on t.id= a.id
""".format(pandas_df)

results = pd.read_sql_query(sql,conn)

This is obviously not the way to do it. Any ideas?

Thanks!

Logica
Edi Itelman
  • Does this answer your question? [How to insert pandas dataframe via mysqldb into database?](https://stackoverflow.com/questions/16476413/how-to-insert-pandas-dataframe-via-mysqldb-into-database) – dspencer Apr 14 '20 at 11:12
  • Two options, 1. create a dataframe from `Table1` and do `pd.merge(df1, df2, how='inner', on='id')` 2. write `pandas_df` into the db as a table and do inner join in the database, get results as df. – ywbaek Apr 14 '20 at 13:03
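
A minimal sketch of the first option from the comment above (join locally with `pd.merge`), assuming `conn` is the open connection from the question and `pandas_df` has an `id` column; note this pulls `Table1` over the network, so it only makes sense when that table is small:

import pandas as pd

# Pull the server-side table (or just the columns you need) into pandas,
# then perform the inner join locally instead of on the server.
table1_df = pd.read_sql_query("select * from Table1", conn)
results = pd.merge(table1_df, pandas_df, how="inner", on="id")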

3 Answers


You can use `df.to_sql` to load the DataFrame into the database as a table, then run the join on the server.
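
A minimal sketch of that approach, assuming a SQLAlchemy engine for the same SQL Server database; the connection string and the table name `ids_tmp` are placeholders, not part of the original answer:

import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string; adjust user, password, server, database and driver.
engine = create_engine(
    "mssql+pyodbc://user:password@server/database?driver=ODBC+Driver+17+for+SQL+Server"
)

# Upload the local DataFrame as a table on the server.
pandas_df.to_sql("ids_tmp", engine, if_exists="replace", index=False)

# Now the join runs entirely on the server.
sql = """
select top 10000 t.*
from Table1 as t
inner join ids_tmp as a on t.id = a.id
"""
results = pd.read_sql_query(sql, engine)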

Kalana

You need to write your DataFrame to a SQL table before you can join against it.

Use pandas_df.to_sql(name_of_table, con); note that to_sql is a method on the DataFrame itself, not on pd.
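
For example (a sketch, assuming `con` is a SQLAlchemy engine or connection; the table name is just an example):

# to_sql is called on the DataFrame, not on pd
pandas_df.to_sql("pandas_df_table", con, if_exists="replace", index=False)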

Henrique Branco

I see two main options, depending on the data size of your ids. The simplest is to put the ids in an IN clause in your SQL statement. This approach is useful if you don't have write permission on the database, but you're limited by the maximum SQL batch size, which iirc is around 256 MB.

From your id series, create a tuple of the ids you're interested in, then cast the tuple to a string and concatenate it with your SQL statement.

sql = """ 
select top 10000 * 
from Table1 as t
where t.id in """ + str(tuple(pandas.df['id'].values))

results = pd.read_sql_query(sql,conn)
el_oso