6

I want to use a WHERE statement with two variables within the where clause. I've done research on this looking at how to use variables in SQL statements in Databricks and Inserting Variables Using Python, Not Working. I've tried to implement the solutions provided but it's not working.

a= 17091990
b = 30091990

df = spark.sql(' SELECT * FROM table WHERE date between "a" AND "b" ')
Jozamvg
  • 167
  • 2
  • 2
  • 12
  • Whats the error that you are getting? Maybe because of the date format? Date format should be separated by dash(-), somthing like this I Think A = '09-17-1990' and B = '09-30-1990' – Buchiman Aug 20 '19 at 01:17
  • Can you clarify how it is "not working". Can you provide some sample data and also show what the expected vs actual output is? – Dijkgraaf Aug 20 '19 at 01:17
  • @RonelCalinisan No, the value its an int – Jozamvg Aug 20 '19 at 01:39
  • @Dijkgraaf I dont know how to do it. Ive been trying a different ways to use the variables in the query but doesnt work – Jozamvg Aug 20 '19 at 01:40
  • @Jozamvg so your date there is int and not Date? – Buchiman Aug 20 '19 at 01:42
  • @Jozamvg try reading this. [Link](https://stackoverflow.com/questions/902408/how-to-use-variables-in-sql-statement-in-python) – Buchiman Aug 20 '19 at 01:44

1 Answers1

4

You can use python's formatted string literals

df = spark.sql(f"SELECT * FROM table WHERE date between {a} AND {b} ")

For more about formatted string literals you can refer to https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep498

anshul_cached
  • 684
  • 5
  • 18
  • Strongly discourage the direct manipulation of strings to build a query, which is vulnerable to SQL injection. Use parameters instead: https://docs.databricks.com/dev-tools/python-sql-connector.html – bruno.braga Oct 27 '22 at 00:30