I am new to Synapse Analytics and I want to create a notebook that can later be run as a notebook activity in a pipeline, in order to process the data in one of the tables from the database.
I want to do everything in PySpark, so I am wondering how to read the data from the existing database using the notebook. NOTE: I don't want to use the method where I have to put my password etc. in the code.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("SynapseAnalyticsDemo").getOrCreate()
# Set up Synapse Analytics credentials
synapse_servername = "<synapse_servername>.sql.azuresynapse.net"
synapse_database = "<synapse_database>"
synapse_username = "<synapse_username>"
synapse_password = "<synapse_password>"
synapse_jdbc_url = f"jdbc:sqlserver://{synapse_servername}:1433;database={synapse_database};user={synapse_username};password={synapse_password}"
# Define the SQL query and table name
table_name = "<sql_pool_table>"
query = f"SELECT * FROM {table_name} WHERE some_column = 'some_value'"
# Read the query result into a DataFrame over JDBC
df = (spark.read
      .format("jdbc")
      .option("url", synapse_jdbc_url)
      .option("query", query)
      .load())
This is the only method I have found, but taking into account that many people will be able to see the notebook, I don't want to use this method, where I have to write the password in plain text.
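Ideally I would use token-based authentication instead. From what I can tell, mssparkutils in a Synapse notebook can fetch an Azure AD token, and the SQL Server JDBC driver accepts an accessToken connection property, so I imagine something like the sketch below might avoid the password entirely. I have not confirmed this works; the "DW" audience name and the pass-through of the accessToken option are my assumptions from the docs:

from pyspark.sql import SparkSession
from notebookutils import mssparkutils  # available in Synapse Spark pools

spark = SparkSession.builder.getOrCreate()
# Assumption: "DW" is the token audience for the dedicated SQL pool
token = mssparkutils.credentials.getToken("DW")

df = (spark.read
      .format("jdbc")
      .option("url", "jdbc:sqlserver://<synapse_servername>.sql.azuresynapse.net:1433;database=<synapse_database>")
      .option("accessToken", token)  # assumption: forwarded to the MSSQL JDBC driver
      .option("query", "SELECT * FROM <sql_pool_table> WHERE some_column = 'some_value'")
      .load())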
I have also tried this way, but it does not work:
%%pyspark
df = spark.sql("select * from TableName")
df.show()
In the second method I wrote the table name exactly as it appears in SQL Server Management Studio, and then I tried the name of the Azure Synapse dedicated SQL pool instead, but neither worked.
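My guess is that spark.sql() only sees tables registered in the Spark metastore (the lake databases), not the tables inside the dedicated SQL pool. If that is the case, maybe the built-in dedicated SQL pool connector is what I need, since it apparently authenticates with Azure AD instead of a password. On a Spark 3 runtime I would expect the usage to be roughly like this (the import path, the Constants.SERVER option, and the three-part table name are how I understood the connector docs, so treat this as a sketch):

%%pyspark
# The import adds a synapsesql() method to spark.read (Synapse Spark 3 runtimes)
import com.microsoft.spark.sqlanalytics
from com.microsoft.spark.sqlanalytics.Constants import Constants

df = (spark.read
      # Supposedly optional when the pool is in the same workspace
      .option(Constants.SERVER, "<synapse_servername>.sql.azuresynapse.net")
      # Three-part name: <database>.<schema>.<table>
      .synapsesql("<synapse_database>.dbo.<sql_pool_table>"))
df.show()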
How can I access the Azure Synapse dedicated SQL pool using the notebook (PySpark)?