
I have a script in a Jupyter notebook that creates interactive graphs for whatever dataset it is given. I then convert the output to an HTML file without the inputs to create a report for that dataset, which I share with my colleagues. I have also used papermill to parametrize the process, so that I pass it the name of the file and it creates a report for me. All the datasets are stored in Azure Data Lake.

This is all very easy on my local machine, but I want to automate the process so that a report is generated for new incoming datasets every hour and the HTML outputs are stored in Azure Data Lake. I want to run this automation in the cloud.

I initially began with Automation accounts, but I did not know how to execute a Jupyter notebook in an Automation account, or where to store my .ipynb file. I have also looked at the JupyterHub server (VM) on Azure, but I am unable to understand how to automate that either.

Can anyone help me with a way to automate the entire process on Azure in the cheapest way possible? I have to generate a lot of reports.

Thanks!

Rebe
  • Azure has tools for setting up papermill workflows out of the box: https://learn.microsoft.com/en-us/sql/azure-data-studio/notebooks/parameterize-papermill?view=sql-server-ver15. If you're looking for a DIY setup (cheaper and harder), that's way too large a scope for a Stack Overflow question. Good luck! – Michael Delgado Feb 08 '22 at 22:01
  • Thanks @MichaelDelgado, I have already looked at Azure Data Studio and notebooks, but I am confused about how to run it in the Azure cloud, e.g. automating it with Automation accounts or time-triggered functions. That is what I am unable to structure. Do you have any idea? – Rebe Feb 09 '22 at 09:35

1 Answer


Apart from Automation, you can use Azure Functions as mentioned in this document:

· To run a PowerShell-based Jupyter Notebook, you can use PowerShell in an Azure function to call the Invoke-ExecuteNotebook cmdlet. This is similar to the technique described above for Automation jobs. For more information, see Azure Functions PowerShell developer guide.

· To run a SQL-based Jupyter Notebook, you can use PowerShell in an Azure function to call the Invoke-SqlNotebook cmdlet. For more information, see Azure Functions PowerShell developer guide.

· To run a Python-based Jupyter Notebook, you can use Python in an Azure function to call papermill. For more information, see Azure Functions Python developer guide.
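For the Python case in the last bullet, the work inside the function could look roughly like the sketch below. It is only an illustration under stated assumptions, not the documented method: the template name `report_template.ipynb`, the notebook parameter `dataset_name`, the working directory, and the helpers `report_paths` and `build_report` are all hypothetical. It executes the notebook with papermill and exports the result to HTML without input cells using nbconvert. Downloading the dataset from the data lake and uploading the finished HTML are left out.

```python
from pathlib import Path


def report_paths(dataset_name: str, work_dir: str = "/tmp/reports") -> tuple:
    """Return (executed notebook path, HTML report path) for one dataset."""
    stem = Path(dataset_name).stem  # "sales.csv" -> "sales"
    base = Path(work_dir)
    return str(base / f"{stem}.ipynb"), str(base / f"{stem}.html")


def build_report(dataset_name: str,
                 template: str = "report_template.ipynb",  # hypothetical name
                 work_dir: str = "/tmp/reports") -> str:
    """Run the parametrized notebook and export it to HTML without inputs."""
    # Imported here so the path helper stays usable without these packages.
    import papermill as pm
    from nbconvert import HTMLExporter

    executed, html_path = report_paths(dataset_name, work_dir)
    Path(executed).parent.mkdir(parents=True, exist_ok=True)

    # Inject the dataset name into the notebook's "parameters" cell and run it.
    # "dataset_name" is an assumed parameter name; use whatever the notebook defines.
    pm.execute_notebook(template, executed,
                        parameters={"dataset_name": dataset_name})

    # exclude_input=True drops the code cells, like `nbconvert --no-input`.
    exporter = HTMLExporter(exclude_input=True)
    body, _resources = exporter.from_filename(executed)
    Path(html_path).write_text(body, encoding="utf-8")
    return html_path
```

A timer-triggered Azure Function could call `build_report` once per new dataset. Azure Functions timer schedules use six-field NCRONTAB expressions, so `0 0 * * * *` fires at the top of every hour. The function app would also need papermill, nbconvert, and a Jupyter kernel (for example ipykernel) in its requirements.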

References: "Run Jupyter Notebook on the Cloud in 15 mins #Azure" by Anish Mahapatra (Towards Data Science); "How to run a Jupyter notebook with Python code automatically on a daily basis?" (Stack Overflow); and "Scheduled execution of Python script on Azure" (Stack Overflow).

Madhuraj Vadde
  • Thanks for your detailed answer! I will try it out. – Rebe Mar 03 '22 at 09:32
  • Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members. – Madhuraj Vadde Mar 06 '22 at 06:44