I want to install a library on my Azure Databricks cluster, but I cannot use the UI method: my cluster changes every time, and during the transition I cannot add a library to it through the UI. Is there a Databricks utility command for doing this?
Have you tried the Databricks [libraries CLI](https://docs.databricks.com/dev-tools/cli/libraries-cli.html)? You could then install the library from DBFS. – jose praveen Mar 06 '20 at 04:43
2 Answers
@CHEEKATLAPRADEEP-MSFT's answer is awesome! Just a complement:
If you want all your notebooks / clusters to have the same libs installed, you can take advantage of cluster-scoped or global (new feature) init scripts.
The example below retrieves packages from PyPI:
#!/bin/sh
# Install dependencies
pip install --upgrade boto3 psycopg2-binary requests simple-salesforce
You can even use a private package index - for example AWS CodeArtifact:
# Install AWS CLI
pip install --upgrade awscli
# Configure pip
aws codeartifact login --region <REGION> --tool pip --domain <DOMAIN> --domain-owner <AWS_ACCOUNT_ID> --repository <REPO>
pip config set global.extra-index-url https://pypi.org/simple
Note: the cluster instance profile must be allowed to get CodeArtifact credentials (arn:aws:iam::aws:policy/AWSCodeArtifactReadOnlyAccess).
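If the package list changes often, the PyPI init script above can be generated rather than edited by hand. This is only a minimal sketch: `build_init_script` and the DBFS path are hypothetical names chosen for illustration, and where init scripts should be stored depends on your workspace and runtime version.

```python
import shlex


def build_init_script(packages):
    """Return the text of an init script that pip-installs `packages`."""
    if not packages:
        raise ValueError("at least one package is required")
    # Quote each name so unusual characters cannot break the shell command.
    quoted = " ".join(shlex.quote(p) for p in packages)
    return "#!/bin/sh\n# Install dependencies\npip install --upgrade " + quoted + "\n"


script = build_init_script(
    ["boto3", "psycopg2-binary", "requests", "simple-salesforce"]
)

# `dbutils` is predefined only inside a Databricks notebook; elsewhere this
# branch is skipped. The target path is an assumption, not a required location.
if "dbutils" in globals():
    dbutils.fs.put("dbfs:/databricks/init-scripts/install-libs.sh", script, True)
```

The script still has to be registered as a cluster-scoped or global init script before it runs on cluster start.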
Cheers

Follow up question. How do you configure the instance to get the AWS credentials? – pmanDS May 11 '22 at 06:50
@pmanDS We currently attach an "Instance profile" in the Advanced options of the cluster configuration page. – saza Jul 19 '22 at 01:14
@CHEEKATLAPRADEEP's answer? I don't see that here. Are you referring to some other post, or perhaps it was deleted? – StatsStudent Dec 16 '22 at 08:33
That answer was removed by a Stack Overflow moderator and can't be restored until another moderator chimes in. – Alex Ott Feb 23 '23 at 15:10
You can use the %pip install command to install the required libraries from within your notebook code. This documentation provides further detail on its usage: https://docs.databricks.com/libraries/notebooks-python-libraries.html. For example:
%pip install requests
For older runtimes there was the dbutils.library utility (https://docs.databricks.com/dev-tools/databricks-utils.html#dbutils-library), but it has been deprecated.
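For the scenario in the question, where the cluster keeps changing and the UI is not an option, another route is to call the Databricks Libraries REST API from a script. As far as I know this is the same API the libraries CLI mentioned in the comments uses. The sketch below only builds the request; the workspace URL, token and cluster ID are placeholders, and `build_install_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_install_request(host, token, cluster_id, packages):
    """Build a POST request asking Databricks to install PyPI packages on a cluster."""
    payload = {
        "cluster_id": cluster_id,
        "libraries": [{"pypi": {"package": p}} for p in packages],
    }
    return urllib.request.Request(
        url=host.rstrip("/") + "/api/2.0/libraries/install",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: substitute your own workspace URL, token and cluster ID.
req = build_install_request(
    "https://adb-0000000000000000.0.azuredatabricks.net",
    "<TOKEN>",
    "<CLUSTER_ID>",
    ["requests"],
)

# Uncomment to actually send the request:
# urllib.request.urlopen(req)
```

Because the cluster ID is just a parameter, the same call can be repeated for whichever cluster is current.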
