
Is there a way to access data such as the repo URL and branch name from inside a notebook within a Repo? Perhaps something in dbutils?

Alex Ott
  • 80,552
  • 8
  • 87
  • 132

1 Answer


You can use the Repos API for that, specifically the Get command. Extract the notebook path from the notebook context available via dbutils, then make two queries:

  1. Get the repo ID by path via the Workspace API (a repo path always consists of three components: /Repos, a directory (per-user or custom), and the repository name)
  2. Fetch the repo data via the Repos API using that ID

Something like this:

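import json
import requests

# Notebook context, serialized to JSON and parsed into a dict
ctx = json.loads(
  dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson())

notebook_path = ctx['extraContext']['notebook_path']
# Repo root is /Repos/<directory>/<repo name>: the first 4 items of the split (the first is empty)
repo_path = '/'.join(notebook_path.split('/')[:4])
api_url = ctx['extraContext']['api_url']
api_token = "your_PAT_token"  # placeholder: substitute your own personal access token
headers = {"Authorization": f"Bearer {api_token}"}

# 1. Resolve the repo directory to its object ID via the Workspace API
repo_dir_data = requests.get(f"{api_url}/api/2.0/workspace/get-status",
                             headers=headers,
                             params={"path": repo_path}).json()
repo_id = repo_dir_data['object_id']
# 2. Fetch the repo metadata via the Repos API
repo_data = requests.get(f"{api_url}/api/2.0/repos/{repo_id}",
                         headers=headers).json()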
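To get from the response to the two values the question asks about, a minimal sketch follows. It assumes the Repos API Get response carries `url` and `branch` fields; the helper names `repo_root` and `repo_url_and_branch` and the sample values are hypothetical, introduced only for illustration.

```python
from typing import Any, Dict, Optional


def repo_root(notebook_path: str) -> Optional[str]:
    """Return '/Repos/<directory>/<repo name>' for a notebook inside a repo, else None."""
    parts = notebook_path.split("/")
    # A repo notebook path splits into ['', 'Repos', '<directory>', '<repo>', ...].
    if len(parts) < 5 or parts[0] != "" or parts[1] != "Repos":
        return None
    return "/".join(parts[:4])


def repo_url_and_branch(repo_data: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Pick the remote URL and branch out of a Repos API Get response.

    Assumes the response has 'url' and 'branch' keys; missing keys become None.
    """
    return {"url": repo_data.get("url"), "branch": repo_data.get("branch")}


# Hypothetical sample response, standing in for repo_data from the code above.
sample_repo_data = {
    "id": 123,
    "path": "/Repos/someone@example.com/my-repo",
    "url": "https://example.com/org/my-repo.git",
    "branch": "main",
}

info = repo_url_and_branch(sample_repo_data)
print(info["url"], info["branch"])
```

In the notebook itself, `repo_url_and_branch(repo_data)` would be called on the real response, and `repo_root(notebook_path)` returning None signals that the notebook does not live under /Repos.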
Alex Ott
  • I was thinking about this approach too since I already worked with Repos API, but I hoped there would be an easier way. Thanks anyway, I will probably use it. – Stanislav Žoldak Nov 20 '21 at 18:54
  • May I ask why you need this? Something like tracking the code that was used to build the model? Maybe we can build in an easier way. – Alex Ott Nov 20 '21 at 19:18
  • Someone in my team asked if this could be done, so I started looking at it, because I was interested in it myself. I'll ask about it on Monday. – Stanislav Žoldak Nov 21 '21 at 01:47
  • I love how I can copy-paste this code and it works. I'd also add: branch = ctx['extraContext']['mlflowGitReference'] – SoloDolo Sep 27 '22 at 23:40