
I am developing a few relatively complex ADF pipelines and I would like to track my changes.

Traditionally with programming languages I keep my code in a git repository and track changes using branches.

How can I do the same for ADF pipelines? What is the recommended directory structure for ADF code?

Allan Xu
  • Have you checked out the [source control of ADF](https://learn.microsoft.com/en-us/azure/data-factory/source-control)? It should be what you are looking for: it can track the changes in the underlying JSON using git. But to be honest, from my personal experience, it is not very useful for diffs, as the JSON does not have any line breaks and it is hard to find the diff between versions. – ray May 13 '22 at 17:37
  • @ray, thank you. I am not sure why I missed that. – Allan Xu May 13 '22 at 17:42
  • Thank you, @ray & Allan Xu, for your valuable discussion. If the issue is resolved, can you please post it as an answer, so that it benefits other community members? – NiharikaMoola-MT May 18 '22 at 07:37

1 Answer


Source control of ADF should be what you are looking for. It can track the changes in underlying JSON using git.

It also provides some useful features when you integrate with a GitHub / Azure DevOps repo:

  1. "Auto" save/commit: every time you save your work, the change is recorded as an incremental git commit, so you can easily revert it. This way, you do not have to publish your changes in order to persist them (in some cases, such as developing a new feature on a feature branch, publishing may not be an option).
  2. You can leverage branch protection in a GitHub / Azure DevOps repo to perform code review, code merging, etc. before publishing the ADF code.

But to be honest, from my personal experience, it is not very useful for diffs, as the JSON does not have any line breaks and it is hard to find the diff between versions.
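One workaround for the unreadable diffs is to re-serialize the pipeline JSON with indentation before comparing versions. This is a minimal sketch, not an official ADF feature; the sample pipeline fragment below is a hypothetical, simplified version of what ADF actually stores:

```python
import json

def pretty(raw: str) -> str:
    """Re-serialize minified pipeline JSON with indentation and sorted
    keys so that line-based git diffs become readable."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True) + "\n"

# A minified fragment, similar in shape to what ADF commits to the repo
minified = '{"name":"CopyPipeline","properties":{"activities":[{"name":"Copy1","type":"Copy"}]}}'
print(pretty(minified))
```

If you save this as a small script that formats a file given on the command line, you could also wire it up as a git `textconv` driver (via `.gitattributes` and `git config diff.<name>.textconv`) so that `git diff` shows the formatted view without changing the committed files.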

ray
  • I also struggle with diffing large scripts in the ADF pipeline JSON file. As a compromise, I am keeping full copies of each SQL script in the same repo as .sql files for easy reviewing/diffing, although this is obviously a bit error prone if I forget to keep the script files and the pipeline itself in sync. A more robust approach, where the script files are loaded by ADF directly, would mitigate the manual copy/paste issue: https://stackoverflow.com/questions/72162421/can-i-run-sql-script-files-located-in-a-git-repo-on-azure-data-factory – Jon.Mozley May 31 '23 at 09:14
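Instead of maintaining the .sql copies by hand, one option is to extract the inline SQL from the committed pipeline JSON automatically. The sketch below walks a parsed pipeline definition and collects string values stored under keys that commonly hold inline SQL; the exact key names vary by activity type, so treat `SQL_KEYS` as an assumption to verify against your own pipelines:

```python
import json

# Keys that commonly hold inline SQL in ADF pipeline JSON; the exact
# set is an assumption here -- check it against your own pipelines.
SQL_KEYS = {"sqlReaderQuery", "script", "query"}

def extract_sql(node, found=None, path="root"):
    """Recursively walk a parsed pipeline definition and collect any
    string values stored under keys that typically hold inline SQL."""
    if found is None:
        found = {}
    if isinstance(node, dict):
        for key, value in node.items():
            if key in SQL_KEYS and isinstance(value, str):
                found[f"{path}.{key}"] = value
            else:
                extract_sql(value, found, f"{path}.{key}")
    elif isinstance(node, list):
        for i, item in enumerate(node):
            extract_sql(item, found, f"{path}[{i}]")
    return found

# Hypothetical, simplified pipeline fragment for illustration
pipeline = json.loads(
    '{"name":"P1","properties":{"activities":[{"name":"Copy1",'
    '"typeProperties":{"source":{"sqlReaderQuery":"SELECT 1"}}}]}}'
)
for location, sql in extract_sql(pipeline).items():
    print(location, "->", sql)
```

Run as a pre-commit step or CI check, this keeps the .sql files generated from the pipeline JSON rather than copied by hand, so they cannot drift out of sync.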