3

I'm interested in querying the pipeline consumption report that is available from the Data Factory monitor. Is there a table on Log Analytics or PowerShell cmdlet that would return this information? I checked the ADFv2 PowerShell module but couldn't find any. My goal is to aggregate the information available in this report to identify what are the most costly pipelines.

enter image description here

reference: https://techcommunity.microsoft.com/t5/azure-data-factory/new-adf-pipeline-consumption-report/ba-p/1394671

Thank you

Everton Oliveira
  • 810
  • 7
  • 15

1 Answers1

5

Doing more research someone pointed me to a GitHub page where the product team posted a PowerShell script to find part of what I was looking for {1}. So I did some modifications to the script to have the output that I needed. With the output below I can extract the values from the MS calculator to get an estimated cost for each pipeline run. {2}

$startTime = "21/6/2021 7:00:00"
$endTime = "21/6/2021 10:00:00"
$adf = '<data factory name>'
$rg = '<resrouce group name>'
    

$outputObj = @()
$pipelineRuns = Get-AzDataFactoryV2PipelineRun -ResourceGroupName $rg -DataFactoryName $adf -LastUpdatedAfter $startTime -LastUpdatedBefore $endTime

# loop through all pipelines and child activities to return billable information
foreach ($pipelineRun in $pipelineRuns) {
    $activtiyRuns = Get-AzDataFactoryV2ActivityRun -ResourceGroupName $rg -DataFactoryName $adf -pipelineRunId $pipelineRun.RunId -RunStartedAfter $startTime -RunStartedBefore $endTime

    foreach ($activtiyRun in $activtiyRuns) {
        if ($null -ne $activtiyRun.Output -and $null -ne $activtiyRun.Output.SelectToken("billingReference.billableDuration")) {            
            
            $obj = @()
            $obj = $activtiyRun.Output.SelectToken("billingReference.billableDuration").ToString() | ConvertFrom-Json
            $obj | Add-Member -MemberType NoteProperty -Name activityType -value $activtiyRun.Output.SelectToken("billingReference.activityType").ToString()
            $obj | Add-Member -MemberType NoteProperty -Name pipelineName -value $pipelineRun.PipelineName
            $obj | Add-Member -MemberType NoteProperty -Name activtiyRuns -value $activtiyRuns.Count             

            $outputObj += $obj
        }
        else {}
    }
}

# output aggregated result set as table
$groupedObj = $outputObj | Group-Object -Property pipelineName, activityType, meterType
$groupedObj | ForEach-Object {
    $value = $_.name -split ', '
    New-Object psobject -Property @{ 
                               
        activityType              = $value[1];
        meterType                 = $value[2];
        pipelineName              = $value[0];
        executionHours            = [math]::Round(($_.Group | Measure-object -Property duration -sum).Sum, 4)
        orchestrationActivityRuns = $groupedObj.group.activtiyRuns[0]
    } 
} | Sort-Object -Property meterType | Format-Table

Output sample:

enter image description here

Consumption report from the Data Factory monitor

enter image description here

reference:

  1. https://github.com/Azure/Azure-DataFactory/tree/main/SamplesV2/PastRunDetails#simple-script-that-prints--activity-level-run-details-in-45-day-range {1}
  2. https://azure.microsoft.com/en-us/pricing/calculator/?service=data-factory%2F {2}
Everton Oliveira
  • 810
  • 7
  • 15