I am trying to capture specific key-value pairs from a text file using PowerShell. The file also contains data that does not follow the key : value pattern. I am new to PowerShell and put together the code below with help from the internet. Any help will be appreciated.
Source Text sample:
ResourceGroupName : DataLake-Gen2
DataFactoryName : dna-production-gen2
TriggerName : TRG_RP_Optimizely_Import
TriggerRunId : 08586050680855766354964895535CU57
TriggerType : ScheduleTrigger
TriggerRunTimestamp : 8/4/2020 10:59:59 AM
Status : Succeeded
TriggeredPipelines : {[PL_DATA_OPTIMIZELY_MART, 1f89fc3a-27b5-442e-9685-a444f751f607]}
Message :
Properties : {[TriggerTime, 8/4/2020 10:59:59 AM], [ScheduleTime, 8/4/2020 11:00:00 AM], [triggerObject, {
"name": "Trigger_421B8CAF-BE66-42CF-83DA-E3028693F304",
"startTime": "2020-08-04T10:59:59.8982174Z",
"endTime": "2020-08-04T10:59:59.8982174Z",
"scheduledTime": "2020-08-04T11:00:00Z",
"trackingId": "fdf58bb2-ecd5-4fe9-b2ef-d94fd71729c3",
"clientTrackingId": "08586050680855766354964895535CU57",
"originHistoryName": "08586050680855766354964895535CU57",
"code": "OK",
"status": "Succeeded"
}]}
AdditionalProperties : {[groupId, 08586050680855766354964895535CU57]}
ResourceGroupName : DataLake-Gen2
DataFactoryName : dna-production-gen2
TriggerName : TRG_RP_Optimizely_Import
TriggerRunId : 08586049816852049265494275953CU24
TriggerType : ScheduleTrigger
TriggerRunTimestamp : 8/5/2020 11:00:00 AM
Status : Succeeded
TriggeredPipelines : {[PL_DATA_OPTIMIZELY_MART, dd6b5beb-b7f6-44ef-8903-34c845003dfc]}
Message :
Properties : {[TriggerTime, 8/5/2020 11:00:00 AM], [ScheduleTime, 8/5/2020 11:00:00 AM], [triggerObject, {
"name": "Trigger_421B8CAF-BE66-42CF-83DA-E3028693F304",
"startTime": "2020-08-05T11:00:00.2662252Z",
"endTime": "2020-08-05T11:00:00.2662252Z",
"scheduledTime": "2020-08-05T11:00:00Z",
"trackingId": "ba223bbd-8cb2-40e8-951f-87130dbbbfe8",
"clientTrackingId": "08586049816852049265494275953CU24",
"originHistoryName": "08586049816852049265494275953CU24",
"code": "OK",
"status": "Succeeded"
}]}
AdditionalProperties : {[groupId, 08586049816852049265494275953CU24]}
Code used so far:
[CmdletBinding()]
Param(
    [Parameter(Mandatory=$true)]
    $path
)
function Format-LogFile {
    [CmdletBinding()]
    param (
        $log
    )
    $targets = 'TriggerRunTimestamp', 'ResourceGroupName', 'DataFactoryName', 'TriggerName', 'TriggerRunId', 'TriggerType', 'Status'
    # Anchor the pattern to the start of the line so that only top-level
    # "Key : value" lines match, not nested JSON lines such as "status": "Succeeded".
    $pattern = '^\s*(' + ($targets -join '|') + ')\s*:'
    # Keep every matching line, including repeated ones, so that no record is dropped.
    return @($log | Where-Object { $_ -match $pattern })
}
function Get-LogFields {
    [CmdletBinding()]
    param (
        $lines
    )
    $targets = 'TriggerRunTimestamp', 'ResourceGroupName', 'DataFactoryName', 'TriggerName', 'TriggerRunId', 'TriggerType', 'Status'
    $dict = [ordered]@{}
    foreach ($line in $lines) {
        # Split on the first colon only, so values like "10:59:59 AM" stay intact.
        $arr = $line -split ':', 2
        if ($arr.Count -lt 2) { continue }
        $key = $arr[0].Trim()
        if ($key -notin $targets) { continue }
        # A repeated key marks the start of the next record: emit the finished one.
        if ($dict.Contains($key)) {
            New-Object -TypeName psobject -Property $dict
            $dict = [ordered]@{}
        }
        $dict[$key] = $arr[1].Trim()
    }
    # Emit the last record, if any.
    if ($dict.Count -gt 0) {
        New-Object -TypeName psobject -Property $dict
    }
}
# $path is supplied by the script's mandatory parameter, e.g. -path 'D:\output.txt'
$info = Get-ChildItem -File -Recurse -Path $path | ForEach-Object {
    $log = Get-Content $_.FullName -Encoding Default
    $lines = Format-LogFile $log
    # Outputs one object per record found in the file.
    Get-LogFields $lines
}
# $info |
#     Select-Object @{name='TriggerRunTimestamp';expression={$_.'TriggerRunTimestamp'}},
#         @{name='ResourceGroupName';expression={$_.'ResourceGroupName'}},
#         @{name='DataFactoryName';expression={$_.'DataFactoryName'}},
#         @{name='TriggerName';expression={$_.'TriggerName'}},
#         @{name='TriggerRunId';expression={$_.'TriggerRunId'}},
#         @{name='TriggerType';expression={$_.'TriggerType'}},
#         @{name='Status';expression={$_.'Status'}} |
#     Export-Csv -Encoding UTF8 -Path .\result.csv -Force
$info |
    Select-Object 'TriggerRunTimestamp', 'ResourceGroupName', 'DataFactoryName',
        'TriggerName', 'TriggerRunId', 'TriggerType', 'Status' |
    ConvertTo-Csv -Delimiter ';' -NoTypeInformation |
    ForEach-Object { $_.Replace('"', '') } |
    Set-Content -Path 'D:\\result.csv' -Force
# Export-Csv -Encoding UTF8 -Path .\result.csv -Force
Expected Output:
| TriggerRunTimestamp | ResourceGroupName | DataFactoryName | TriggerName | TriggerRunId | TriggerType | Status | TriggeredPipeline | Properties_TriggerTime | Properties_ScheduleTime | triggerObject_name | triggerObject_startTime | triggerObject_endTime | triggerObject_scheduledTime |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8/4/2020 10:59 | DataLake-Gen2 | dna-production-gen2 | TRG_RP_Optimizely_Import | 08586050680855766354964895535CU57 | ScheduleTrigger | Succeeded | PL_DATA_OPTIMIZELY_MART | 8/4/2020 10:59 | 8/4/2020 11:00 | Trigger_421B8CAF-BE66-42CF-83DA-E3028693F304 | 2020-08-04T10:59:59.8982174Z | 2020-08-04T10:59:59.8982174Z | 2020-08-04T11:00:00Z |
NOTE: The first row holds the column headers; the second row holds the values.
Help much needed!
Thanks
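One possible way to reach the expected output is sketched below in PowerShell. It is only a sketch under two assumptions taken from the sample: every record begins with a ResourceGroupName line, and the triggerObject JSON is flat (no nested braces). The names Get-FirstGroup and ConvertFrom-TriggerLog are hypothetical helpers introduced for illustration, not built-in cmdlets. Values are kept exactly as they appear in the file, so timestamps retain their seconds.

```powershell
# Hypothetical helper: returns the first capture group of $Pattern in $Text,
# trimmed, or $null when the pattern does not match.
function Get-FirstGroup {
    param([string]$Text, [string]$Pattern)
    $m = [regex]::Match($Text, $Pattern)
    if ($m.Success) { $m.Groups[1].Value.Trim() } else { $null }
}

# Hypothetical function: turns the raw log text into one object per trigger run.
function ConvertFrom-TriggerLog {
    param([string]$Text)

    # Assumption: every record starts with a "ResourceGroupName :" line.
    # The zero-width lookahead splits in front of that line without consuming it.
    $records = [regex]::Split($Text, '(?m)^(?=[ \t]*ResourceGroupName[ \t]*:)') |
        Where-Object { $_.Trim() }

    foreach ($r in $records) {
        $row = [ordered]@{}

        # Top-level "Key : value" lines, anchored to the start of a line.
        foreach ($key in 'TriggerRunTimestamp', 'ResourceGroupName', 'DataFactoryName',
                         'TriggerName', 'TriggerRunId', 'TriggerType', 'Status') {
            $row[$key] = Get-FirstGroup $r ('(?m)^[ \t]*' + $key + '[ \t]*:[ \t]*(.*)$')
        }

        # First element of the {[name, id]} pair on the TriggeredPipelines line.
        $row['TriggeredPipeline'] = Get-FirstGroup $r 'TriggeredPipelines\s*:\s*\{\[([^,\]]+),'

        # [TriggerTime, ...] and [ScheduleTime, ...] entries inside Properties.
        foreach ($name in 'TriggerTime', 'ScheduleTime') {
            $row["Properties_$name"] = Get-FirstGroup $r ('\[' + $name + ',\s*([^\]]+)\]')
        }

        # Selected string values from the flat triggerObject JSON.
        foreach ($name in 'name', 'startTime', 'endTime', 'scheduledTime') {
            $row["triggerObject_$name"] = Get-FirstGroup $r ('"' + $name + '":\s*"([^"]*)"')
        }

        # Emit one object per record; the ordered dictionary keeps the column order.
        [pscustomobject]$row
    }
}
```

The objects could then be written out with something like `ConvertFrom-TriggerLog (Get-Content -Raw 'D:\output.txt') | Export-Csv -Path 'D:\result.csv' -Delimiter ';' -NoTypeInformation`. Using regular expressions for the JSON values, rather than ConvertFrom-Json, is a deliberate choice here so that the date strings are not converted to another type on the way through.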