3

I have a PowerShell script that iterates through bundled MT103 files. It currently checks for a flag and decides whether or not to forward each file. The problem is that MT177 (undesired) information sometimes comes bundled with the desired file, and the whole file gets forwarded to the drop point.

How can I modify my PowerShell script to detect and split this file based on the delimiter, which is '{-'?

Multiple payments are separated by a line break. For example:

{-
MT103 payment 1
-}
{-
MT103 payment 2
-}

The desire is to split this file into multiple files and then process them individually.

The resulting files should each contain one block:

{-
MT103 payment 1
-}

{-
MT103 payment 2
-}
desertnaut
trevoirwilliams

3 Answers

6
# Create sample input file:
@'
{-
MT103 payment 1
-}
{-
MT103 payment 2
-}
'@ > file.txt

$index = 1

# Split the file into blocks and write them to "outFile<index>.txt" files.
(Get-Content -Raw file.txt) -split '(?s)({-.+?-})\r?\n' -ne '' | 
  Set-Content -LiteralPath { 'outFile{0}.txt' -f $script:index++ }
  • Get-Content -Raw reads the entire input file into a single, multi-line string.
  • -split splits that string into blocks of {-...-} lines:

    • Regex (?s)({-.+?-})\r?\n captures a single block, followed by a newline; inline option s ((?s)) ensures that . also matches newlines, for multi-line matching.

      • Note that even though -split by default doesn't include what the separator regex matched in the resulting array, using a capture group ((...)) does cause inclusion of what it matches.

      • If you wanted to match more strictly by only finding {- and -} on their own lines, use the following regex instead: (?sm)(^{-$.+?^-}$)\r?\n

    • -ne '' filters out empty entries resulting from the -split operation.
  • Passing a delay-bind script block ({ ... }) to Set-Content's -LiteralPath parameter allows determining an output file path on a per-input object basis:

    • 'outFile{0}.txt' -f $script:index++ outputs outFile1.txt for the first string (block of lines), outFile2.txt for the second, and so on.

    • Because delay-bind script blocks run in a child scope, you cannot directly increment $index in the caller's scope:

      • $script:index is a convenient way to refer to the variable in the script scope.
      • However, if your code is inside a function, use the following, more robust - but more cumbersome - reference to whatever the parent scope is: (Get-Variable -Scope 1 index).Value++
      • See this answer for details.
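To make the effect of the capture group easier to see, here is a minimal in-memory sketch that runs the same regex with and without the group. The sample text and the variable names `$text`, `$withoutGroup`, and `$blocks` are assumptions for illustration only; nothing is read from or written to disk.

```powershell
# In-memory sample modelled on the sample file above (an assumption, not real data).
$text = "{-`nMT103 payment 1`n-}`n{-`nMT103 payment 2`n-}`n"

# Without a capture group, -split returns only the text between the separators.
$withoutGroup = @($text -split '(?s){-.+?-}\r?\n')

# With a capture group, whatever the group matched is included in the result;
# -ne '' then filters out the empty entries.
$blocks = @(($text -split '(?s)({-.+?-})\r?\n') -ne '')

"Non-empty entries without group: $(@($withoutGroup -ne '').Count)"
"Non-empty entries with group:    $($blocks.Count)"
$blocks[0]
```

Because the whole sample consists of blocks, the variant without the group has nothing non-empty left to return, which is why the capture group is needed here.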
mklement0
1

EDITED: As I understand it, you need to split on a delimiter and remove the undesired data, something like the following:

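$Data = "{- MT103 payment 1 -} {- MT103 payment 2 -}"
# Use -split with an escaped pattern so that '{-' is treated as one delimiter;
# in Windows PowerShell, .Split('{-') splits on each of the two characters instead.
[Collections.ArrayList]$Array = @($Data -split [regex]::Escape('{-'))
# Iterate backwards so that RemoveAt() doesn't shift elements not yet checked.
for ($i = $Array.Count - 1; $i -ge 0; $i--) {
    if ($Array[$i] -imatch "MT177") {
        $Array.RemoveAt($i)
    }
}
# Print result
$Array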
  • 1
    The original form of the question wasn't too clear, but it should be now. Your answer strips the opening delimiter (`{-`) and keeps the closing one (`-}`) - whereas it is now clear that _preserving both_ is desired. Also, the aspect of writing the results of the splitting to individual _files_ is missing – mklement0 Nov 04 '19 at 13:48
  • I take your point that the problem could have been explained a bit better. I did however outline the desire for multiple files: "The desire is to split this file into multiple files and then process them individually." – trevoirwilliams Nov 04 '19 at 14:38
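Following up on the comments, here is a rough sketch of a variant that keeps both delimiters, drops unwanted blocks, and writes one file per remaining block. It assumes that an unwanted block can be recognised by the text MT177 inside it; the MT177 sample block, the temp-folder location, and the `payment<n>.txt` file names are hypothetical.

```powershell
# Sample data; the MT177 block is a hypothetical addition for illustration.
$Data = "{- MT103 payment 1 -} {- MT177 unwanted -} {- MT103 payment 2 -}"

# Match each complete {- ... -} block, so both delimiters are preserved.
$Blocks = [regex]::Matches($Data, '(?s)\{-.+?-\}') | ForEach-Object { $_.Value }

# Keep only the blocks that do not mention MT177 (assumed marker of unwanted data).
$Wanted = @($Blocks | Where-Object { $_ -notmatch 'MT177' })

# Write each remaining block to its own file (hypothetical names in the temp folder).
$i = 1
foreach ($Block in $Wanted) {
    $Path = Join-Path ([IO.Path]::GetTempPath()) "payment$i.txt"
    Set-Content -LiteralPath $Path -Value $Block
    $i++
}

$Wanted
```

Matching whole blocks instead of splitting on the opening delimiter avoids having to re-attach `{-` afterwards.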
1

This is the code I ended up with:

$Data = "{- MT103 payment 1 -} {- MT103 payment 2 -}"
[string[]]$Array = $Data.Split("{")
if ($Array.Count -gt 1) {
  for ($i = 1; $i -lt $Array.Count; $i++) {
    "{" + $Array[$i] | Out-File "$destination-$i.fin"
  }
}

I split the data on the opening brace '{', prepend the brace to each resulting piece again, and write each reconstructed string to its own output file. The output files contain, respectively:

{- MT103 payment 1 -} 
{- MT103 payment 2 -}
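As a possible alternative to removing the brace and adding it back, the data could be split with a lookahead so that the delimiter is never consumed. This is only a sketch: `$Parts` is an illustrative name, and `$destination` is set to a hypothetical base path in the temp folder, since its real value isn't shown here.

```powershell
$Data = "{- MT103 payment 1 -} {- MT103 payment 2 -}"

# Split at every position that is followed by '{-'. The lookahead consumes nothing,
# so the delimiter stays at the start of each piece. Trim the pieces and drop empties.
$Parts = @($Data -split '(?=\{-)' | ForEach-Object { $_.Trim() } | Where-Object { $_ })

# $destination is a hypothetical base path, chosen only for this example.
$destination = Join-Path ([IO.Path]::GetTempPath()) 'payment'
for ($i = 0; $i -lt $Parts.Count; $i++) {
    $Parts[$i] | Out-File -LiteralPath "$destination-$($i + 1).fin"
}

$Parts
```

Splitting on the two-character sequence `{-` rather than on `{` alone also keeps the pieces intact if a payment body itself contains a brace.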
mklement0
trevoirwilliams