I have a file full of lines like this. Each line always has the same number of semicolons, and there is always a 3-character string before the first semicolon.

RT2;SS1234567;INV RED;13.06.2021;14.06.2021;154;Out;
RT2;XX1234567;INV RED;04.05.2021;14.06.2021;1472;Out;
RT2;FF1234567;INV RED;04.05.2021;14.06.2021;1472;Out;
RT2;LL1234567;INV RED;13.05.2021;14.06.2021;1472;Out;

I want to remove the leading 3-character string and the semicolon that follows it from each row.

This is how I'm pulling in the file; it's full of blank lines and rows I need to remove:

#import the file removing the first row and removing blank rows
$inFile = Get-Content -Path ($InFileDir + $InFileName)|Select-Object -Skip 1|? {$_.trim() -ne "" }

# Removes the (12334 rows affected) line that's added by sql
$inFile = $inFile|Where-Object {$_ -notlike '(*)'}

# Source file is two different sql table exports appended to each other, store the different headers
$header1 = 'RT1;Polref;Tranaction;Eff Dte;Process Dte;Fund;Movement;'
$header2 = 'RT3;Polref;Tranaction;Eff Dte;Process Dte;Fund;Qty;Amt;'

#Get some file positions
$RowBeforeheader2Index = $InFile.IndexOf($header2) -1
$header1Index = $InFile.IndexOf($header1)
$header2Index = $InFile.IndexOf($header2)
$LastRow = $inFile.Length -1

$outFile = $inFile[$header1Index..$RowBeforeheader2Index]

foreach ($row in $outFile)
{
    # perform a substring on the row and add to $var
}

$var|Out-file 'C:\temp\output.txt'

I'm not sure how to fill out the foreach loop to achieve my desired result. (I'm just calling it $var for this example... I'm not that unimaginative.)

EDIT:

I ended up changing $var to a list and using the following code in the foreach loop:

$var = New-Object System.Collections.Generic.List[System.Object]

foreach($row in $outFile)
{
    $var.Add($row.Substring(4))
}
Christopher Long

4 Answers

Given that you can remove a fixed number of characters and assuming that each line has at least 4 characters, you can simply call .Substring() on your array of strings (lines):

# Sample input
$outFile = 'RT2;SS1234567', 'RT2;SS1234568', 'RT2;SS1234569'

# Remove the first 4 characters from each line (array element).
# (Use $var = ... to assign the output to a variable).
$outFile.Substring(4)

Note that even though $outFile is an array, the .Substring() method is called on each element, which is a PowerShell feature known as member-access enumeration.
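
As a hedged illustration of the "at least 4 characters" assumption, the sketch below filters out shorter lines before relying on member-access enumeration, and shows the explicit ForEach-Object equivalent. The sample data and the variable names ($lines, $long, $viaEnumeration, $viaPipeline) are made up for this example.

```powershell
# Hypothetical sample input; the last element is shorter than 4 characters.
$lines = 'RT2;SS1234567;Out;', 'RT2;XX1234567;Out;', 'RT'

# .Substring(4) would throw on 'RT', so keep only lines with at least 4 characters.
# @(...) guarantees an array even if only one line survives the filter.
$long = @($lines | Where-Object { $_.Length -ge 4 })

# Member-access enumeration: the method is invoked on every element of the array.
$viaEnumeration = $long.Substring(4)

# Equivalent explicit form using the pipeline.
$viaPipeline = $long | ForEach-Object { $_.Substring(4) }
```

Both variables should end up holding the same two shortened strings; the pipeline form is more verbose but makes the per-element call explicit.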

mklement0

Try this -

$data = @"
RT2;SS1234567;INV RED;13.06.2021;14.06.2021;154;Out;
RT2;XX1234567;INV RED;04.05.2021;14.06.2021;1472;Out;
RT2;FF1234567;INV RED;04.05.2021;14.06.2021;1472;Out;
RT2;LL1234567;INV RED;13.05.2021;14.06.2021;1472;Out;
"@ | ConvertFrom-Csv -Delimiter ";" -Header @("col1","col2", "col3", "col4", "col5", "col6", "col7")

$data | Select-Object * -ExcludeProperty col1 | ConvertTo-Csv | Select-Object -Skip 2 | Set-Content $env:USERPROFILE\Desktop\output.csv

NOTE - Use Select-Object -Skip 2 if ConvertTo-Csv emits an additional first line, #TYPE Selected.System.Management.Automation.PSCustomObject (the default in Windows PowerShell 5.1); otherwise use Select-Object -Skip 1 so that only the header line is skipped.

Vivek Kumar Singh
  • Using `ConvertTo-Csv` is a good approach, but your code will eat the first row of the data and leave all values quoted with double quotes and separated by commas. Try: `$data | select * -ExcludeProperty col1 | ConvertTo-Csv -NoTypeInformation -QuoteFields '' -Delimiter ';' | Select-Object -Skip 1` – Manuel Batsching Jun 18 '21 at 10:23

There are many ways to do that. If your operation is really as simple as deleting the first column, you can do it as follows. Assuming $outFile holds the rows from your listing and $var = @() has been set earlier in your script, put the following into your foreach loop:

$null,$row = $row -split ';' # Turn the string into an array and dump the first element.
$var += $row -join ';' # Turn the array into a string using ; as delimiter

The content of $var should then look like this:

SS1234567;INV RED;13.06.2021;14.06.2021;154;Out;
XX1234567;INV RED;04.05.2021;14.06.2021;1472;Out;
FF1234567;INV RED;04.05.2021;14.06.2021;1472;Out;
LL1234567;INV RED;13.05.2021;14.06.2021;1472;Out;
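
For reference, here is a minimal self-contained sketch of the whole loop. The two sample rows are copied from the question; assigning them to $outFile directly (instead of reading the file) and the name $rest are assumptions made only for this example.

```powershell
# Sample rows taken from the question; in the real script $outFile comes from the file.
$outFile = @(
    'RT2;SS1234567;INV RED;13.06.2021;14.06.2021;154;Out;'
    'RT2;XX1234567;INV RED;04.05.2021;14.06.2021;1472;Out;'
)

$var = @()
foreach ($row in $outFile) {
    # Split into fields, discard the first one, and keep the rest.
    $null, $rest = $row -split ';'
    # Re-join the remaining fields; the trailing empty field restores the final ';'.
    $var += $rest -join ';'
}
```

Unlike Substring(4), this does not depend on the first field being exactly 3 characters long, only on it ending at the first semicolon.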
Manuel Batsching

-replace

is the easiest way to remove a fixed number of leading characters from each line of multiline text.

$content = Get-Content -Path ($InFileDir + $InFileName)|Select-Object -Skip 1
$content -replace "(?m)^.{4}"

Multiline mode is enabled by the m flag, so that ^ and $ match at the beginning and end of every line (as divided by \n) rather than only at the beginning and end of the whole string.
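
One point worth illustrating: without -Raw, Get-Content normally returns an array of lines, and -replace is then applied to each element separately, so the m flag only makes a difference for a single multiline string. The sketch below contrasts the two cases; the sample data and variable names are hypothetical.

```powershell
# Hypothetical sample data: the same two lines as an array and as one string.
$lineArray    = 'RT2;SS1;Out;', 'RT2;XX1;Out;'
$singleString = "RT2;SS1;Out;`nRT2;XX1;Out;"

# Array input: -replace runs once per element, so a plain ^ anchor is enough.
$fromArray = $lineArray -replace '^.{4}'

# Single multiline string: (?m) is needed so that ^ also matches after each newline.
$fromString = $singleString -replace '(?m)^.{4}'
```

In both cases the first four characters of every line are removed; the array form returns an array of strings, the string form returns one string with its newline intact.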