I have a 50-line text file ($file1) like the sample below, and I need to remove the characters on each line starting from a specific character, "\", until the end of the line.

Sample text file:

| Area | vserver | file-id |connection-id | session-id | open-mode | path |

| manphsan01 | manphs101 | 9980 | 4278018043 | 5065142205921760710 | rw | Share01\Mandaue\Data01 |

| manphsan01 | manphs101 | 1790 | 4278020659 | 5065142205921763223 | rwd | FinanceDept\ARCHIVING |

| manphsan01 | manphs101 | 1824 | 4278020659 | 5065142205921763223 | rwd | Share01\Cebu\Year2022 |

| manphsan01 | manphs101 | 1976 | 4278020659 | 5065142205921763223 | rwd | SGSDept\General\Document |

My desired output should look like this:

| Area | vserver | file-id |connection-id | session-id | open-mode | path |

| manphsan01 | manphs101 | 9980 | 4278018043 | 5065142205921760710 | rw | Share01 |

| manphsan01 | manphs101 | 1790 | 4278020659 | 5065142205921763223 | rwd | Finance |

| manphsan01 | manphs101 | 1824 | 4278020659 | 5065142205921763223 | rwd | Share01 |

| manphsan01 | manphs101 | 1976 | 4278020659 | 5065142205921763223 | rwd | SGSDept |

The command I used is like this:

$var = Get-content $file1

$var.Substring(0, $var.IndexOf('\')) | FT -AutoSize or 

$var.Substring(0, $var.IndexOf('backslash')) | FT -AutoSize

My command works if my data is only one line, but it doesn't work with multiple lines. I am not sure why the 'backslash' is not showing in the command when I posted it.

Any ideas how to make this work?

mklement0
  • i found a way to do it using foreach procedure. $a = Get-content $file1 Foreach($line in $a){ $line.Substring(0,$line.IndexOf('\')) } – tryingtocode Feb 07 '22 at 03:16
  • If you have already solved the problem, you might as well [self-answer](https://stackoverflow.com/help/self-answer#:~:text=If%20you%20have%20more%20than,of%20the%20Ask%20Question%20page.&text=Alternatively%2C%20you%20may%20go%20back,48%20hours%20to%20do%20so.) – Santiago Squarzon Feb 07 '22 at 04:55
  • [1] Your example shows a CSV format complete with headers. Please open it in Notepad and copy the first couple of lines. Then paste that in your question so we can see what delimiter is used. [2] You say you want to remove _"until,the end of the line"_, but your question clearly shows you want to keep the final `|`, so I believe what you show us is **not** what the file actually looks like. [3] Bear in mind that `.IndexOf()` returns -1 if the character you're looking for is not found, and directly combining that with `.Substring()` will throw an exception. – Theo Feb 07 '22 at 09:21
  • As an aside: `$var.IndexOf('backslash')` looks for substring `backslash` _verbatim_. – mklement0 Feb 08 '22 at 20:21

1 Answer

You can get away with plain-text processing if you can assume that only one field on each line of your structured text file contains \ and that it and everything after it - up until the next field delimiter, | - should be removed:

# Transforms all matching lines and outputs them.
# Pipe to Set-Content to save back to a file; use -Encoding as needed.
(Get-Content $file) -replace '\\.+?(?= \|)'

The above uses a -replace operation with a regex to remove the unwanted part of matching lines (lines that don't match are passed through as-is).
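
As a minimal sketch of how the regex behaves, the same -replace operation can be tried on in-memory lines instead of a file. The variable names $sampleLines and $result are illustrative only and not part of the original command; the two lines are copied from the question's sample.

```powershell
# Hypothetical stand-in for (Get-Content $file): two lines from the question's sample.
$sampleLines = @(
    '| Area | vserver | file-id |connection-id | session-id | open-mode | path |'
    '| manphsan01 | manphs101 | 9980 | 4278018043 | 5065142205921760710 | rw | Share01\Mandaue\Data01 |'
)

# \\       matches the first literal backslash on the line
# .+?      lazily matches the characters that follow it
# (?= \|)  look-ahead: stop right before the next ' |' without consuming it
# With no replacement operand, -replace substitutes the empty string.
$result = $sampleLines -replace '\\.+?(?= \|)'

$result
# | Area | vserver | file-id |connection-id | session-id | open-mode | path |
# | manphsan01 | manphs101 | 9980 | 4278018043 | 5065142205921760710 | rw | Share01 |
```

Because the look-ahead does not consume the trailing field delimiter, the final | survives, and the header line, which contains no \, is returned unchanged.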

For an explanation of the regex and the ability to experiment with it, see this regex101.com page.


As for what you tried:

  • $var = Get-content $file1 stores the individual lines of file $file1 as an array in variable $var.

  • To process the resulting lines one by one, you need a loop construct, such as a foreach statement or the ForEach-Object cmdlet; e.g. foreach ($line in $var) { ... }

  • While $line.Substring(0, $line.IndexOf('\')) works in principle, it will cause a statement-terminating error (exception) for every $line value that contains no \ character, as Theo notes, notably with your file's header line.

    • While this could easily be fixed with try { $line.Substring(0, $line.IndexOf('\')) } catch { $line }, the bigger problem is that it would remove everything through the end of the line, which contradicts your desired output, which shows that the next field separator, |, should be retained.

    • The above -replace operation fixes both these problems; note that it implicitly loops over the array of input lines and performs the replacement operation on each, returning an array of (potentially) transformed lines.

  • Also note that a formatting cmdlet such as Format-Table (alias FT) should only be used to produce for-display output; it doesn't produce usable data - see this answer for more information; also, it has no formatting effect on strings.
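
For comparison, the points above can be sketched as an explicit loop that stays with the .IndexOf() approach from the question. This is only an illustration under assumptions: $var is filled with two sample lines rather than Get-Content $file1, and $trimmed, $start and $end are hypothetical names. It guards against .IndexOf() returning -1 and keeps the next field separator.

```powershell
# Hypothetical sample input; with a real file this would be: $var = Get-Content $file1
$var = @(
    '| Area | vserver | file-id |connection-id | session-id | open-mode | path |'
    '| manphsan01 | manphs101 | 1790 | 4278020659 | 5065142205921763223 | rwd | FinanceDept\ARCHIVING |'
)

$trimmed = foreach ($line in $var) {
    # Position of the first backslash, or -1 if the line has none (e.g. the header).
    $start = $line.IndexOf('\')
    if ($start -lt 0) {
        $line                                   # nothing to remove; pass through as-is
    }
    else {
        # Position of the next ' |' field separator after the backslash.
        $end = $line.IndexOf(' |', $start)
        if ($end -lt 0) {
            $line.Substring(0, $start)          # no separator follows: cut to end of line
        }
        else {
            $line.Remove($start, $end - $start) # drop only the part up to the separator
        }
    }
}

$trimmed
# | Area | vserver | file-id |connection-id | session-id | open-mode | path |
# | manphsan01 | manphs101 | 1790 | 4278020659 | 5065142205921763223 | rwd | FinanceDept |
```

The loop is noticeably longer than the single -replace expression, which handles both the no-match case and the retained separator implicitly.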

mklement0