0

I have some txt data like this:

0.0.0.1_03_1          
0.0.0.1_03            
0.0.0.1_02_2_1_3_4          
0.0.0.1_02_1          
0.0.0.1_02            
0.0.0.1_01_1          
0.0.0.1_01  

What I want to achieve is to separate each line into two variables (0.0.0.1 and the rest). I want to split only at the first '_' and keep the leading zeros (01, for example). I am doing it like this:

Get-Content $SourceTxtDbFile | 
  ConvertFrom-String -Delimiter "_" -PropertyNames DbVersion, ScriptNumber

but the result neither keeps the leading zeros nor splits the lines the way I want.

mklement0
Stefan0309
  • Why not just: `Get-Content $SourceTxtDbFile | ForEach {$DbVersion, $ScriptNumber = $_.Split('_', 2) ...` ? – iRon Nov 22 '18 at 14:34
  • @iRon where did you define DbVersion and ScriptNumber? It is not the same as the ConvertFrom-String function.. – Stefan0309 Nov 22 '18 at 14:36
  • As an aside: [`ConvertFrom-String`](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/convertfrom-string) provides separator-based parsing as well as _heuristics_-based parsing based on templates containing _example values_. The separator-based parsing applies automatic type conversions you cannot control, and the template language is poorly documented, with the exact behavior hard to predict - it's best to avoid this cmdlet altogether. Also note that it's not available in PowerShell _Core_. – mklement0 Nov 22 '18 at 15:13
  • If you take a different approach (as [my answer](https://stackoverflow.com/a/53434247/6811411) to your previous question suggests) it might not be necessary to split your `$SourceTxtDbFile` at all. –  Nov 22 '18 at 15:42

3 Answers

6

Limit the number of splits with .Split($separator, $count) and then make your own output objects:

Get-Content D:\test.txt | ForEach-Object {

    $Left, $Right = $_.split('_', 2)

    [PsCustomObject]@{ 
        DbVersion    = $Left.Trim()
        ScriptNumber = $Right.Trim()
    } 
}
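
As the comments below work out, the snippet above fails on blank lines: `$Right` is `$null` whenever a line contains no `_`, so `$Right.Trim()` throws. A minimal sketch of a guarded variant follows. It assumes that blank lines should be skipped and that a line without `_` should yield an empty ScriptNumber; neither is specified in the question. `$lines` is a hypothetical stand-in for the file content.

```powershell
# Hypothetical sample input; in practice use: Get-Content D:\test.txt
$lines = '0.0.0.1_03_1   ', '', '0.0.0.1_02', '0.0.0.1'

# Where-Object drops empty and whitespace-only lines before they are split.
$result = $lines | Where-Object { $_.Trim() } | ForEach-Object {

    $Left, $Right = $_.Split('_', 2)

    [PSCustomObject]@{
        # "$var" turns $null into an empty string, so .Trim() cannot fail.
        DbVersion    = "$Left".Trim()
        ScriptNumber = "$Right".Trim()
    }
}

$result
```

A single column can then be picked from the objects with, for example, `$result | Select-Object ScriptNumber`.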
TessellatingHeckler
  • I am receiving an error: You cannot call a method on a null-valued expression. FullyQualifiedErrorId : InvokeMethodOnNull – Stefan0309 Nov 22 '18 at 14:40
  • Apparently not all rows contain a `_`, try: "$Left".Trim() and "$Right".Trim() – iRon Nov 22 '18 at 14:43
  • @iRon Now I have no output at all: no errors, but no output either. – Stefan0309 Nov 22 '18 at 14:44
  • Check your input file. – iRon Nov 22 '18 at 14:45
  • @iRon I am debugging right now; the line $Left, $Right = $_.split('_', 2) passes. So it isn't about the file.. it breaks on [PsCustomObject] – Stefan0309 Nov 22 '18 at 14:47
  • @Stefan0309 run `$PSVersionTable.PSVersion.ToString()` and see what powershell version you are using; if it doesn't like PSCustomObject it might be quite old. Current on Windows is version 5.1... – TessellatingHeckler Nov 22 '18 at 14:50
  • @TessellatingHeckler I am using PowerShell version 5.1, I can see it at the bottom of my VS Code :) I think the problem is when it tries to trim a null value... maybe an empty row in the txt.. – Stefan0309 Nov 22 '18 at 14:53
  • @TessellatingHeckler It works now. I put the whole thing into a variable, like $files = Get-Content..., and I had empty lines in my txt. Thanks! Just, how do I select one column with Select-Object? – Stefan0309 Nov 22 '18 at 14:59
  • `Get-Content $SourceTxtDbFile | ForEach-Object {...` *as in the answer* `...} | Where-Object {$_.DbVersion -eq "1.2.0.0"} | Select-Object {"$($_.ScriptNumber)te"}`, If this doesn't work, try to remove the `Where-Object {...}` cmdlet and see what happens. – iRon Nov 22 '18 at 15:04
  • @iRon in this situation I need something like: Select-Object DbVersion_ScriptNumber – Stefan0309 Nov 22 '18 at 15:13
  • Ok, back to my first suggestion: `Get-Content $SourceTxtDbFile | ForEach {$DbVersion, $ScriptNumber = $_.Split('_', 2); "$($DbVersion)_$ScriptNumber"}` – iRon Nov 22 '18 at 15:17
  • Or: `Get-Content $SourceTxtDbFile | ForEach {$DbVersion, $ScriptNumber = $_.Split('_', 2); If ($DbVersion -eq '1.2.0.0') {"$($DbVersion)_$ScriptNumber"}}`. The `$(...)` around `$DbVersion` is required because `_` is a valid character in a variable name. – iRon Nov 22 '18 at 15:23
  • @Stefan0309 I don't understand, why do you want to split them out and then join them back together again? That seems like the same as doing nothing.. – TessellatingHeckler Nov 22 '18 at 17:57
  • As an aside, @iRon: `$($DbVersion)` works, but if disambiguating a variable name is all that is needed, `${DbVersion}` is both more efficient and more readable. TessellatingHeckler, my _guess_ is that _in effect_ Stefan just wants to trim everything starting with the _2nd_ `_`, so that `0.0.0.1_02_2_1_3_4 ` turns into `0.0.0.1_02`, for instance. – mklement0 Nov 22 '18 at 20:09
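
The disambiguation point in the last few comments can be checked with a short sketch. The variable values are made up for illustration, and the first result assumes strict mode is off (under `Set-StrictMode`, referencing the undefined `$DbVersion_` is an error instead).

```powershell
$DbVersion    = '0.0.0.1'
$ScriptNumber = '03'

# Parsed as the undefined variable $DbVersion_ followed by $ScriptNumber,
# so the version part silently disappears.
$wrong = "$DbVersion_$ScriptNumber"

# Both of these delimit the variable name explicitly.
$viaSubexpression = "$($DbVersion)_$ScriptNumber"
$viaBraces        = "${DbVersion}_$ScriptNumber"

$wrong, $viaSubexpression, $viaBraces
```

With the values above this prints `03`, then `0.0.0.1_03` twice.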
1

TessellatingHeckler's helpful answer shows you how to use the .Split() method to perform separator-based splitting that limits the number of tokens returned, which in his solution only splits by the 1st _ instance, to return a total of 2 tokens.

As an aside: you can also use PowerShell's own -split operator, whose use does have its advantages:

$_ -split '_', 2 # in this case, same as: $_.split('_', 2) 
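
One of those advantages is that `-split` interprets its separator as a regular expression. A small sketch; the second input string, which uses `-` as its first separator, is a made-up example and not from the question:

```powershell
$line = '0.0.0.1_02_2_1_3_4'

# At most 2 tokens, so only the first '_' acts as a separator.
$dbVersion, $rest = $line -split '_', 2

# Because the separator is a regex, a character class can accept '_' or '-'.
$a, $b = '0.0.0.1-02_2' -split '[_-]', 2

$dbVersion, $rest, $a, $b
```

Leading zeros survive in both cases, because the tokens stay strings.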

That said, your later comments suggest that you may be looking to simply remove everything after the 2nd _ instance from your input strings.

$dbVersion, $scriptNumber, $null  = $_ -split '_', 3 # -> e.g., '0.0.0.1', '03', '1'

Note how specifying $null as the variable to receive the 3rd token effectively discards that token, given that we're not interested in it.

To re-join the resulting 2 tokens with _, it's simplest to use the -join operator:

$dbVersion, $scriptNumber -join '_'

To put it all together:

# Sample array of input lines.
$lines=@'
0.0.0.1_03_1
0.0.0.1_03
0.0.0.1_02_2_1_3_4
0.0.0.1_02_1
0.0.0.1_02
0.0.0.1_01_1
0.0.0.1_01
'@ -split '\r?\n'

# Use Get-Content $SourceTxtDbFile instead of $lines in the real world.
$lines | ForEach-Object {
  # Split by the first two "_" and save the first two tokens.
  $dbVersion, $scriptNumber, $null = $_ -split '_', 3
  # Re-join the first two tokens with '_' and output the result.
  $dbVersion, $scriptNumber -join '_'
}

With your sample input, this yields:

0.0.0.1_03
0.0.0.1_03
0.0.0.1_02
0.0.0.1_02
0.0.0.1_02
0.0.0.1_01
0.0.0.1_01
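
If only the shortened string is needed, the split and re-join can also be collapsed into one `-replace` call. This is a different technique from the one shown above, sketched on the assumption that any trailing whitespace should be dropped as well; `$sample` is a hypothetical subset of the input lines.

```powershell
$sample = '0.0.0.1_03_1', '0.0.0.1_03', '0.0.0.1_02_2_1_3_4', '0.0.0.1_01   '

$trimmed = $sample | ForEach-Object {
    # Keep everything before the 2nd '_' and drop the rest.
    # Lines without a 2nd '_' do not match and pass through unchanged.
    ($_ -replace '^([^_]*_[^_]*)_.*$', '$1').TrimEnd()
}

$trimmed
```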
mklement0
0

An alternative RegEx approach:

> gc .\file.txt|?{$_ -match "^([^_]+)_(.*?) *$"}|%{[PSCustomObject]@{DBVersion=$Matches[1];ScriptNumber=$Matches[2]}}

DBVersion ScriptNumber
--------- ------------
0.0.0.1   03_1
0.0.0.1   03
0.0.0.1   02_2_1_3_4
0.0.0.1   02_1
0.0.0.1   02
0.0.0.1   01_1
0.0.0.1   01

The same without aliases:

Get-Content .\file.txt |
  Where-Object { $_ -match "^([^_]+)_(.*?) *$" } |
    ForEach-Object {
      [PSCustomObject]@{
        DBVersion    = $Matches[1]
        ScriptNumber = $Matches[2]
      }
    }

The RegEx "^([^_]+)_(.*?) *$" also removes trailing spaces from your posted sample lines: the lazy (.*?) stops before them and leaves them to the final " *", whereas a greedy (.*) would capture them as part of ScriptNumber.