Without some real (mock) data, I don't think it's best to use regex. Besides, this can be simplified greatly by treating it as a CSV.
First for this test and so others can try it as well we will make a sample file
$tempfile = New-TemporaryFile
$data = @'
1111 xxxx yyyyyyyyyyyyyyyy 2222 zzzz wwwwwww
3333 aaaa bbbbbbbbbbbbbbbb 4444 cccc ddddddd
'@ | Set-Content $tempfile -Encoding utf8
Next, we read the info while replacing spaces with commas. If your data has spaces, then of course this would need adjustment or not be used, depending.
(Get-Content $tempfile) -replace '\s+',','
1111,xxxx,yyyyyyyyyyyyyyyy,2222,zzzz,wwwwwww
3333,aaaa,bbbbbbbbbbbbbbbb,4444,cccc,ddddddd
Now we know our data is proper, import as CSV while specifying headers. You could provide meaningful headers since you know what the info is.
(Get-Content $tempfile) -replace '\s+',',' | ConvertFrom-Csv -Header a,b,c,d,e,f
a : 1111
b : xxxx
c : yyyyyyyyyyyyyyyy
d : 2222
e : zzzz
f : wwwwwww
a : 3333
b : aaaa
c : bbbbbbbbbbbbbbbb
d : 4444
e : cccc
f : ddddddd
If you only want certain columns, simply select only those.
(Get-Content $tempfile) -replace '\s+',',' |
ConvertFrom-Csv -Header a,b,c,d,e,f | Select b,c,e,f -OutVariale data
b c e f
- - - -
xxxx yyyyyyyyyyyyyyyy zzzz wwwwwww
aaaa bbbbbbbbbbbbbbbb cccc ddddddd
Now assign it accordingly
$data | select b,e -OutVariable data1
b e
- -
xxxx zzzz
aaaa cccc
$data | select c,f -OutVariable data2
c f
- -
yyyyyyyyyyyyyyyy wwwwwww
bbbbbbbbbbbbbbbb ddddddd
Edit
Your meaningful data changes my recommendation. Also curious where the extra columns are as it's not like what you showed initially. Based on the data you've shown, I have three different options.
Again, starting with some sample data
$tempfile = New-TemporaryFile
@'
Car 1FMZU77E22UC18440 Honda CiviC
Car SCBCR63W55C024793 GM Colorado
Truck SZXYR63W55C165487 GM some model
'@ | Set-Content $tempfile -Encoding utf8
Option 1 - ConvertFrom-String
ConvertFrom-String can parse the data based off a template you provide as "training" data.
$template = @'
{Type*:abc} {VIN:ABCDEFGH123456789} {Model:some model}
{Type*:abcde} {VIN:sample} {Model:a some model}
'@
Now we apply the template. One aspect of this that makes it better than regex line by line, is you can use the fast -Raw
parameter of get content which is very fast.
get-content $tempfile -Raw |
ConvertFrom-String -TemplateContent $template -OutVariable data
Type VIN Model
---- --- -----
Car 1FMZU77E22UC18440 Honda CiviC
Car SCBCR63W55C024793 GM Colorado
Truck SZXYR63W55C165487 GM some model
If you don't need the type, just ignore it. Set it to your variables like
$data1,$data2 = $data.vin,$data.model
Contents of the variables $data1
and $data2
$data2
Honda CiviC
GM Colorado
GM some model
$data1
1FMZU77E22UC18440
SCBCR63W55C024793
SZXYR63W55C165487
Option 2 - Regex Get-Content / line by line
Get-Content $tempfile | foreach {
if($_ -match '(?<=[\s]{3})(\w*)[\s]{5}(.*)$')
{
[PSCustomObject]@{
Model = $matches.2
VIN = $matches.1
}
}
}
once again set the variables as you desire
$data1,$data2 = $data.vin,$data.model
Option 3 - Select-String (probably closer to Option 1 speed)
Select-String -path $tempfile -Pattern '(?<=[\s]{3})(\w*)[\s]{5}(.+)' -AllMatches | ForEach {
$_.Matches | foreach {
[PSCustomObject]@{
Model = $_.groups[2].value
VIN = $_.groups[1].value
}
}
} -OutVariable data