33

I am having some issues trying to match a certain config block (multiple ones) from a file. Below is the block that I'm trying to extract from the config file:

ap71xx 00-01-23-45-67-89
 use profile PROFILE
 use rf-domain DOMAIN
 hostname ACCESSPOINT
 area inside
!

There are multiple ones just like this, each with a different MAC address. How do I match a config block across multiple lines?

David Clarke
Scott
  • Is it always `ap71xx`? Also, will `xx` always be numbers or something like that? – Nick Sep 24 '12 at 21:48
  • There are different "beginning" points, possibly, but the data that I'm looking to extract will always start with "ap71xx." And the "xx" aren't wildcards, they are literally two x's. – Scott Sep 24 '12 at 23:28

5 Answers

67

The first problem you may run into is that in order to match across multiple lines, you need to process the file's contents as a single string rather than by individual line. For example, if you use Get-Content to read the contents of the file then by default it will give you an array of strings - one element for each line. To match across lines you want the file in a single string (and hope the file isn't too huge). You can do this like so:

$fileContent = [io.file]::ReadAllText("C:\file.txt")

Or, in PowerShell 3.0 and later, you can use Get-Content with the -Raw parameter:

$fileContent = Get-Content c:\file.txt -Raw
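
To see the difference between the two ways of reading, here is a minimal sketch. The temp file name and its contents are made up for the demonstration; they are not part of the original question.

```powershell
# Hypothetical sample file, created only for this demonstration
$path = Join-Path ([IO.Path]::GetTempPath()) 'ap-sample.cfg'
$sample = "ap71xx 00-01-23-45-67-89`n use profile PROFILE`n!`n"
[IO.File]::WriteAllText($path, $sample)

$asLines  = Get-Content $path                 # default: one string per line
$asString = [IO.File]::ReadAllText($path)     # the whole file as one string

$asLines.Count                                # 3
$asString.GetType().Name                      # String

# No single element of the array contains a line break, so nothing matches
[bool]($asLines  -match 'ap71xx.*\n use')     # False
[bool]($asString -match 'ap71xx.*\n use')     # True
```

One thing to watch: .NET methods such as ReadAllText resolve relative paths against the process working directory, which can differ from the current PowerShell location, so passing a full path is the safer choice.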

Then you need to specify regex options so the pattern can match across line terminators:

  • SingleLine mode, "s" (. matches any character, including line feed)
  • Multiline mode, "m" (^ and $ also match at embedded line terminators)
  • Both can be set inline with (?smi) - note the "i" is to ignore case
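
What each flag changes on its own can be checked with a short sketch like the following. The sample string is a shortened, made-up version of the block from the question.

```powershell
$text = "ap71xx 00-01-23-45-67-89`n use profile PROFILE`n!"

# Without (?s), '.' stops at the line feed, so the match cannot reach the '!'
$text -match 'ap71xx.*!'          # False
# With (?s), '.' also matches line feeds
$text -match '(?s)ap71xx.*!'      # True

# Without (?m), '^' only matches at the very start of the string
$text -match '^ use'              # False
# With (?m), '^' also matches right after each line feed
$text -match '(?m)^ use'          # True

# -match is already case-insensitive, so (?i) only matters for
# case-sensitive callers such as -cmatch or [regex]::Matches
$text -cmatch 'AP71XX'            # False
$text -cmatch '(?i)AP71XX'        # True
```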

e.g.:

C:\> $fileContent | Select-String '(?smi)([0-9a-f]{2}(-|\s*$)){6}.*?!' -AllMatches |
        Foreach {$_.Matches} | Foreach {$_.Value}

00-01-23-45-67-89
 use profile PROFILE
 use rf-domain DOMAIN
 hostname ACCESSPOINT
 area inside
!
00-01-23-45-67-89
 use profile PROFILE
 use rf-domain DOMAIN
 hostname ACCESSPOINT
 area inside
!

Use the Select-String cmdlet to do the search, because with -AllMatches it outputs every match, whereas the -match operator stops after the first one. That makes sense: -match is a Boolean operator that only needs to determine whether there is a match at all.
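
The difference is easy to check on a two-block sample. This is only a sketch: the sample text is invented, and [regex]::Matches is shown as an alternative that the answer itself does not use.

```powershell
$text = "ap71xx 00-01-23-45-67-89`n hostname AP1`n!`n" +
        "ap71xx 00-01-23-45-67-8A`n hostname AP2`n!`n"
$pattern = '(?s)ap71xx.*?!'

# -match returns True/False; $Matches only ever holds the first match
$text -match $pattern | Out-Null
$Matches[0]

# Select-String -AllMatches exposes every match through .Matches
$all = $text | Select-String $pattern -AllMatches | ForEach-Object { $_.Matches }
$all.Count                          # 2

# The .NET regex class gives the same result without a pipeline
$viaRegex = [regex]::Matches($text, $pattern)
$viaRegex.Count                     # 2
```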

TessellatingHeckler
Keith Hill
  • Excellent! Thank you for your help! That did the trick. Earlier I was playing around with the regex string(s) I didn't escape the "!" like you have above...so that was another problem! Thank you for your help! – Scott Sep 24 '12 at 23:40
  • Thank you, the `[io.file]::ReadAllText()` method was what I was missing – David Clarke May 08 '15 at 00:10
  • or `| % matches | % value` – js2010 Sep 02 '21 at 04:31
6

In case this may still be of value to someone and depending on the actual requirement, the regex in Keith's answer doesn't need to be that complicated. If the user simply wants to output each block the following will suffice:

$fileContent = [io.file]::ReadAllText("c:\file.txt")
$fileContent |
    Select-String '(?smi)ap71xx[^!]+!' -AllMatches |
    %{ $_.Matches } |
    %{ $_.Value }

The regex ap71xx[^!]+! should also perform better, and .* is best avoided here because a greedy match can run past the first "!" and continue to the last one in the file. The pattern [^!]+! matches one or more characters other than an exclamation mark (line breaks included), followed by an exclamation mark.
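
A small sketch of the kind of unexpected result a greedy .* can produce, using an invented two-block sample:

```powershell
$text = "ap71xx 00-01-23-45-67-89`n hostname AP1`n!`n" +
        "ap71xx 00-01-23-45-67-8A`n hostname AP2`n!`n"

# Greedy '.*' with (?s) runs to the LAST '!' and swallows both blocks
$greedy = [regex]::Matches($text, '(?s)ap71xx.*!')
$greedy.Count                       # 1

# The negated class stops at the first '!' after each 'ap71xx';
# it matches line feeds even without (?s)
$negated = [regex]::Matches($text, 'ap71xx[^!]+!')
$negated.Count                      # 2
$negated[0].Value
```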

If the start of the block isn't required in the output, the updated script is:

$fileContent |
    Select-String '(?smi)ap71xx([^!]+!)' -AllMatches |
    %{ $_.Matches } |
    %{ $_.Groups[1] } |
    %{ $_.Value }

Groups[0] contains the whole matched string, Groups[1] will contain the string match within the parentheses in the regex.
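
As a quick check of what each group holds, here is a sketch on a single invented block:

```powershell
$text = "ap71xx 00-01-23-45-67-89`n hostname AP1`n!"
$m = [regex]::Match($text, 'ap71xx([^!]+!)')

$m.Groups[0].Value                         # whole match, including 'ap71xx'
$m.Groups[1].Value                         # only the parenthesised part

$m.Groups[0].Value.StartsWith('ap71xx')    # True
$m.Groups[1].Value.StartsWith('ap71xx')    # False
```

Note that Groups[1] begins with the space that followed ap71xx; call .Trim() on the value if that matters.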

If $fileContent isn't required for any further processing, the variable can be eliminated:

[io.file]::ReadAllText("c:\file.txt") |
    Select-String '(?smi)ap71xx([^!]+!)' -AllMatches |
    %{ $_.Matches } |
    %{ $_.Groups[1] } |
    %{ $_.Value }
David Clarke
2

This regex will search for the text ap followed by any number of characters, including new lines, up to and including the first !:

(?si)(ap).+?\!{1}

So I was a little bored. I wrote a script that will break up the text file as you described (as long as it only contains the lines you displayed). It might work with other random lines, as long as they don't contain the key words: ap, profile, domain, hostname, or area. It will import them, and check line by line for each of the properties (MAC, Profile, domain, hostname, area) and place them into an object that can be used later. I know this isn't what you asked for, but since I spent time working on it, hopefully it can be used for some good. Here is the script if anyone is interested. It will need to be tweaked to your specific needs:

$Lines = Get-Content "c:\test\test.txt"
$varObjs = @()
$varLast = 0
for ($num = 0; $num -lt $Lines.Count; $num = $varLast) {
    #Skips "!" separators, blank lines and indented lines; only an unindented line starts a block
    if ($Lines[$num] -match "!" -or $Lines[$num] -notmatch "^\S") {
        $varLast++
        continue
    }
    $b = 0
    $varObj = New-Object System.Object
    #Walks the block until its closing "!" (or the end of the file)
    while (($num + $b) -lt $Lines.Count -and $Lines[$num + $b] -notmatch "!") {
        #Checks line by line to see what it matches, adds to the $varObj when it finds what it wants.
        $line = $Lines[$num + $b]
        $words = [regex]::Split($line, "\s")
        if ($line -match "^ap") { $varObj | Add-Member -MemberType NoteProperty -Name Mac -Value $words[1] }
        if ($line -match "profile") { $varObj | Add-Member -MemberType NoteProperty -Name Profile -Value $words[3] }
        if ($line -match "domain") { $varObj | Add-Member -MemberType NoteProperty -Name rf-domain -Value $words[3] }
        if ($line -match "hostname") { $varObj | Add-Member -MemberType NoteProperty -Name hostname -Value $words[2] }
        if ($line -match "area") { $varObj | Add-Member -MemberType NoteProperty -Name area -Value $words[2] }
        $b++
    } #end While
    #Adds the $varObj to $varObjs for future use
    $varObjs += $varObj
    #Resumes on the line after the closing "!"
    $varLast = $num + $b + 1
}#End for ($num = 0; $num -lt $Lines.Count; $num = $varLast)
#displays the $varObjs
$varObjs
James Wiseman
Nick
1

To me, a very clean and simple approach is to use a multiline block regex with named captures, like this:

# Based on this text configuration:
$configurationText = @"
ap71xx 00-01-23-45-67-89
 use profile PROFILE
 use rf-domain DOMAIN
 hostname ACCESSPOINT
 area inside
!
"@

# We can build a multiline regex block with the strings to be captured.
# Here, I am using the regex '.*?', which roughly means 'capture anything, but as little as possible'.
# A more specific regex can be defined for each field to capture.
# ( ) in the regex defines a group
# ?<> names a group
$regex = @"
(?<userId>.*?) (?<userCode>.*?)
 use profile (?<userProfile>.*?)
 use rf-domain (?<userDomain>.*?)
 hostname (?<hostname>.*?)
 area (?<area>.*?)
!
"@

# Let's see if this matches!
if ($configurationText -match $regex)
{
    # It does!
    Write-Host "Config text is successfully matched, here are the matches:"
    $Matches
}
else
{
    Write-Host "Config text could not be matched."
}

This script outputs the following:

PS C:\Users\xdelecroix> C:\FusionInvest\powershell\regex-capture-multiline-stackoverflow.ps1
Config text is successfully matched, here are the matches:

Name                           Value                                                                                    
----                           -----                                                                                    
hostname                       ACCESSPOINT                                                                              
userProfile                    PROFILE                                                                                  
userCode                       00-01-23-45-67-89                                                                        
area                           inside                                                                                   
userId                         ap71xx                                                                                   
userDomain                     DOMAIN                                                                                   
0                              ap71xx 00-01-23-45-67-89...

For more flexibility, you can use Select-String instead of -match, but this is not really important here, in the context of this sample.
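
As a sketch of that Select-String variant, the following turns several blocks into objects. The two-block sample text, the shortened group names, and the use of \S+ and \r?\n (to tolerate either line-ending style) are my assumptions, not part of the answer above.

```powershell
# Invented two-block sample
$configurationText =
    "ap71xx 00-01-23-45-67-89`n use profile P1`n use rf-domain D1`n hostname AP1`n area inside`n!`n" +
    "ap71xx 00-01-23-45-67-8A`n use profile P2`n use rf-domain D2`n hostname AP2`n area outside`n!`n"

# One named group per field; \r?\n accepts Windows and Unix line endings
$regex = 'ap71xx (?<mac>\S+)\r?\n use profile (?<profile>\S+)\r?\n' +
         ' use rf-domain (?<domain>\S+)\r?\n hostname (?<hostname>\S+)\r?\n' +
         ' area (?<area>\S+)\r?\n!'

$accessPoints = $configurationText |
    Select-String $regex -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object {
        # Build one object per matched block from its named groups
        New-Object PSObject -Property @{
            Mac      = $_.Groups['mac'].Value
            Profile  = $_.Groups['profile'].Value
            Domain   = $_.Groups['domain'].Value
            Hostname = $_.Groups['hostname'].Value
            Area     = $_.Groups['area'].Value
        }
    }

$accessPoints
```

The resulting objects can then be filtered or exported like any others, for example with Where-Object or Export-Csv.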

xav
0

Here's my take. If you don't need a regex, you can use -like or .Contains(). The question never says what the search pattern is. Here's an example with a Windows text file.

$file = (get-content -raw file.txt) -replace "`r"  # avoid the line ending issue

$pattern = 'two
three
f.*' -replace "`r"

# just showing what they really are
$file -replace "`r",'\r' -replace "`n",'\n'
$pattern -replace "`r",'\r' -replace "`n",'\n'

$file -match $pattern

$file | select-string $pattern -quiet 
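
For the non-regex options mentioned at the top, a minimal sketch could look like this. The sample text is invented and stands in for the contents of file.txt.

```powershell
# Stand-in for the file contents, with Windows line endings stripped as above
$file = "one`r`ntwo`r`nthree`r`nfour`r`n" -replace "`r"

$needle = "two`nthree"

# .Contains() is a literal, case-sensitive substring test
$file.Contains($needle)           # True
$file.Contains("TWO`nTHREE")      # False

# -like is a wildcard match and ignores case by default
$file -like "*$needle*"           # True
$file -like "*TWO`nTHREE*"        # True
```

With -like, the characters *, ?, [ and ] in the search text are treated as wildcards, so .Contains() is the safer choice for arbitrary literal text.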
js2010