
I have an application that generates hundreds of text log files that look like this:

DaemonReruns=2|

Phase=|

Log=false|
DS=LOG_4|
Schema=LOLYY|
DBMS=mssql|
Host=abc.XYz.com|
IDs=xxxxx,xxxx

I need to select the Host value from these files. I tried:

GC  C:\log_5.txt |
    Select-String -Pattern 'Host=\"([^\"]*)\"'

This gives no results. Any help?

Jon drew

6 Answers


There aren't any quotes in your example input. Try this regex:

get-content C:\log_5.txt | foreach {
    if ($_ -match 'Host=([^|]+)') {
        $Matches.1
    }
}

Note: This actually returns the host names, not just the line.

marsze
  • Cool, it works, but I'm not able to sort the results uniquely: `gci C:\logs | where { $_.Extension -like '*.txt' -or $_.Extension -like '*.proc' } | foreach { Get-Content $_.FullName } | foreach { if ($_ -match 'Host=([^|]+)') { $data = $Matches.1; $data | Sort-Object -Unique } }` – Jon drew Feb 26 '19 at 15:24
  • @Jondrew Put the sort at the very end of the pipeline: `... { $Matches.1 } } | sort -Unique` – marsze Feb 26 '19 at 15:30
  • It won't work: `foreach { if ($_ -match 'Host=([^|]+)') { $Matches.1 | sort -Unique } }` – Jon drew Feb 26 '19 at 15:40
  • @Jondrew `foreach { if ($_ -match 'Host=([^|]+)') { $Matches.1 } } | sort -Unique` – marsze Feb 26 '19 at 15:41
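Putting the comments above together, here is a minimal sketch of the whole pipeline. The two extensions come from the comment thread; the function name Get-LogHost and its -Directory parameter are made up for illustration. The key point is that Sort-Object -Unique sits after the last ForEach-Object, so it sees all host names at once rather than one at a time.

```powershell
# Hypothetical helper: collects unique Host values from all .txt/.proc files in a folder.
function Get-LogHost {
    param([string]$Directory)

    Get-ChildItem -Path $Directory -File |
        Where-Object { $_.Extension -in '.txt', '.proc' } |
        ForEach-Object { Get-Content -LiteralPath $_.FullName } |
        ForEach-Object {
            # Emit only the captured host name, not the whole line.
            if ($_ -match 'Host=([^|]+)') { $Matches[1] }
        } |
        Sort-Object -Unique   # sorting happens once, on the combined output
}

# Usage (assumes the logs are in C:\logs):
# Get-LogHost -Directory 'C:\logs'
```

Note that Sort-Object compares strings case-insensitively by default, so abc.XYz.com and ABC.xyz.com would count as the same host.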

marsze's helpful answer fixes the problem with your regex and uses a ForEach-Object (foreach) call to extract and return matches via the -match operator and the automatic $Matches variable.

Here's a concise (and better-performing) alternative using the switch statement:

PS> switch -Regex -File C:\log_5.txt { 'Host=([^|]+)' { $Matches[1] } }
abc.XYz.com

Note, however, that -File doesn't accept wildcard-based paths, so in order to process multiple files, you'll have to loop over them via Get-ChildItem or Convert-Path.
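A minimal sketch of such a loop, assuming the logs are .txt files in a single folder (the function name Get-HostFromLogs and its -Directory parameter are hypothetical, chosen only for this example):

```powershell
# Hypothetical helper: runs switch -Regex -File once per file found by Get-ChildItem.
function Get-HostFromLogs {
    param([string]$Directory)

    foreach ($file in Get-ChildItem -Path $Directory -Filter *.txt -File) {
        # -File needs one literal path per call, so pass each file's full path.
        switch -Regex -File $file.FullName {
            'Host=([^|]+)' { $Matches[1] }
        }
    }
}

# Usage (assumes the logs are in C:\logs):
# Get-HostFromLogs -Directory 'C:\logs' | Sort-Object -Unique
```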

mklement0

((Get-Content -Path .\log_5.txt) -match 'Host=') -replace 'Host=',''

This returns every line that contains Host=, with the Host= prefix removed. Note that the trailing | stays in the output (abc.XYz.com|).

TobyU

Just for fun ... the super-fast solution:

$regex = [Regex]::new('Host=([^|]+)', 'Compiled, IgnoreCase, CultureInvariant')
& {foreach ($line in [IO.File]::ReadLines("C:\log_5.txt")) {
    $m = $regex.Match($line)
    if ($m.Success) {
        $m.Groups[1].Value
    }
}}
marsze
  • Btw, it seems that what slows `switch -Regex` down relative to `[regex].Match()` is the additional effort of translating the match information into the `$Matches` hashtable. – mklement0 Feb 26 '19 at 15:07
  • @mklement0 Yeah PS is still a scripting language made to be easily usable, not fast. If I write the same in C# code, compile it with `Add-Type` and call that, it's a few dozen times faster. – marsze Feb 26 '19 at 15:20
  • Nice C# solution; another brief tangent: here's a fun pitfall with `switch -File`: https://github.com/PowerShell/PowerShell/issues/8988 – mklement0 Feb 26 '19 at 16:03

If your logs are huge, the one-time overhead of Add-Type could be worth it, because the matching itself then runs much faster:

Add-Type '
using System.IO;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace PowerShell
{
    public class Tools
    {
        static Regex regex = new Regex(@"Host=([^|]+)", RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
        public static IEnumerable<string> GetHosts(string path)
        {
            foreach(var line in File.ReadLines(path))
            {
                var matches = regex.Match(line);
                if (matches.Success)
                {
                    yield return matches.Groups[1].Value;
                }
            }
        }
    }
}'

# call this for each log file (very fast)
[PowerShell.Tools]::GetHosts("C:\log_5.txt")
marsze

Other answers have the regex side covered well enough. Whenever I see little logs like this I always think about ConvertFrom-StringData which

converts a string that contains one or more key and value pairs into a hash table.

From: help ConvertFrom-StringData

In its basic form we just do something like this:

$pairs = Get-Content -Raw -Path $pathToFile | ConvertFrom-StringData
[pscustomobject]$pairs

Which would give you a PowerShell object that you can interact with easily!

DS           : LOG_4|
Schema       : LOLYY|
IDs          : xxxxx,xxxx
Log          : false|
DBMS         : mssql|
Host         : abc.XYz.com|
Phase        : |
DaemonReruns : 2|

You probably don't need the trailing pipes. You can remove those with a regex or with simpler string methods.

[pscustomobject](Get-Content -Path $pathToFile | ForEach-Object{$_.TrimEnd("|")} | Out-String | ConvertFrom-StringData)

[pscustomobject]((Get-Content -Raw -Path $pathToFile) -replace "(?m)\|\r?$" | ConvertFrom-StringData)

In any case this gives you more options as to how you need to deal with your data.
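For the original question, picking out Host then becomes a plain key lookup. A minimal sketch, using an inline sample instead of a file so that it runs on its own (the variable names are arbitrary):

```powershell
# Sample input copied from the question; in practice this would come from Get-Content -Raw.
$text = @'
DBMS=mssql|
Host=abc.XYz.com|
IDs=xxxxx,xxxx
'@

# Strip the trailing pipe on each line (the \r? copes with Windows line endings),
# then let ConvertFrom-StringData build a hashtable of key/value pairs.
$pairs = ($text -replace '(?m)\|\r?$') | ConvertFrom-StringData

$pairs['Host']   # look up a single value by key
```

One caveat: ConvertFrom-StringData treats backslashes in values as escape characters, so logs that contain Windows paths would need the backslashes doubled first.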

Matt