
I apologise for asking such a basic question, as I am a beginner in scripting. I was wondering why I am getting different results from two different sources with the same formatting. Below are my samples:

file1.txt

 Id Name                      Members                      
122 RCP_VMWARE-DMZ-NONPROD    DMZ_NPROD01_111        
                              DMZ_NPROD01_113        
123 RCP_VMWARE-DMZ-PROD       DMZ_PROD01_110         
                              DMZ_PROD01_112         
124 RCP_VMWARE-DMZ-INT.r87351 DMZ_TEMPL_210.r        
                              DMZ_DECOM_211.r        
125 RCP_VMWARE-LAN-NONPROD    NPROD02_20             
                              NPROD03_21             
                              NPROD04_22             
                              NPROD06_24           

file2.txt

Id Name       Members             
4  HPUX_PROD HPUX_PROD.3
             HPUX_PROD.4
             HPUX_PROD.5

I'm trying to display the Name column, and with this code I'm able to display it correctly for file1.txt.

PS C:\Share> gc file1.txt |Select-Object -skip 1 | foreach-object { $_.split(" ")[1]} | ? {$_.trim() -ne "" }
RCP_VMWARE-DMZ-NONPROD
RCP_VMWARE-DMZ-PROD
RCP_VMWARE-DMZ-INT.r87351
RCP_VMWARE-LAN-NONPROD

However, with file2.txt I'm getting a different output.

PS C:\Share> gc .\file2.txt |Select-Object -skip 1 | foreach-object { $_.split(" ")[1]} | ? {$_.trim() -ne "" }
4

Changing the code to $_.split(" ")[2] helps to display the output correctly.

However, I would like to have a single piece of code that can be applied to both situations. I would appreciate it if you could help me sort this out. Thank you in advance.

3 Answers


This happens because the latter file has a different format.

When examined carefully, one notices there are two spaces between the 4 and HPUX_PROD strings:

Id Name       Members             
4  HPUX_PROD HPUX_PROD.3
^^^^

In the first file, there is a single space between the number and the string:

 Id Name                      Members                      
122 RCP_VMWARE-DMZ-NONPROD    DMZ_NPROD01_111 
  ^^^

How to fix the issue depends on whether you need to match both file formats, or whether one of them simply contains a typing error.
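
To make the difference visible, here is a small sketch that splits the first data row of each file on a single space and prints the fields by index. The two strings are copied from the question text as displayed (the real files may contain leading whitespace that is not visible here), and the variable names are only illustrative.

```powershell
# First data row of each file, as shown in the question (assumption: no hidden leading whitespace)
$row1 = '122 RCP_VMWARE-DMZ-NONPROD    DMZ_NPROD01_111'
$row2 = '4  HPUX_PROD HPUX_PROD.3'

# .Split(' ') cuts at every single space, so two consecutive spaces produce an empty field
$fields1 = $row1.Split(' ')
$fields2 = $row2.Split(' ')

"file1: [1] = '$($fields1[1])'"
"file2: [1] = '$($fields2[1])', [2] = '$($fields2[2])'"
```

With these inputs the Name value sits at index 1 for the first row but at index 2 for the second, because the extra space adds an empty element in between.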

vonPryz

Since this sort of looks like CSV output with spaces as the delimiter (but not quite), I think you could use ConvertFrom-Csv on this:

# read the file as string array, trim each line and filter only the lines that
# when split on 1 or more whitespace characters has more than one field
# then replace the spaces by a comma and treat it as CSV
# return the 'Name' column only
(((Get-Content -Path 'D:\Test\file1.txt').Trim() | 
    Where-Object { @($_ -split '\s+').Count -gt 1 }) -replace '\s+', ',' | 
    ConvertFrom-Csv).Name

Shorter, but because you are only after the Name column, this works too:

((Get-Content -Path 'D:\Test\file2.txt').Trim() -replace '\s+', ',' | ConvertFrom-Csv).Name -ne ''

Output for file1

RCP_VMWARE-DMZ-NONPROD
RCP_VMWARE-DMZ-PROD
RCP_VMWARE-DMZ-INT.r87351
RCP_VMWARE-LAN-NONPROD

Output for file2

HPUX_PROD

Theo

The existing answers are helpful, but let me try to break it down conceptually:

  • .Split(" ") splits the input string by each individual space character, whereas what you're looking for is to split by runs of (one or more) spaces, given that your column values can be separated by more than one space.

    • For instance, 'a  b'.Split(' ') (note the two spaces) results in 3 array elements - 'a', '', 'b' - because the empty string between the two spaces is considered an element too.
  • The .NET [string] type's .Split() method is based on verbatim strings or character sets and therefore doesn't allow you to express the concept of "one or more spaces" as a split criterion, whereas PowerShell's regex-based -split operator does.

    • Conveniently, -split's unary form (see below) has this logic built in: it splits each input string by any nonempty run of whitespace, while also ignoring leading and trailing whitespace, which in your case obviates the need for a regex altogether.

    • This answer compares and contrasts the -split operator with the string type's .Split() method, and makes the case for routinely using the former.
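
The points above can be checked with a short sketch that compares the three splitting approaches on the same string. The variable names are illustrative only.

```powershell
# Two spaces between the letters
$text = 'a  b'

# .Split(' ') treats every single space as a separator: 3 elements, the middle one empty
$viaMethod = $text.Split(' ')
$viaMethod.Count        # 3

# Binary -split with a regex splits on runs of whitespace: 2 elements
$viaRegex = $text -split '\s+'
$viaRegex.Count         # 2

# Unary -split does the same and also ignores leading and trailing whitespace
$viaUnary = -split "  $text  "
$viaUnary.Count         # 2
```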

Therefore, a working solution (for both input files) is:

Get-Content .\file2.txt | Select-Object -Skip 1 |
  Foreach-Object { if ($value = (-split $_)[1]) { $value } }

Note:

  • If the column of interest contains a value (at least one non-whitespace character), so must all preceding columns in order for the approach to work. Also, column values themselves must not have embedded whitespace (which is true for your sample input).

  • The if conditional both extracts the 2nd column value ((-split $_)[1]) and assigns it to a variable ($value = ), whose value then implicitly serves as a Boolean:

    • Any nonempty string is implicitly $true, in which case the extracted value is output in the associated block ({ $value }); conversely, an empty string results in no output.

    • For a general overview of PowerShell's implicit to-Boolean conversions, see the bottom section of this answer.
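
As a minimal sketch of the assignment-as-conditional idiom described above, the loop below runs the same test on a nonempty string, an empty string, and $null. The sample values and variable names are made up for illustration.

```powershell
$results = foreach ($candidate in 'HPUX_PROD', '', $null) {
  # The assignment's value doubles as the condition:
  # a nonempty string converts to $true, '' and $null convert to $false
  if ($value = $candidate) { "truthy: '$value'" } else { 'falsy' }
}

$results
# truthy: 'HPUX_PROD'
# falsy
# falsy
```

The $null case matters here because indexing past the end of an array, as (-split $_)[1] does for a line with a single token, yields $null rather than an error when strict mode is off.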

mklement0