5

I have a data file of about 10-15 columns from which I want to extract specific columns. Some of the columns I know the exact column header and others I only know that the first two letters will always be "FC". How do I select only the columns where I know the column header and those that start with "FC"? Starting with just the "FC" columns, I have tried like this:

$myCSV = Import-CSV "mydata.txt" -Delimiter "`t"
$FCcols = $myCSV[0].psobject.Properties | foreach { $_.Name } | Where {$_ -match "FC"}
$myCSV | select $FCcols

But I just get an error:

Select-Object : Cannot convert System.Management.Automation.PSObject to one of 
the following types {System.String, System.Management.Automation.ScriptBlock}.
At line:3 char:16
+ $myCSV | select <<<<  $FCcols
    + CategoryInfo          : InvalidArgument: (:) [Select-Object], NotSupport 
   edException
    + FullyQualifiedErrorId : DictionaryKeyUnknownType,Microsoft.PowerShell.Co 
   mmands.SelectObjectCommand

Then, if I try:

$myCSV = Import-CSV "mydata.txt" -Delimiter "`t"
$FCcols = [System.Collections.ArrayList]@()
$myCSV[0].psobject.Properties | foreach { $_.Name } | Where {$_ -match "FC"} | %{$FCcols.Add($_)}
$myCSV | select $FCcols

I get the output I want except that it is in "column header : value" format, like this:

FC1839 : 0
FC1842 : 1
FC1843 : 6
FC1844 : 12
FC1845 : 4

FC1839 : 0
FC1842 : 0
FC1843 : 19
FC1844 : 22
FC1845 : 14

I am probably just missing something simple, but how do I get to the point that I am able to select these matching columns and then output them to another .txt file (without the header : value format)?

G42
  • 9,791
  • 2
  • 19
  • 34
Tero Isotalo
  • 53
  • 1
  • 1
  • 6
  • 3
    Why not just `$myCSV |Select *FC*`? – Mathias R. Jessen Jul 26 '17 at 13:56
  • What you have would work if you use $myCSV | select $FCcols.Name but Mathias is right... – bunzab Jul 26 '17 at 14:01
  • @bunzab: Indeed, Mathias' approach is the best solution. `$FCcols.Name` wouldn't work, because `$FCcols` is an array of _strings_ (no `.Name` property). In fact, the code works as-is in PSv3+ and only fails in PSv2, due to what I presume is a bug. – mklement0 Jul 26 '17 at 14:21

4 Answers4

4

First things first: Mathias R. Jessen's helpful tip not only solves your problem, but significantly simplifies the approach (and also works in PSv2):

$myCSV | Select-Object FC*

The (implied) -Property parameter supports wildcard expressions, so FC* matches all property (column names) that start with FC.

As for the output format you're seeing: Because you're selecting 5 properties, PowerShell defaults to implicit Format-List formatting, with each property name-value pair on its own line.

To fix this display problem, pipe to Format-Table explicitly (which is what PowerShell would do implicitly if you had selected 4 or fewer properties):

$myCSV | Select-Object FC* | Format-Table

To re-export the results to a CSV (TSV) file:

Import-Csv mydata.txt -Delimiter "`t" | Select-Object FC* | 
  Export-Csv myresults.txt -Encoding Utf8 -Delimiter "`t" -NoTypeInformation

To do so without a header line:

Import-Csv mydata.txt -Delimiter "`t" | Select-Object FC* | 
  ConvertTo-Csv -Delimiter "`t" -NoTypeInformation | Select-Object -Skip 1 |
    Set-Content myresults.txt -Encoding Utf8

As for your specific symptom:

The problem occurs only in PSv2, and it smells like a bug to me.

The workaround is make your column-name array a strongly typed string array ([string[]]):

[string[]] $FCcols = $myCSV[0].psobject.Properties | % { $_.Name } | ? { $_ -match '^FC' }

Note that, for brevity, I've used built-in alias % in lieu of ForEach-Object and ? in lieu of Where-Object.
Also note that the regex passed to -match was changed to ^FC to ensure that only columns that start with FC are matched.


Your code works as-is in PSv3+, but can be simplified:

$FCcols = $myCSV[0].psobject.Properties.Name -match "^FC"

Note how .Name is applied directly to .psobject.Properties, which in v3+ causes the .Name member to be invoked on each item of the collection, a feature called member-access enumeration.

mklement0
  • 382,024
  • 64
  • 607
  • 775
1

I would use Get-Member to get your columns, something like this:

$myCSV = Import-CSV "mydata.txt" -Delimiter "`t"
$myCSV | select ($myCSV | gm -MemberType NoteProperty | ? {$_.Name -match 'FC'}).Name
Dave Sexton
  • 10,768
  • 3
  • 42
  • 56
  • Thank you for this response. While it didn't fully solve my issue, I have been able to use it elsewhere. A very useful technique. – Tero Isotalo Aug 18 '17 at 06:02
0

Mathias's helpful comment is best way to go for selecting; simple and elegant - dind't know it was an option.

$myCSV | Select *FC*,ColumnIKnowTheNameOf

I believe you need to add Export-Csv to answer your last question. Here's another approach I'd already worked on that makes use of Get-Member and NoteProperty if you need to interrogate csv/similar objects in future.

$myCSV = Import-CSV "mydata.txt" -Delimiter "`t"

# you can get the headings by using Get-Member and Selecting NoteProperty members.
$FCcols = $myCSV |
            Get-Member |
            Where-Object {$_.MemberType -eq "NoteProperty" -and $_.Name -match "FC"} |
            Select-Object -ExpandProperty Name

# you add names to this array.
$FCcols += "ColumnIKnowTheNameOf"

$myCSV | Select-Object $FCcols

# to get a tab-delimited file similar to the one you imported, use Export-Csv
$myCSV | Export-csv "myresults.txt" -Delimiter "`t" -NoTypeInformation
G42
  • 9,791
  • 2
  • 19
  • 34
0

I finally came up with a "quick and dirty" solution which I'm disappointed to not have figured out earlier.

$myCSV = Import-CSV "mydata.txt" -Delimiter "`t" | select FC*
for ($i = 0; $i -lt $myCSV.count; $i++){
$writeline = ($myCSV[$i] | %{$_.PSObject.Properties | %{$_.Value}}) -join "`t"
ac "myresults.txt" $writeline -Encoding utf8}

The first line gives me the columns I want, then the for loop gets the value properties of each column and joins them as tabulated lines, finally each line is appended to a text file. This may not be the pedagogically correct way to achieve the result, but it works so far. Thanks to everyone for their input!

Tero Isotalo
  • 53
  • 1
  • 1
  • 6
  • How about a quick and clean solution that is also memory-efficient? `Import-Csv mydata.txt -Delimiter "\`t" | Select-Object FC* | Export-Csv myresults.txt -Encoding Utf8 -Delimiter "\`t" -NoTypeInformation` (If you really don't want the header line, combine `ConvertTo-Csv` with `Select-Object -Skip 1` and pipe to `Set-Content`). Also, it will be confusing to future readers that the only new thing your answer adds over the existing answers is the solution to a problem you didn't ask to be solved in your question (re-exporting to CSV/TSV). – mklement0 Aug 23 '17 at 13:05
  • The conversation ended up touching on many issues and thus, became potentially confusing, but exporting to a text file was indeed part of the initial goal. The output format was making that export difficult. I chose to include this solution as it solves the original task including the difficulties with formatting along with being able to select by column names beginning with "FC". – Tero Isotalo Aug 24 '17 at 13:12
  • I apologize if selecting this as a solution seems to negate the other input. That is not the intent. The input from Mathias R. Jessen, mklement0 and Dave Sexton were instrumental in my being able to solve this problem. I can change Mathias R Jessen's answer to be the official solution if that seems more appropriate. – Tero Isotalo Aug 24 '17 at 13:12
  • SO benefits from focused questions and matching answers that are of general interest and show best practices. Your answer doesn't explain your original symptom, extracts only the column _values_ using an un-idiomatic solution, and adds code that is incidental to your question, without saying so. (While the question could be changed, doing so after answers have been given can invalidate those, which is why it is discouraged.) I encourage you to accept the answer that best fits these criteria (given that one of the answers is mine, I won't make a recommendation). – mklement0 Aug 24 '17 at 13:51