0

I need some help splitting a column into multiple columns in the CSV file using powershell.

These are outlook task completion emails, and we are gathering task statistics. I need to extract the date completed and actual work data from the body column.

Here is one entry for the csv file

Subject,Body,From: (Name)
Task Completed: lprab: 160323-092321 - PCMS:Review/SGCE:Révision,"

Jon York


-----Original Task-----
Subject: lprab: 160323-092321 - PCMS:Review/SGCE:Révision
Priority: Normal

Start date: Wed 2016-03-23
Due date: Wed 2016-03-23

Status: Completed
% Complete: 100%
Date completed: Wed 2016-03-23
Actual work: 15 minutes

Requested by: Internet Content-PAB / Contenu d'Internet-DGAP

------------

","York, Jonathan"

So far, I've managed to split the body at the date completed part using this code here

ForEach-Object {

    $_.Body,$tempDateCompleted=$_.Body -split "Date completed: ",2
    $_ | Select-Object *,@{Name="DateCompleted";Expression={$tempDateCompleted}}

} #| export-csv

But I can't then get the actual date afterwards because doing

ForEach-Object {

    $_.Body,$tempDateCompleted=$_.Body -split "Date completed: ",2
    $_ | Select-Object *,@{Name="DateCompleted";Expression={$tempDateCompleted}}

    $_.DateCompleted,$tempActualWork=$_.Body -split "Actual work: ",2
    $_ | Select-Object *,@{Name="ActualWork";Expression={$tempActualWork}}


} #| export-csv

give me this error

Property 'DateCompleted' cannot be found on this object; make sure it exists and is settable.
At line:82 char:8
+     $_. <<<< DateCompleted,$tempActualWork=$_.Body -split "Actual work: ",2
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : PropertyAssignmentException

My ultimate goal would be to have the following output

Subject, From, DateCompleted, ActualWork "Task name","Jon York","2016-03-23","15 minutes"

Thanks!

JonYork
  • 1,223
  • 8
  • 31
  • 52
  • So every entry starts and stops with `Subject,Body,From: (Name)`. This is not a CSV at all from the looks of what you posted. Ahh Ok Is the body a multi-lined field then which is why it looks like it does. – Matt Apr 08 '16 at 13:43
  • Would a bunch of regular expressions work better? – JonYork Apr 08 '16 at 14:00
  • Sort of.... Just wait I got this. Just finishing my test and answer. – Matt Apr 08 '16 at 14:05

3 Answers3

1

I would love to introduce you to ConvertFrom-StringData that will convert that body into a hash table that we then will convert into a proper object for you to output.

 $source = Import-Csv C:\temp\text.csv
 $data = Foreach($row in $source){
    $newHash =  $row.Body -split "`r`n" | Where-Object{$_ -match ":"} | Foreach-object{
        $_ -replace "^(.*?):",'$1='
    } | Out-String |ConvertFrom-StringData

    $newHash.Subject = $row.Subject
    $newHash.From = $row.'From: (Name)'

    New-Object -TypeName pscustomobject -Property $newHash
 } 

This takes the $_.Body on turns it into a string array split on newlines. Then we filter out any entry that does not have a ":". For each of those entries we replace the first colon with an equals sign (as required by ConvertFrom-StringData). This works well so you don't have to build a bunch of regex patterns to match each property you want thereby making it easier to add and remove properties in your output. We build on that hashtable with the other 2 entries from the original row.

Now $data contains the complex object containing all the details of the entry.

% Complete     : 100%
Date completed : Wed 2016-03-23
Start date     : Wed 2016-03-23
From           : York, Jonathan
Priority       : Normal
Status         : Completed
Subject        : Task Completed: lprab: 160323-092321 - PCMS:Review/SGCE:Révision
Actual work    : 15 minutes
Due date       : Wed 2016-03-23
Requested by   : Internet Content-PAB / Contenu d'Internet-DGAP

Now you just need to do stuff with it like your export. Since some of the fields have spaces we need to quote those properties.

$data | select Subject,"Date completed","Actual work",From | Export-CSV -NoTypeInformation $path
Matt
  • 45,022
  • 8
  • 78
  • 119
  • So if I then wanted to output a CSV with just Subject, Date completed, Actual Work and From, what would that bit of code look like? – JonYork Apr 08 '16 at 14:20
  • @JonYork Added a little more on the bottom to help with that. – Matt Apr 08 '16 at 14:41
0

_If you want to pull out the date completed you can use a RegEx pattern, something like this:

$_.body -match 'Date completed:.*\n' | Out-Null;$matches[0] -replace "Date completed: ",""

Keep in mind that when you use -match the results are stored in a variable $matches. In this example, you'd get (for that line): Wed 2016-03-23

Then you can assign that wherever you want. It only selects the line with Date completed: up until the line break, and then you use string formatting tools to remove the Date completed bit. You could use similar logic for your other lines and assign them as needed.

Edit: You can also use a lookbehind to avoid using the string formatting, it just hurts my head because I don't use it too often:

$_.body -match '(?<=Date completed: ).*\n' | Out-Null;$matches[0]

Edit 2:

Since you already are using a foreach, you could do something like this:

#Create a collection you can easily add to
$Results = New-Object System.Collections.ArrayList

#...read the file

ForEach-Object {

    $obj = @{
        Subject = $e.Subject
        From = $_.'From: (Name)'
    }

    if ($_.Body -match '(?<=Actual work: ).*\n') {$obj["Actual Work"] = $matches[0]} 
      else {$obj["Actual Work"] = $null}

    if ($_.Body -match '(?<=Date Completed: ).*\n') {$obj["Date Completed"] = $matches[0]}
      else {$obj["Date Completed"] = $null}

    $Results.Add($Obj) | Out-Null

}

$Results | Export-CSV

Your result objects would look something like this:

Name              Value                                                                                        
----              -----                                                                                        
Subject           Task Completed: l...                             
From              York, Jonathan                                                                               
Date Completed    Wed 2016-03-23...                                                                            
Actual Work       15 minutes...   

If you were to go about it this way, use an ArrayList so that you can add to the collection without having to completely rebuild it each time (because your standard @() collections are a fixed size and have to be copied to a new collection of size+1 each time you do a += type add) and pipe your call to add to the collection to Out-Null to remove the output (the index you just added the item to).

When you do the -match you get a boolean true or false response. If you wrap it in the if/else statement like this, you can handle what happens if you don't find a result - in this case just leaving the value empty.

Matt's answer is also a great way to go about it too!

Community
  • 1
  • 1
squid808
  • 1,430
  • 2
  • 14
  • 31
0

Based on the info everyone gave me, it wasn't working nearly as well as I was hoping. It may be a bit of a long winded solution, but it works well, and predictably.

I needed to pipe the imported content as an out-string to get multiple regular expressions to work as expected, and to pipe the string back out as a csv file again, I needed to encode it as Ascii.

$text = (Get-Content c:\Tools\export.csv) | out-string

$regex = "(?sm),""\s.*?Date completed: "
$replace = ',"'
$output = [regex]::replace($text,$regex,$replace)

$regex = "[\r\n]+Actual work: "
$replace = '","'
$output = [regex]::replace($output,$regex,$replace)

$regex = "(?sm)Requested by:\s.*?"","""
$replace = ',"'
$output = [regex]::replace($output,$regex,$replace)

$regex = '"Mon '
$replace = '"'
$output = [regex]::replace($output,$regex,$replace)

$regex = '"Tue '
$replace = '"'
$output = [regex]::replace($output,$regex,$replace)

$regex = '"Wed '
$replace = '"'
$output = [regex]::replace($output,$regex,$replace)

$regex = '"Thu '
$replace = '"'
$output = [regex]::replace($output,$regex,$replace)

$regex = '"Fri '
$replace = '"'
$output = [regex]::replace($output,$regex,$replace)

$regex = "(?sm) minute(.)\s.*?,"""
$replace = '","minute","'
$output = [regex]::replace($output,$regex,$replace)

$regex = "(?sm) hour(.)\s.*?,"""
$replace = '","hour","'
$output = [regex]::replace($output,$regex,$replace)

$regex = "(?sm) day(.)\s.*?,"""
$replace = '","day","'
$output = [regex]::replace($output,$regex,$replace)

$regex = "(?sm) week(.)\s.*?,"""
$replace = '","week","'
$output = [regex]::replace($output,$regex,$replace)

$regex = "(?sm),""\/O=CO\S.*?"",,""Normal"""
$replace = ''
$output = [regex]::replace($output,$regex,$replace)

$regex = "(?sm)""Task Declined\S.*?""Task"
$replace = '"Task'
$output = [regex]::replace($output,$regex,$replace)

$regex = "(?sm)""Subject\S.*?Sensitivity"""
$replace = '"Subject","Date Completed","Time","Factor","Developper"'
$output = [regex]::replace($output,$regex,$replace)

$output | Out-file c:\Tools\output.csv -Encoding ascii

Now the sample entry in the csv file looks like this

"Subject","Date Completed","Time","Factor","Developper"
"Task Completed: lprab: 160323-092321 - PCMS:Review/SGCE:R?vision","2016-03-23","15","minute","York, Jonathan"
JonYork
  • 1,223
  • 8
  • 31
  • 52