
I'm extracting a date from a file using the code below. The value looks like a string, but calling ParseExact on it (see the commented code) fails. If I create a variable with the same value and call ParseExact on that, it works. While debugging, both values appear to be identical, but they can't be, since one works and the other doesn't.

    function getDateFromFile($file) {
        $shellObject = New-Object -ComObject Shell.Application
        $directoryObject = $shellObject.NameSpace( $file.Directory.FullName )
        $fileObject = $directoryObject.ParseName( $file.Name )

        $property = 'Date taken'
        for(
            $index = 5;
            $directoryObject.GetDetailsOf( $directoryObject.Items, $index ) -ne $property;
            ++$index ) { }

        $value = $directoryObject.GetDetailsOf( $fileObject, $index )
        $format = "dd/MM/yyyy H:mm"

        # When debugging, $value appears to be -> "20/04/2021 14:04"
        # The below fails -> Exception calling "ParseExact" with "3" argument(s): "String was not recognized as a valid DateTime."
        $date01 = [System.DateTime]::ParseExact($value, $format, $null)

        # The below works fine and $date02 becomes -> 20 April 2021 14:04:00
        $tmp = "20/04/2021 14:04"
        $date02 = [System.DateTime]::ParseExact($tmp, $format, $null)

        return $date1 = [System.DateTime]::ParseExact($tmp, $format, $null)
    }

I've added comments to the snippet above. How do you debug PowerShell to tell what's wrong (other than with standard breakpoints in the PowerShell ISE)? If you know what's wrong, then as well as pointing out how to fix it, it would be good to know how you worked it out.

Update: this is similar to the question [How can I get programmatic access to the "Date taken" field of an image or video using powershell?](https://stackoverflow.com/questions/6834259/how-can-i-get-programmatic-access-to-the-date-taken-field-of-an-image-or-video), as I had already used some of the code from an answer there. But as mentioned in the comments below and in this question, the returned string isn't usable as-is, and I needed to know how to debug it.

delp
  • what is the type of `$value`? what does `$value | Get-Member` return? maybe you have to convert it to a string first? – Guenther Schmitz Apr 23 '21 at 08:46
  • Does this answer your question? [How can I get programmatic access to the "Date taken" field of an image or video using powershell?](https://stackoverflow.com/questions/6834259/how-can-i-get-programmatic-access-to-the-date-taken-field-of-an-image-or-video) – Guenther Schmitz Apr 23 '21 at 08:48
  • There might be a better way to achieve your overall goal, but for your specific question about the strings, you can try dumping the hex values of each character in both strings (or even just their lengths) and see how they compare - you might have a non-printing character embedded somewhere... ```write-host( $tmp.ToCharArray() | % { [int] $_ })``` gives ```50 48 47 48 52 47 50 48 50 49 32 49 52 58 48 52``` on my machine - try that with ```$value``` as well and see if you get the same result. – mclayton Apr 23 '21 at 08:54
  • Also, try ```write-host $value.GetType().FullName``` - it might be serializing to a literal string like ```Shell.DateValue``` (or whatever) when the code is running, but displaying a nicely formatted date when debugging. – mclayton Apr 23 '21 at 08:58
  • @GuentherSchmitz If I do the Get-Member it tells me it's a TypeName: System.String, this is the same as the Get-Member for $tmp. Yes that link was helpful, that's where I got the above code from :) – delp Apr 23 '21 at 09:28
  • @mclayton That's great. Yes if I do that then it shows me different values. So the $value is -> 8206 50 48 47 8206 48 52 47 8206 50 48 50 49 32 8207 8206 49 52 58 48 52 and $tmp is -> 50 48 47 48 52 47 50 48 50 49 32 49 52 58 48 52 – delp Apr 23 '21 at 09:29
  • The extra unicodes are 8206 and 8207 (which is a marker for left to right and right to left). Now why are they inserted, I'm reading up on. – delp Apr 23 '21 at 09:34
  • As aside, I believe the format should be `"dd/MM/yyyy HH:mm"` and you need to `Trim()` the value of `$date01` before parsing (may have invisible characters appended) – Theo Apr 23 '21 at 09:35
  • @delp - the return value from ```GetDetailsOf``` is a BSTR string, which contains formatting instructions (the LTR and RTL points) for locale-specific display. See this (google-cached) question on the msdn q&a site: http://webcache.googleusercontent.com/search?q=cache:6K2tCvPN1QsJ:https://social.msdn.microsoft.com/Forums/windows/en-US/77554c16-3050-448d-8455-b0aab3f6b2f4/datetaken-extended-file-attribute-returned-from-getdetailsof-method-has-embedded-whitespace?forum%3Dvbgeneral&hl=en&gl=uk&strip=1&vwsrc=0 (quote to follow)... – mclayton Apr 23 '21 at 09:56
  • ... "The characters that you are seeing (along with some others, such as nulls) are embedded in BSTRs to allow the calling function to correctly display the string for any locale. It includes such things as the left to right marker that you saw, so that the calling application knows that the characters must be grouped and displayed left to right. To convert from a BSTR to a different type of string, see: https://msdn.microsoft.com/en-us/library/ms235631.aspx". I can't find a *simple* way to convert it in PowerShell, so maybe just replace those special characters with an empty string... – mclayton Apr 23 '21 at 09:56
  • Thanks @mclayton. I've done some testing and this is messy (and bit like poc code) but works. $formatedDateString = $value -replace '[^\p{L}\p{Nd}\:\/\ ]', '' When I pass that into the ParseExact, it then works as expected. – delp Apr 23 '21 at 10:23
  • Great sleuthing, @mclayton, but that these control characters appear _multiple times_ within a given string is still mystifying (and BSTRs don't need conversion in .NET - these control chars. are a legitimate part of the _content_ of the string) - see https://stackoverflow.com/a/52597482/45375 – mklement0 Nov 05 '21 at 21:55

1 Answer


Merging together the answers from the comments: @mclayton suggested debugging the strings by dumping the numeric value of each character, like this:

    write-host( $tmp.ToCharArray() | % { [int] $_ })

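Running the same command on `$value` let me see the extra invisible characters: code points 8206 (U+200E, the left-to-right mark) and 8207 (U+200F, the right-to-left mark), which, as mentioned in the comments, are embedded in the string returned by `GetDetailsOf`.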
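
A slightly more descriptive variant of that check, as a minimal sketch: it prints each character's code point in hex together with its Unicode category, so invisible characters stand out as `Format`. The `$suspect` string is built by hand to imitate the values reported in the comments (an assumption standing in for the real `GetDetailsOf` result), and `Get-CharInfo` is a hypothetical helper name.

```powershell
# Hypothetical helper: list the code point and Unicode category of each character.
function Get-CharInfo {
    param([string] $Text)
    foreach ($ch in $Text.ToCharArray()) {
        [pscustomobject]@{
            CodePoint = 'U+{0:X4}' -f [int] $ch
            Category  = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($ch)
        }
    }
}

# Assumption: imitates the value from the comments (U+200E and U+200F embedded).
$lrm = [char] 0x200E
$rlm = [char] 0x200F
$suspect = "${lrm}20/${lrm}04/${lrm}2021 ${rlm}${lrm}14:04"
$plain   = '20/04/2021 14:04'

# A length mismatch is the quickest hint that something invisible is present.
"Lengths: $($suspect.Length) vs $($plain.Length)"

# Show only the invisible formatting characters.
Get-CharInfo $suspect | Where-Object Category -eq 'Format'
```

For the hand-built string above, the lengths are 21 and 16, and the filter lists the five embedded marks.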

Knowing what the extra characters are, I can use a regular expression to strip them from the string before passing it to ParseExact.

    $formattedDateString = $value -replace '[^\p{L}\p{Nd}\:\/\ ]', ''
delp
  • Turns out that even this original of [your duplicate](https://stackoverflow.com/q/69857468/45375) was itself a duplicate: `-replace '\p{Cf}'` should do - see [this answer](https://stackoverflow.com/a/52597482/45375). – mklement0 Nov 05 '21 at 21:57
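
The `\p{Cf}` suggestion can be sketched as follows. `\p{Cf}` matches the Unicode "Format" category, which includes U+200E and U+200F, so it removes only the invisible marks and leaves every other character untouched. The `$value` string here is hand-built to imitate the one from the question (an assumption, not real `GetDetailsOf` output), and passing `InvariantCulture` instead of `$null` is my own choice, so that the `/` and `:` separators are matched literally regardless of the machine's culture.

```powershell
# Assumption: hand-built stand-in for the string returned by GetDetailsOf.
$lrm = [char] 0x200E
$rlm = [char] 0x200F
$value = "${lrm}20/${lrm}04/${lrm}2021 ${rlm}${lrm}14:04"

# Remove every character in the Unicode "Format" (Cf) category.
$clean = $value -replace '\p{Cf}'

$format  = 'dd/MM/yyyy H:mm'
$culture = [System.Globalization.CultureInfo]::InvariantCulture
$date    = [System.DateTime]::ParseExact($clean, $format, $culture)

# Print in a fixed, culture-independent format.
$date.ToString('yyyy-MM-dd HH:mm', $culture)
```

Compared with the allow-list pattern in the answer, this only drops formatting characters, so it should not accidentally delete punctuation that a different date format might rely on.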