I wrote a tool (a PowerShell script, to be precise) that converts pictures in folders: it looks for all files with a certain extension (say, *.TIF) and converts them to JPEGs via ImageMagick. It then transfers some EXIF, IPTC and XMP information from the source image to the JPEG via exiftool:

# searching files (done before converting the files, so just listed for reproduction):
$WorkingFiles = @(Get-ChildItem -Path D:\MyPictures\Testfiles -Filter *.tif | ForEach-Object {
    [PSCustomObject]@{
        SourceFullName = $_.FullName
        JPEGFullName = $_.FullName -Replace 'tif$','jpg'
    }
})
# Then, converting is done. PowerShell will wait until every jpeg is successfully created.
# + + + + The problem occurs somewhere after this line + + + +
# Creating the exiftool process:
$psi = New-Object System.Diagnostics.ProcessStartInfo
$psi.FileName = '.\exiftool.exe'
$psi.Arguments = "-stay_open True -charset utf8 -@ -"
$psi.UseShellExecute = $false
$psi.RedirectStandardInput = $true
$psi.RedirectStandardOutput = $true
$psi.RedirectStandardError = $true
$exiftoolproc = [System.Diagnostics.Process]::Start($psi)

# creating the string argument for every file, then pass it over to exiftool:
for($i=0; $i -lt $WorkingFiles.length; $i++){
    [string]$ArgList = "-All:all=`n-charset`nfilename=utf8`n-tagsFromFile`n$($WorkingFiles[$i].SourceFullName)`n-EXIF:All`n-charset`nfilename=utf8`n$($WorkingFiles[$i].JPEGFullName)"
    # using -overwrite_original makes no difference
    # Also, just as good as above code:
    # [string]$ArgList = "-All:All=`n-EXIF:XResolution=300`n-EXIF:YResolution=300`n-charset`nfilename=utf8`n-overwrite_original`n$($WorkingFiles[$i].JPEGFullName)"

    $exiftoolproc.StandardInput.WriteLine("$ArgList`n-execute`n")
    # no difference using start-sleep:
    # Start-Sleep -Milliseconds 25
}
# close exiftool:
$exiftoolproc.StandardInput.WriteLine("-stay_open`nFalse`n")

# read StandardError and StandardOutput of exiftool, then print it:
[array]$outputerror = @($exiftoolproc.StandardError.ReadToEnd().Split("`r`n",[System.StringSplitOptions]::RemoveEmptyEntries))
[string]$outputout = $exiftoolproc.StandardOutput.ReadToEnd()
$outputout = $outputout -replace '========\ ','' -replace '\[1/1]','' -replace '\ \r\n\ \ \ \ '," - " -replace '{ready}\r\n',''
[array]$outputout = @($outputout.Split("`r`n",[System.StringSplitOptions]::RemoveEmptyEntries))

Write-Output "Errors:"
foreach($i in $outputerror){
    Write-Output $i
}
Write-Output "Standard output:"
foreach($i in $outputout){
    Write-Output $i
}

If you want to reproduce this but do not have (or want) that many files, there is a simpler way: let exiftool print its version number 600 times:

$psi = New-Object System.Diagnostics.ProcessStartInfo
$psi.FileName = '.\exiftool.exe'
$psi.Arguments = "-stay_open True -charset utf8 -@ -"
$psi.UseShellExecute = $false
$psi.RedirectStandardInput = $true
$psi.RedirectStandardOutput = $true
$psi.RedirectStandardError = $true
$exiftoolproc = [System.Diagnostics.Process]::Start($psi)

for($i=0; $i -lt 600; $i++){
    try{
        $exiftoolproc.StandardInput.WriteLine("-ver`n-execute`n")
        Write-Output "Success:`t$i"
    }catch{
        Write-Output "Failed:`t$i"
    }
}
# close exiftool:
try{
    $exiftoolproc.StandardInput.WriteLine("-stay_open`nFalse`n")
}catch{
    Write-Output "Could not close exiftool!"
}

[array]$outputerror = @($exiftoolproc.StandardError.ReadToEnd().Split("`r`n",[System.StringSplitOptions]::RemoveEmptyEntries))
[array]$outputout = @($exiftoolproc.StandardOutput.ReadToEnd().Split("`r`n",[System.StringSplitOptions]::RemoveEmptyEntries))

Write-Output "Errors:"
foreach($i in $outputerror){
    Write-Output $i
}
Write-Output "Standard output:"
foreach($i in $outputout){
    Write-Output $i
}

As far as I could test, everything goes well as long as I stay below 115 files. Above that, the 114th JPEG still gets proper metadata, but exiftool stops working after that one: it idles, and so does my script. I can reproduce this with different files, paths, and exiftool commands.

Neither StandardOutput nor StandardError shows any irregularities, even with exiftool's -verbose flag. Of course they would not, as I have to kill exiftool to get them to show up.

Running ISE's / VSCode's debugger shows nothing. Exiftool's window (only showing up when debugging) shows nothing.

Is there some hard limit on commands run with System.Diagnostics.Process, is this a problem with exiftool, or is this simply due to my inability to use anything beyond the most basic PowerShell cmdlets? Or maybe the better question would be: how can I properly debug this?


PowerShell is 5.1; exiftool is 10.80 (production) through 10.94 (latest).

flolilo
  • My best advice is to write a very short sample script that contains only the minimum amount of code needed to reproduce the problem. – Bill_Stewart May 03 '18 at 19:48
  • I would add to what Bill_Stewart suggested. Write a script to simply edit image via exiftool only. See if that fails after some number of images? If not, then the issue is elsewhere in your code. Perhaps you are running out of space in your ImageMagick tmp directory. – fmw42 May 03 '18 at 19:55
  • @fmw42 I doubt that I'm running out of space - the thumbs are 100 KiB in size and I have an NTFS-formatted HDD with around 900 GiB of free space. ImageMagick works flawlessly (as per File Explorer, IrfanView and Photoshop) and is already done when exiftool kicks in. – flolilo May 03 '18 at 20:14
  • @Bill_Stewart I reduced it and tried it without `-tagsfromfile` - still, it is the same. If somebody wants to know: I parsed the commands into the console before handing them over to exiftool - they all look equally correct. – flolilo May 03 '18 at 20:23
  • I recommend using the debugger in the ISE and embark on a debugging expedition. – Bill_Stewart May 03 '18 at 20:51
  • You're only running a single command; writing to stdin isn't launching more programs. Running `c:\windows\system32\more.com` instead of exiftool, writing to stdin, and reading from stdout, I got a loop of 50,000 write/reads, but it did start going weird after that, as if it was locking up rather than slowing down. – TessellatingHeckler May 04 '18 at 02:53
  • @TessellatingHeckler I know that I send commands to the same exiftool-instance - I'm trying to reduce overhead this way, as exiftool needs much longer to start than to work. So you have reproduced the problem with different code? – flolilo May 04 '18 at 12:32
  • @Bill_Stewart ISE's debugger shows **nothing**. It hangs in the same way as ever, without any information as to why that is. Added that information in my question. – flolilo May 04 '18 at 14:43
  • It sounds like you will need to run your tool in batches then. – Bill_Stewart May 04 '18 at 15:04
  • If anyone still cares/has any ideas, I think I found the simplest way to get a reproducible failure: let exiftool print its version number XYZ (> 500) times. Added the code in my question. – flolilo May 04 '18 at 15:14
  • Then I would go ahead and redesign your script to operate in batches, and post your answer once you're done. – Bill_Stewart May 04 '18 at 16:49

1 Answer

After messing around with different variants of $ArgList, I found that it makes no difference which file commands I use, but that commands producing less StdOut (like -ver) allowed more iterations before the hang. Therefore, I took an educated guess that the output buffer is the culprit.

As per Mark Byers' answer to "ProcessStartInfo hanging on “WaitForExit”? Why?":

> The problem is that if you redirect StandardOutput and/or StandardError the internal buffer can become full. [...]
>
> The solution is to use asynchronous reads to ensure that the buffer doesn't get full.
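
To illustrate what such an asynchronous read can look like, here is a minimal sketch. It does not use exiftool: as a stand-in, it starts a second `powershell.exe` (an assumption; use `pwsh` on PowerShell 7) that prints 20000 lines, and it drains both pipes with `StreamReader.ReadToEndAsync()` rather than with event handlers. All variable names are made up for the example.

```powershell
# Stand-in child process (assumption): a second PowerShell that prints 20000 lines,
# which is far more than a redirected pipe can hold unread.
$psi = New-Object System.Diagnostics.ProcessStartInfo
$psi.FileName = 'powershell.exe'
$psi.Arguments = '-NoProfile -Command "1..20000"'
$psi.UseShellExecute = $false
$psi.RedirectStandardOutput = $true
$psi.RedirectStandardError = $true
$proc = [System.Diagnostics.Process]::Start($psi)

# Start draining both pipes right away; each task completes when its stream closes.
$stdOutTask = $proc.StandardOutput.ReadToEndAsync()
$stdErrTask = $proc.StandardError.ReadToEndAsync()

# Waiting is safe now, because the child can never block on a full pipe.
$proc.WaitForExit()

$stdOutLines = @($stdOutTask.Result.Split("`r`n".ToCharArray(), [System.StringSplitOptions]::RemoveEmptyEntries))
$stdErrText = $stdErrTask.Result
Write-Output "Received $($stdOutLines.Count) lines on stdout"
```

If the two `ReadToEndAsync()` calls are replaced by `ReadToEnd()` calls placed after `WaitForExit()`, the child is likely to block on its full stdout pipe and never exit, which matches the symptom described in the question.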

Then, it was just a matter of searching for the right things. I found that Alexander Obersht's answer to "How to capture process output asynchronously in powershell?" provides almost everything that I needed.

The script now looks like this:

# searching files (done before converting the files, so just listed for reproduction):
$WorkingFiles = @(Get-ChildItem -Path D:\MyPictures\Testfiles -Filter *.tif | ForEach-Object {
    [PSCustomObject]@{
        SourceFullName = $_.FullName
        JPEGFullName = $_.FullName -Replace 'tif$','jpg'
    }
})
# Then, converting is done. PowerShell will wait until every jpeg is successfully created.
# Creating the exiftool process:
$psi = New-Object System.Diagnostics.ProcessStartInfo
$psi.FileName = '.\exiftool.exe'
$psi.Arguments = "-stay_open True -charset utf8 -@ -"
$psi.UseShellExecute = $false
$psi.RedirectStandardInput = $true
$psi.RedirectStandardOutput = $true
$psi.RedirectStandardError = $true

# + + + + NEW STUFF (1/2) HERE: + + + +
# Creating process object.
$exiftoolproc = New-Object -TypeName System.Diagnostics.Process
$exiftoolproc.StartInfo = $psi

# Creating string builders to store stdout and stderr.
$exiftoolStdOutBuilder = New-Object -TypeName System.Text.StringBuilder
$exiftoolStdErrBuilder = New-Object -TypeName System.Text.StringBuilder
# Adding event handlers for stdout and stderr.
$exiftoolScriptBlock = {
    if (-not [String]::IsNullOrEmpty($EventArgs.Data)){
        $Event.MessageData.AppendLine($EventArgs.Data)
    }
}
$exiftoolStdOutEvent = Register-ObjectEvent -InputObject $exiftoolproc -Action $exiftoolScriptBlock -EventName 'OutputDataReceived' -MessageData $exiftoolStdOutBuilder
$exiftoolStdErrEvent = Register-ObjectEvent -InputObject $exiftoolproc -Action $exiftoolScriptBlock -EventName 'ErrorDataReceived' -MessageData $exiftoolStdErrBuilder

[Void]$exiftoolproc.Start()
$exiftoolproc.BeginOutputReadLine()
$exiftoolproc.BeginErrorReadLine()
# + + + + END OF NEW STUFF (1/2) + + + +

# creating the string argument for every file, then pass it over to exiftool:
for($i=0; $i -lt $WorkingFiles.length; $i++){
    [string]$ArgList = "-All:all=`n-charset`nfilename=utf8`n-tagsFromFile`n$($WorkingFiles[$i].SourceFullName)`n-EXIF:All`n-charset`nfilename=utf8`n$($WorkingFiles[$i].JPEGFullName)"
    # using -overwrite_original makes no difference
    # Also, just as good as above code:
    # [string]$ArgList = "-All:All=`n-EXIF:XResolution=300`n-EXIF:YResolution=300`n-charset`nfilename=utf8`n-overwrite_original`n$($WorkingFiles[$i].JPEGFullName)"

    $exiftoolproc.StandardInput.WriteLine("$ArgList`n-execute`n")
}

# + + + + NEW STUFF (2/2) HERE: + + + +
# close exiftool:
$exiftoolproc.StandardInput.WriteLine("-stay_open`nFalse`n")
$exiftoolproc.WaitForExit()
# Unregistering events to retrieve process output.
Unregister-Event -SourceIdentifier $exiftoolStdOutEvent.Name
Unregister-Event -SourceIdentifier $exiftoolStdErrEvent.Name

# read StandardError and StandardOutput of exiftool, then print it:
[array]$outputerror = @($exiftoolStdErrBuilder.ToString().Trim().Split("`r`n",[System.StringSplitOptions]::RemoveEmptyEntries))
[string]$outputout = $exiftoolStdOutBuilder.ToString().Trim() -replace '========\ ','' -replace '\[1/1]','' -replace '\ \r\n\ \ \ \ '," - " -replace '{ready}\r\n',''
[array]$outputout = @($outputout.Split("`r`n",[System.StringSplitOptions]::RemoveEmptyEntries))
# + + + + END OF NEW STUFF (2/2) + + + +

Write-Output "Errors:"
foreach($i in $outputerror){
    Write-Output $i
}
Write-Output "Standard output:"
foreach($i in $outputout){
    Write-Output $i
}

I can confirm that it works for many, many files (at least 1600).
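
Another way to keep the pipe from filling up is to read exiftool's answer after every command instead of collecting everything at the end. With -stay_open, exiftool prints {ready} to stdout once an -execute has been processed, so the reader can stop at that marker. The following is only a sketch under stated assumptions, not a drop-in replacement for the script above: `Read-UntilReady` is a hypothetical helper, exiftool.exe is assumed to sit in the current directory, and stderr is deliberately left un-redirected so that no second pipe is left unread.

```powershell
# Hypothetical helper: emits lines from $Reader until exiftool's "{ready}" marker
# (or until the end of the stream, if the process exited early).
function Read-UntilReady {
    param([System.IO.TextReader]$Reader)
    while ($null -ne ($line = $Reader.ReadLine()) -and $line -ne '{ready}') {
        $line
    }
}

# Only runs if exiftool.exe is found in the current directory (assumption).
if (Test-Path -Path '.\exiftool.exe') {
    $psi = New-Object System.Diagnostics.ProcessStartInfo
    $psi.FileName = (Resolve-Path -Path '.\exiftool.exe').Path
    $psi.Arguments = '-stay_open True -@ -'
    $psi.UseShellExecute = $false
    $psi.RedirectStandardInput = $true
    $psi.RedirectStandardOutput = $true
    # stderr is not redirected here, so there is no second pipe that could fill up.
    $exiftoolproc = [System.Diagnostics.Process]::Start($psi)

    $versions = for ($i = 0; $i -lt 600; $i++) {
        $exiftoolproc.StandardInput.WriteLine("-ver`n-execute")
        # Drain this command's output before sending the next command.
        Read-UntilReady -Reader $exiftoolproc.StandardOutput
    }

    # close exiftool:
    $exiftoolproc.StandardInput.WriteLine("-stay_open`nFalse")
    $exiftoolproc.WaitForExit()
    Write-Output "Collected $(@($versions).Count) version lines"
}
```

The trade-off is that this version is synchronous: the script waits for each command to finish before sending the next one, but it also knows which output belongs to which file.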

flolilo