I don't really know how to go about this. I can convert one TIFF to one PDF, and I can convert all the TIFFs in one directory into one PDF. What I want to do is convert groups of TIFFs based on their last-write (modified) time, creation time, or last-access time.

For example, if I have 7 TIFFs in one directory where 3 share one timestamp and the other 4 share a different timestamp, I want to merge the 3 into one PDF and then merge the other 4 into another PDF. I'm stuck on how to approach this. Do I need to create a list of all the files and then group them, or can I merge the first 3, move on to the next group, merge those, and so on using a For Each?

The code below is what I'm using to collect the five most recently accessed files:

Dim dir As New DirectoryInfo(tiffPath)
' Take the five most recently accessed TIFF files.
Dim files As List(Of FileInfo) =
    dir.GetFiles("*.tif").
        OrderByDescending(Function(fc) fc.LastAccessTime).
        Take(5).
        ToList()

For Each lfi As FileInfo In files
    MsgBox(lfi.Name)
Next
Andrew Morton
Smiles
  • I have removed the iText tag because, as you write yourself, you already have the iText part of your question covered. To help you I added some tags that seemed more relevant. – Amedee Van Gasse Aug 16 '17 at 17:36
  • How exactly matched are the timestamps for each group of files that you would consider to be the same? Could they be within one minute of each other, within an hour, a second...? – Andrew Morton Aug 16 '17 at 18:19
  • The date and time stamps are exactly the same to the second. I don't know how it does that, but another program provides the TIFFs for us. – Smiles Aug 16 '17 at 19:00

1 Answer

It looks like it would be sufficient to bunch files together if their timestamps differ by less than some timespan.

So, if you order the files by their .LastWriteTimeUtc then you can iterate over that list and check how long it was between one and the previous one. If the gap is small then add it to the current list, otherwise start a new list.

I tested the following code on a directory with a random selection of files, so 30 days was an appropriate timespan for that test. For your use, two or three seconds would probably be a good choice:

Option Infer On
Option Strict On

Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module Module1

    ''' <summary>
    ''' Get FileInfos bunched by virtue of having less than some time interval between their consecutive LastWriteTimeUtc when ordered by that.
    ''' </summary>
    ''' <param name="srcDir">Directory to get files from.</param>
    ''' <param name="adjacencyLimit">The allowable timespan to count as in the same bunch.</param>
    ''' <returns>A List(Of List(Of FileInfo)). Within each inner list, consecutive LastWriteTimeUtc values differ by less than adjacencyLimit. The outer list is empty if the directory has no files.</returns>
    Function GetTimeAdjacentFiles(srcDir As String, adjacencyLimit As TimeSpan) As List(Of List(Of FileInfo))
        Dim di = New DirectoryInfo(srcDir)
        ' Materialize the sorted sequence once; indexing a lazy query would re-sort on every access.
        Dim fis = di.GetFiles().OrderBy(Function(fi) fi.LastWriteTimeUtc).ToList()

        If fis.Count = 0 Then
            ' Return an empty list rather than Nothing so callers can iterate safely.
            Return New List(Of List(Of FileInfo))
        End If

        Dim bins As New List(Of List(Of FileInfo))
        Dim thisBin As New List(Of FileInfo) From {(fis(0))}

        For i = 1 To fis.Count - 1
            If fis(i).LastWriteTimeUtc - fis(i - 1).LastWriteTimeUtc < adjacencyLimit Then
                thisBin.Add(fis(i))
            Else
                bins.Add(thisBin)
                thisBin = New List(Of FileInfo) From {fis(i)}
            End If
        Next

        bins.Add(thisBin)

        Return bins

    End Function

    Sub Main()
        Dim src = "E:\temp"
        'TODO: choose a suitable TimeSpan, e.g. TimeSpan.FromSeconds(3)
        Dim adjacencyLimit = TimeSpan.FromDays(30)
        Dim x = GetTimeAdjacentFiles(src, adjacencyLimit)

        For Each b In x
            Console.WriteLine("***********")
            For Each fi In b
                'TODO: merge each fi into a PDF.
                Console.WriteLine(fi.Name)
            Next
        Next

        Console.ReadLine()

    End Sub

End Module
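
For the `'TODO: merge each fi into a PDF` step, one possible way to hand each bunch to merging code you already have is sketched below. Everything here is an assumption for illustration: `MergeEachBunch`, `PdfPathForBunch`, the `outDir` parameter, and the `mergeTifsToPdf` delegate are hypothetical names, and the delegate simply stands in for whatever routine you currently use to turn several TIFFs into one PDF. The output file is named after the earliest timestamp in the bunch, which assumes the adjacency limit is at least one second so that two bunches cannot share a name.

```vb
Option Infer On
Option Strict On

Imports System.Collections.Generic
Imports System.Globalization
Imports System.IO
Imports System.Linq

Module MergeBunchesSketch

    ' Build an output path inside outDir whose file name is the bunch's
    ' earliest timestamp formatted as yyyyMMdd_HHmmss, plus ".pdf".
    Function PdfPathForBunch(outDir As String, earliestUtc As DateTime) As String
        Dim name = earliestUtc.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture) & ".pdf"
        Return Path.Combine(outDir, name)
    End Function

    ' Call mergeTifsToPdf once per non-empty bunch.
    ' mergeTifsToPdf is a hypothetical stand-in for your existing merge routine:
    ' it receives the full paths of the TIFFs and the path of the PDF to create.
    Sub MergeEachBunch(bunches As IEnumerable(Of List(Of FileInfo)),
                       outDir As String,
                       mergeTifsToPdf As Action(Of IEnumerable(Of String), String))
        For Each bunch In bunches
            If bunch.Count = 0 Then Continue For
            Dim earliest = bunch.Min(Function(fi) fi.LastWriteTimeUtc)
            Dim tifPaths = bunch.Select(Function(fi) fi.FullName).ToList()
            mergeTifsToPdf(tifPaths, PdfPathForBunch(outDir, earliest))
        Next
    End Sub

End Module
```

With that in place, `Main` could call `MergeEachBunch(x, outputFolder, AddressOf YourExistingMerge)`, where `outputFolder` and `YourExistingMerge` are again placeholders for your own values.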

I suggest two or three seconds because if the files have ever been stored on a FAT filesystem (e.g. FAT32, as is often used on USB memory sticks and older disk drives), the last-write timestamp will only have a resolution of two seconds.
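
Since the comments say the timestamps within a batch match exactly to the second, a simpler alternative to the adjacency check might be to group on the timestamp itself. The sketch below is not the method used above: it truncates `LastWriteTimeUtc` to whole seconds and uses LINQ's `GroupBy`. The names `TruncateToSecond` and `GroupTifsByWriteTime` are made up for this example, and it only looks at `*.tif` files.

```vb
Option Infer On
Option Strict On

Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module GroupByTimestampSketch

    ' Drop the sub-second part of a timestamp, keeping its DateTimeKind.
    Function TruncateToSecond(t As DateTime) As DateTime
        Return New DateTime(t.Ticks - (t.Ticks Mod TimeSpan.TicksPerSecond), t.Kind)
    End Function

    ' Group the *.tif files in srcDir by LastWriteTimeUtc truncated to the second.
    ' Groups are returned oldest first; files within a group are ordered by name.
    Function GroupTifsByWriteTime(srcDir As String) As List(Of List(Of FileInfo))
        Return New DirectoryInfo(srcDir).GetFiles("*.tif").
            GroupBy(Function(fi) TruncateToSecond(fi.LastWriteTimeUtc)).
            OrderBy(Function(g) g.Key).
            Select(Function(g) g.OrderBy(Function(fi) fi.Name).ToList()).
            ToList()
    End Function

    Sub Main()
        For Each bunch In GroupTifsByWriteTime("E:\temp")
            Console.WriteLine("***********")
            For Each fi In bunch
                Console.WriteLine(fi.Name)
            Next
        Next
    End Sub

End Module
```

The trade-off is that exact grouping would split a batch if the program writing the TIFFs ever straddled a second boundary, whereas the adjacency approach tolerates that, so the adjacency version is probably the safer default.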

Andrew Morton
  • This worked for me, thank you Andrew. I tried to mark the answer as accepted, but I do not have enough reputation points. – Smiles Aug 17 '17 at 17:06
  • @ElNottoWorry You're welcome :) I think you *might* need to wait some time (maybe 24 or 48 hours) before you can accept an answer - until you get sufficient rep. – Andrew Morton Aug 17 '17 at 17:27