2

I try to write an Excel VBA script which fetches some information (version and revision date) out of a binary FrameMaker file (*.fm).

Following sub opens the *.fm file and writes the first 25 lines (the informationen needed is into this first 25 lines) into a variable.

Sub fetchDate()
    Dim fso As Object
    Dim fmFile As Object

    Dim fileString As String
    Dim fileName As String
    Dim matchPattern As String
    Dim result As String
    Dim i As Integer
    Dim bufferString As String

    Set fso = CreateObject("Scripting.FileSystemObject")

    fileName = "C:\FrameMaker-file.fm"

    Set fmFile = fso.OpenTextFile(fileName, ForReading, False, TristateFalse)
    matchPattern = "Version - Date.+?(\d{1,2})[\s\S]*Rev.+?(\d{1,2})"

    fileString = ""
    i = 1
    Do While i <= 25
        bufferString = fmFile.ReadLine
        fileString = fileString & bufferString & vbNewLine
        i = i + 1
    Loop
    fmFile.Close

    'fileString = Replace(fileString, matchPattern, "")
    result = regExSearch(fileString, matchPattern)

    MsgBox result

    Set fso = Nothing
    Set fmFile = Nothing
End Sub

The regex functions looks like this:

Function regExSearch(ByVal strInput As String, ByVal strPattern As String) As String
    Dim regEx As New RegExp

    Dim strReplace As String
    Dim result As String
    Dim match As Variant
    Dim matches As Variant
    Dim subMatch As Variant

    Set regEx = CreateObject("VBScript.RegExp")

    If strPattern <> "" Then
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        If regEx.test(strInput) Then
            Set matches = regEx.Execute(strPattern)

            For Each match In matches
                If match.SubMatches.Count > 0 Then
                    For Each subMatch In match.SubMatches
                        Debug.Print "match:" & subMatch
                    Next subMatch
                End If
            Next match

            regExSearch = result
        Else
            regExSearch = "no match"
        End If
    End If

    Set regEx = Nothing
End Function

Problem 1:

The content of the binary *.fm file which is saved in the variable "fileString" differs in every run, although the *.fm file stays the same.

Here are a few examples of the first three lines from different runs which are saved in "fileString":

example 1

<MakerFile 12.0>


Aaÿ No.009.xxx  ????          /tEXt     ??????

example 2

<MakerFile 12.0>


Aaÿ  `      ? ????          /tEXt ?     c ? E     ? ????a A ? ?      ? ? ? d??????? ?        Heading ????????????A???????A

As you can see example 1 differs from example 2 although it was the very same VBA code and the very same *.fm file.

Problem 2:

It is also a big problem that the regex search string from "matchPattern" is written randomly into my "fileString". Here is a screenshot from the debug console:

parts of value of matchPattern

How can this be? Any suggestions or ideas to fix this problem?

I'm using:

MS Office Professional Plus 2010

VBA reference for regex: Microsoft VBScript Regular Expressions 5.5

Thank you very much in advance!

Regards, Andy

/edit March 12th, 2018:

Here is a sample *.fm file: sample file If you open it with notepad, you can see some information like "Version - DateVersion 4 – 2018/Feb/07" and "Rev02 - 2018/Feb/21" in plain text. I want to fetch this information with a regular expression.

gwc
  • 53
  • 4
  • 3
    FSO and binary files are not friends. Go for a different technique, like ADODB.Streams – Sam Mar 09 '18 at 15:21
  • @Sam That made me chuckle :-) – QHarr Mar 09 '18 at 15:27
  • 2
    Can you upload a sample data (`.fm`) file that we can use to reproduce your problem. I suspect there is a better coding method. In particular, as @Sam implies, FSO is good for ASCII text files; not so good for other types of files and may not be the best choice. – Ron Rosenfeld Mar 09 '18 at 20:16
  • 1
    n a sample .fm file, as well as in your screenshot, I do not see the string version or rev (except possibly for what you say your regex inserted) which is what your regex is searching for. So what, exactly, do you mean by those terms? – Ron Rosenfeld Mar 10 '18 at 01:58
  • Thanks for answering! I can provide you more infos and a sample file on Monday when I'm back at work. – gwc Mar 10 '18 at 11:49
  • I just edited a sample file and additional information into the original post. – gwc Mar 12 '18 at 09:01

2 Answers2

1

I found a solution with using ADODB.streams. This works fine:

Sub test_binary()
    Dim regEx As Object

    Dim buffer As String
    Dim filename As String
    Dim matchPattern As String
    Dim result As String

    Set regEx = CreateObject("VBScript.RegExp")

    filename = "C:\test.fm"

    With CreateObject("ADODB.Stream")
        .Open
        .Type = 2
        .Charset = "utf-8"
        .LoadFromFile filename
        buffer = .Readtext(10000)
        .Close
    End With

    matchPattern = "Version - Date.+?(\d{1,2})[\s\S]*Rev.+?(\d{1,2})"

    result = regExSearch(buffer, matchPattern)

    MsgBox result
End Sub

regex function:

Function regExSearch(ByVal strInput As String, ByVal strPattern As String) As String
    Dim regEx As New RegExp

    Dim result As String
    Dim match As Variant
    Dim matches As Variant
    Dim subMatch As Variant

    Set regEx = CreateObject("VBScript.RegExp")

    If strPattern <> "" Then
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        If regEx.test(strInput) Then
            Set matches = regEx.Execute(strInput)

            result = ""
            For Each match In matches
                If match.SubMatches.Count > 0 Then
                    For Each subMatch In match.SubMatches
                        If Len(result) > 0 Then
                            result = result & "||"
                        End If
                        result = result & subMatch
                    Next subMatch
                End If
            Next match

            regExSearch = result
        Else
            regExSearch = "err_nomatch"
        End If
    End If

    Set regEx = Nothing
End Function

It is important to open the *.fm file as text file (.Type = 2) and set the charset to "utf-8". Otherwise I wont have plain text for my regular expression to read.

Thank you very much for bringing me on the right way!

gwc
  • 53
  • 4
0

Just save the FM file as MIF. It is a text encoding of the FM file and can be converted back and forth with no loss of information.

ReedError
  • 1
  • 1