2

I have an Office 2007 .XLSX file containing over 5000 records like the below (a single cell with multiple rows of text). The issue: on an adjacent cell, put count of incidents from the cell. Looking at the cell data for A1 you can see 3 incidents:

Cell A1:

1/15/2013 1:30:11 AM Userx
Had to reboot system
1/15/2013 1:32:11 AM Userx
System running finished rebooting and appears to be working
11/15/2013 12:30:11 AM Userx
System hung again

The problem is that the date value isn't consistent. The day, month, and hour can be single or double digit, but they are always noted on a new line.

My code solution was to take the cell, split it on line break, trim everything 5 characters after the last ':' and evaluate the the result against my regular expression. After that, some basic tally and text insert to adjacent cell.

The below is an example of how the function would be called.

'calling function from another source:

thecount = CountOfDateValues(Range("a1").Value) 'get count
Range("b1").Value = thecount 'put count to adjacent cell

Is there any code that will take the string value and return a count of matches to the regular expression?

Eric Leschinski
  • 146,994
  • 96
  • 417
  • 335
James Burt
  • 25
  • 1
  • 1
  • 6
  • 1
    since you want to share your knowledge to a specific problem you should take the sollution part from the question and post it as an answer (and accept it if it's the best sollution) – CSᵠ Mar 21 '13 at 17:55
  • Hi @james-burt , what is the point you'd like to know about this issue? In fact, it has both the question and the solution, and I have not seen the question you're making here... Could you clarify this post, please? – Octopus Mar 21 '13 at 17:56
  • 1
    if this is always the same format, couldn't you just count the (number of carriage returns + 1)/2 – scott Mar 21 '13 at 18:17
  • @ka: I planned and tried to when I first posted, however I was limited from answering my question for 8 hours due to stackoverflow policies and had to wait to answer it. I'll be marking it answered here in a moment. – James Burt Mar 22 '13 at 20:10
  • @Octopus: as stated, there was no question, the post was informational as despite hours of searching, I could find no solution relevant to my issue. – James Burt Mar 22 '13 at 20:11
  • @Scott: the blob of data is not "one line time stamp, one line note" as seen in the sample provided, an actual entry is 10-100's of lines of case note data containing any number of chr(10)'s per entry the only pattern is that each new entry begins on a new line starting with date time am/pm. – James Burt Mar 22 '13 at 20:11

2 Answers2

4

You can also include newlines in the Pattern expression by using \n. this way, you don't have to split the text in an array:

Private Function String_CountRegExp_Debug()

    'Input of the test text
    Dim TestText As String
    TestText = "1/15/2013 1:30:11 AM Userx" & vbNewLine & _
            "Had to reboot system" & vbNewLine & _
            "1/15/2013 1:32:11 AM Userx" & vbNewLine & _
            "System running finished rebooting and appears to be working" & vbNewLine & _
            "11/15/2013 12:30:11 AM Userx" & vbNewLine & _
            "System hung again"

    'Input of the Pattern
    Dim RE_Pattern As String
    RE_Pattern = "(\d{1,2})\/(\d{1,2})\/(\d{4})\s(\d{1,2}):(\d{1,2}):(\d{1,2})\s([A,P]M).*\n"

    Debug.Print String_CountRegExp(TestText, RE_Pattern)

End Function

Public Function String_CountRegExp(Text As String, Pattern As String) As Long
'Count the number of Pattern matches in a string.

    'Set up regular expression object
    Dim RE As New RegExp
    RE.Pattern = Pattern
    RE.Global = True
    RE.IgnoreCase = True
    RE.MultiLine = True
    'Retrieve all matches
    Dim Matches As MatchCollection
    Set Matches = RE.Execute(Text)
    'Return the corrected count of matches
    String_CountRegExp = Matches.Count

End Function
sir KitKat
  • 177
  • 1
  • 2
  • 11
0

Below is the VBA code of a function that takes in a string value and returns count of matches to the regular expression. I hope it is of use for someone.

Function CountOfDateValues(thetext)

Dim data() As String 
Dim yourInput As String
yourInput = thetext 
Dim TheSplitter As String
TheSplitter = Chr(10) 'the character that represents a line break

data = Split(yourInput, TheSplitter ) ' creates an array of strings for each line in the cell
Dim re
Set re = CreateObject("VBscript.regexp")
'regular expression that matches ##/##/#### ##:##:## ##
re.Pattern = "(?=\d)^(?:(?!(?:10\D(?:0?[5-9]|1[0-4])\D(?:1582))|(?:0?9\D(?:0?[3-9]|1[0-3])\D(?:1752)))((?:0?[13578]|1[02])|(?:0?[469]|11)(?!\/31)(?!-31)(?!\.31)|(?:0?2(?=.?(?:(?:29.(?!000[04]|(?:(?:1[^0-6]|[2468][^048]|[3579][^26])00))(?:(?:(?:\d\d)(?:[02468][048]|[13579][26])(?!\x20BC))|(?:00(?:42|3[0369]|2[147]|1[258]|09)\x20BC))))))|(?:0?2(?=.(?:(?:\d\D)|(?:[01]\d)|(?:2[0-8])))))([-.\/])(0?[1-9]|[12]\d|3[01])\2(?!0000)((?=(?:00(?:4[0-5]|[0-3]?\d)\x20BC)|(?:\d{4}(?!\x20BC)))\d{4}(?:\x20BC)?)(?:$|(?=\x20\d)\x20))?((?:(?:0?[1-9]|1[012])(?::[0-5]\d){0,2}(?:\x20[aApP][mM]))|(?:[01]\d|2[0-3])(?::[0-5]\d){1,2})?$"
re.Global = True

Dim t As String
Dim theCount As Integer
theCount = 0
For i = LBound(data) To UBound(data) 'from first item in array to last item in array

        For Each Match In re.Execute(Left(data(i), InStrRev(data(i), ":") + 5))
            'from start of string to 5 characters past the last ':' of string
            theCount = theCount + 1
        Next
    Next

CountOfDateValues = theCount 

End Function

Referencing urls:

MS Access 2003 VBA String Split on Line Break

http://sourceforge.net/projects/regexbuilder/files/regexbuilder/1.4.0/

This tool made testing my regular expression against various date formats remarkably easy.

http://regexlib.com/DisplayPatterns.aspx?cattabindex=4&categoryid=5&p=2

I was able to save a lot of time crafting a regular expression by using a precrafted one from here. Sadly did not learn much by do1ing so, but I believe I saved a lot of time on this 'we need it done now' request.

*Note: There is a window for a false positives if someone starts their worklog note with a timetamp, I noted this to the customer and they were fine with it.

Community
  • 1
  • 1
James Burt
  • 25
  • 1
  • 1
  • 6