1

I have this VBA module in my Excel spreadsheet that attempts to clean up the date data, which contains various issues with text being combined with date information. Here is my main load function:

Public lstrow As Long, strDate As Variant, stredate As Variant
Sub importbuild()
lstrow = Worksheets("Data").Range("G" & Rows.Count).End(xlUp).Row

Function DateOnlyLoad(col As String, col2 As String, colcode As String)

Dim i As Long, j As Long, k As Long

j = Worksheets("CI").Range("A" & Rows.Count).End(xlUp).Row + 1
k = Worksheets("Error").Range("A" & Rows.Count).End(xlUp).Row + 1

For i = 2 To lstrow

strDate = spacedate(Worksheets("Data").Range(col & i).Value)
stredate = spacedate(Worksheets("Data").Range(col2 & i).Value)

If (Len(strDate) = 0 And (col2 = "NA" Or Len(stredate) = 0)) Or InStr(1, 
UCase(Worksheets("Data").Range(col & i).Value), "EXP") > 0 Then
 GoTo EmptyRange

Else

Worksheets("CI").Range("A" & j & ":C" & j).Value = 
 Worksheets("Data").Range("F" & i & ":H" & i).Value
Worksheets("CI").Range("D" & j).Value = colcode
Worksheets("CI").Range("E" & j).Value = datecleanup(strDate)
'Worksheets("CI").Range("L" & j).Value = dateclean(strDate)
Worksheets("CI").Range("F" & j).Value = strDate

If col2 <> "NA" Then
    If IsEmpty(stredate) = False Then
        Worksheets("CI").Range("F" & j).Value = datecleanup(stredate)
    End If
End If
j = j + 1

End If

EmptyRange:

Next i

End Function

datecleanup function:

Function datecleanup(inputdate As Variant) As Variant

If Len(inputdate) = 0 Then
 inputdate = "01/01/1901"
Else
  If Len(inputdate) = 4 Then
    inputdate = "01/01/" & inputdate
  Else
    If InStr(1, inputdate, ".") Then
        inputdate = Replace(inputdate, ".", "/")
    End If

 End If
End If

datecleanup = Split(inputdate, Chr(32))(0)

Sample Output:

 Column A   Column B      Column C     Column D    Column E    Column F
  125156    Wills, C     11/8/1960     MMR1         MUMPS       MUMPS TITER 02/26/2008 POSITIVE     
  291264    Balti, L     09/10/1981    MMR1        (blank)      Measles - 11/10/71 Rubella 
  943729    Barnes, B    10/10/1965    MMR1         MUMPS       MUMPS TITER 10/08/2008 POSITIVE

The Split separates the date from the subsequent text and this works fine, however if there is text that occurs before the date then the output contains the first part of the text. I would like to get only the date (if it exists) from the string and display that, regardless of where it falls in the string. Below are sample results: Column E is the output from the Split logic, Column F is the entire string that is being evaluated from the other worksheet.

Desired Output from above examples: (Column E has correct dates extracted)

Column A   Column B      Column C     Column D    Column E        Column F
  125156    Wills, C     11/8/1960     MMR1       02/26/2008      MUMPS TITER 02/26/2008 POSITIVE       
  291264    Balti, L     09/10/1981    MMR1       11/10/71        Measles - 11/10/71 Rubella 
  943729    Barnes, B    10/10/1965    MMR1       10/08/2008      MUMPS TITER 10/08/2008 POSITIVE

What else can I add into my datecleanup function to further refine this? Thanks in advance!

Community
  • 1
  • 1
JBinson88
  • 113
  • 1
  • 14

1 Answers1

3

Avoiding a regex, such as in the way suggested in comments is usually a good idea, but in for a penny, in for a pound:

① Use a regex mm/dd/yyyy

(0[1-9]|1[012])[- \/.](0[1-9]|[12][0-9]|3[01])[- \/.](19|20)[0-9]{2}

That pattern comes from ipr101's answer, and proposes a good regex for validating an actual date for mm/dd/yyyy. I have adjusted to correctly escape a couple of characters.

Regex

You would need to adjust if can be less digits or different format. Some examples given below.

You could use the function below as:

Worksheets("CI").Range("F" & j).Value = RemoveChars(datecleanup(stredate))

Example test:

Option Explicit

Public Sub test()
    Debug.Print RemoveChars("Measles - 11/10/1971 Rubella")
End Sub

Public Function RemoveChars(ByVal inputString As String) As String

    Dim regex As Object, tempString As String
    Set regex = CreateObject("VBScript.RegExp")

    With regex
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)[0-9]{2}"
    End With

    If regex.test(inputString) Then
        RemoveChars = regex.Execute(inputString)(0)
    Else
        RemoveChars = inputString
    End If

End Function

② For dd/mm/yyyy use:

(0[1-9]|[12][0-9]|3[01])[- \/.](0[1-9]|1[012])[- \/.](19|20)[0-9]{2}

Regex2


③ And more flexible in case of single day or month (day before month), use:

([1-9]|[12][0-9]|3[01])[- \/.](0?[1-9]|1[012])[- \/.][0-9]{2,4}

Regex3

You get the idea.

Note:

You can always use something generic like (\d{1,2}\/){2}\d{2,4}, and then validate the function return string with ISDATE(return value).

QHarr
  • 83,427
  • 12
  • 54
  • 101
  • Thanks very much for your suggestions! So in my main function above (DateOnlyLoad) I can reference `Worksheets("CI").Range("F" & j).Value = RemoveChars(datecleanup(stredate))` and then `strdate` will in an input to `datecleanup` function, which will be input to `RemoveChars` function? So I should still keep the `datecleanup` function as-is? Regarding your additional patterns you provided (I only need MM/DD/YYYY and your third pattern) how would I incorporate your third pattern (`([1-9]|[12][0-9]|3[01])[- \/.](0?[1-9]|1[012])[- \/.][0-9]{2,4}` ) into the `RemoveChars` function? – JBinson88 May 19 '18 at 14:01
  • Right...use the function as I showed in the answer. Incorporating the other pattern...depends what you are actually after.... I assume you are not allowing both patterns but one or the other at any given time so simply swop the regex .Pattern = "pattern" <= remove the existing and put the new one in. And I think the third regex you actually want to change as I think you want to capture m/d/yy upto mm/dd/yyyy? – QHarr May 19 '18 at 17:05
  • TBH..once you have extracted the pattern you can check it's a valid date with the normal Excel functions , so maybe use something generic like (\d{1,2}\/){2}\d{2,4} and then validate that with ISDATE(pattern match). – QHarr May 19 '18 at 18:29
  • I added in your function and ran it in my macro and I am getting back '1/1/1901' for all the dates, when there is a valid date in the date/text string. What could be causing this? Is there anything on the `datecleanup` function that I should change? Thanks again. – JBinson88 May 19 '18 at 20:08
  • To add to my last comment, it looks like in the datecleanup function it's seeing the Len(inputdate) = 0 so it is putting in the hard coded date 1/1/1901. I confirmed this by changing the hard coded date to 1/1/1902 and now all my cells from the RemoveChars function return back 1/1/1902. What do you think needs to be changed? – JBinson88 May 19 '18 at 20:17
  • QHarr, another update; In the DateOnlyLoad function I changed `Worksheets("CI").Range("F" & j).Value = RemoveChars(datecleanup(stredate))` to `Worksheets("CI").Range("N" & j).Value = RemoveChars(datecleanup(**strDate**))` and I am now getting actual dates. However I did notice that this function isn't picking up all dates from date/text string. An example I have is the strdate value 'MUMPS TITER 2/14/14 POSITIVE' , however the RemoveChars function is outputting as 'MUMPS'. – JBinson88 May 19 '18 at 20:28
  • Is stredate actually MUMPS TITER 02/26/2008 POSITIVE for example? In which case, no need for date cleanup function just put direct into my function. – QHarr May 19 '18 at 20:28
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/171399/discussion-between-qharr-and-jbinson88). – QHarr May 19 '18 at 20:31
  • `strdate` (not stredate) contains examples like 'MUMPS TITER 02/26/2008 POSITIVE' and 'MUMPS 2/14/14 POSITIVE', and other slightly different variations that includes the date in the string. What are you suggesting I do? I'm still getting output from the RemoveChars function as "MUMPS" for the example "MUMPS TITER 01/20/2012 POSITIVE" – JBinson88 May 19 '18 at 20:57
  • I tried your suggestion of just calling your function directly ( – JBinson88 May 19 '18 at 22:47
  • I tried your suggestion of just calling your function directly `Worksheets("CI").Range("N" & j).Value = RemoveChars(strDate)` instead of `Worksheets("CI").Range("N" & j).Value = RemoveChars(datecleanup(strDate))` however now I have an issue for strings where the date begins first, e.g. '8/30/12 - Mumps Titer - POSITIVE' is being output as '8/30/12 - Mumps Titer - POSITIVE'. I think I need a combination of these two functions (datecleanup and RemoveChars, but perhaps the sequence needs to be changed? datecleanup is extracting the date correctly only when the date occurs first in the string. – JBinson88 May 19 '18 at 22:56
  • just wondering if you have any other suggestions based on the above? Really appreciate the help. – JBinson88 May 20 '18 at 20:47