1

I was wondering how to remove duplicate names/text's in a cell. For example

Jean Donea Jean Doneasee 
R.L. Foye R.L. Foyesee 
J.E. Zimmer J.E. Zimmersee 
R.P. Reed R.P. Reedsee  D.E. Munson D.E. Munsonsee 

While googling, I stumbled upon a macro/code, it's like:

Function RemoveDupes1(pWorkRng As Range) As String
'Updateby20140924
Dim xValue As String
Dim xChar As String
Dim xOutValue As String
Set xDic = CreateObject("Scripting.Dictionary")
xValue = pWorkRng.Value
For i = 1 To VBA.Len(xValue)
    xChar = VBA.Mid(xValue, i, 1)
   If xDic.exists(xChar) Then
   Else
      xDic(xChar) = ""
      xOutValue = xOutValue & xChar
   End If
Next
RemoveDupes1 = xOutValue
End Function

The macro is working, but it is comparing every letter, and if it finds any repeated letters, it's removing that.

When I use the code over those names, the result is somewhat like this:

Jean Dos
R.L Foyes
J.E Zimers
R.P edsDEMuno

By looking at the result I can make out it is not what I want, yet I got no clue how to correct the code.

The desired output should look like:

 Jean Donea
 R.L. Foye 
 J.E. Zimmer
 R.P. Reed 

Any suggestions?

Thanks in Advance.

Varsha G
  • 25
  • 6

2 Answers2

1

Input

With the input on the image:

![Input names

Result

The Debug.Print output

Output

Regex

A regex can be used dynamically iterating on the cell, to work as a Find tool. So it will extract only the shortest match. \w*( OUTPUT_OF_EXTRACTELEMENT )\w*, e.g.: \w*(Jean)\w*

The Regex's reference must be enabled.

Code

Function EXTRACTELEMENT(Txt As String, n, Separator As String) As String
    On Error GoTo ErrHandler:
    EXTRACTELEMENT = Split(Application.Trim(Mid(Txt, 1)), Separator)(n - 1)
    Exit Function
ErrHandler:
    ' error handling code
    EXTRACTELEMENT = 0
    On Error GoTo 0
End Function

Sub test()

Dim str As String
Dim objMatches As Object
Set objRegExp = CreateObject("VBScript.RegExp") 'New regexp
lastrow = ActiveSheet.Cells(ActiveSheet.Rows.Count, "A").End(xlUp).Row
For Row = 1 To lastrow
    str = Range("A" & Row)
    F_str = ""
    N_Elements = UBound(Split(str, " "))
    If N_Elements > 0 Then
        For k = 1 To N_Elements + 1
            strPattern = "\w*(" & EXTRACTELEMENT(CStr(str), k, " ") & ")\w*"
            With objRegExp
                .Pattern = strPattern
                .Global = True
            End With
            If objRegExp.test(strPattern) Then
                Set objMatches = objRegExp.Execute(str)
                If objMatches.Count > 1 Then
                    If objRegExp.test(F_str) = False Then
                        F_str = F_str & " " & objMatches(0).Submatches(0)
                    End If
                ElseIf k <= 2 And objMatches.Count = 1 Then
                    F_str = F_str & " " & objMatches(0).Submatches(0)
                End If
            End If
        Next k
    Else
        F_str = str
    End If
    Debug.Print Trim(F_str)
Next Row

End Sub

Note that you can Replace the Debug.Print to write on the target cell, if it is column B to Cells(Row,2)=Trim(F_str)

Explanation

Function

You can use this UDF, that uses the Split Function to obtain the element separated by spaces (" "). So it can get every element to compare on the cell.

Loops

It will loop from 1 to the number of elements k in each cell and from row 1 to lastrow.

Regex

The Regex is used to find the matches on the cell and Join a new string with the shortest element of each match.

danieltakeshi
  • 887
  • 9
  • 37
0

This solution operates on the assumption that 'see' (or some other three-letter string) will always be on the end of the cell value. If that isn't the case then this won't work.

Function RemoveDupeInCell(dString As String) As String
Dim x As Long, ct As Long
Dim str As String

'define str as half the length of the cell, minus the right three characters
str = Trim(Left(dString, WorksheetFunction.RoundUp((Len(dString) - 3) / 2, 0)))

'loop through the entire cell and count the number of instances of str
For x = 1 To Len(dString)
    If Mid(dString, x, Len(str)) = str Then ct = ct + 1
Next x

'if it's more than one, set to str, otherwise error
If ct > 1 Then
    RemoveDupeInCell = str
Else
    RemoveDupeInCell = "#N/A"
End If

End Function
barvobot
  • 887
  • 1
  • 7
  • 17