8

Assume I have a block of data in Excel 2010, 100 rows by 3 columns.

Column C contains some duplicates, say it starts off as

1, 1, 1, 2, 3, 4, 5, ..... , 97, 98

Using VBA, I would like to remove the duplicate rows so I am left with 98 rows and 3 columns.

1, 2, 3, ..... , 97, 98

I know there is a button in Excel 2010 to do that but it inteferes with the rest of my code subsequently and gives incorrect results.

Furthermore, I would like to do it in arrays, then paste the results on the worksheet, rather than methods such as Application.Worksheetfunction.countif(.....

So something like:

Dim myarray() as Variant
myarray=cells(1,1).Currentregion.value

Dim a as Long

For a=1 to Ubound(myarray,1)

    'something here to 

Next a
Community
  • 1
  • 1
Yht H
  • 81
  • 1
  • 1
  • 2
  • As OP wanted a VBA solution close to RemoveDuplicates retaining related array rows, I posted a late reply *►"Remove duplicates (plus related row items) from array"* – T.M. Oct 25 '20 at 20:02

9 Answers9

8

I answered a similar question. Here is the code I used:

Dim dict As Object
Dim rowCount As Long
Dim strVal As String

Set dict = CreateObject("Scripting.Dictionary")

rowCount = Sheet1.Range("A1").CurrentRegion.Rows.Count

'you can change the loop condition to iterate through the array rows instead
Do While rowCount > 1
  strVal = Sheet1.Cells(rowCount, 1).Value2

  If dict.exists(strVal) Then
    Sheet1.Rows(rowCount).EntireRow.Delete
  Else
    'if doing this with an array, then add code in the Else block
    ' to assign values from this row to the array of unique values
    dict.Add strVal, 0
  End If

  rowCount = rowCount - 1
Loop

Set dict = Nothing

If you want to use an array, then loop through the elements with the same conditional (if/else) statements. If the item doesn't exist in the dictionary, then you can add it to the dictionary and add the row values to another array.

Honestly, I think the most efficient way is to adapt code you'd get from the macro recorder. You can perform the above function in one line:

    Sheet1.UsedRange.RemoveDuplicates Columns:=3, Header:=xlYes
Community
  • 1
  • 1
Zairja
  • 1,441
  • 12
  • 31
  • 4
    This delete actual rows from the sheet, whereas the question was to remove duplicates in VBA. Plus any row deleteion should always occur going from bottom up, to avoid skipping rows – brettdj Aug 08 '12 at 23:12
  • @brettdj Both bits of code will work to remove duplicates. The asker wanted to do it in VBA, preferably with arrays (in which case he/she can easily modify the loop to iterate through rows of the array instead of a range, then add only unique elements to a separate array). If you look at the code, you'll see that the row deletion does go from bottom up (`rowCount` gets decremented). :) – Zairja Aug 09 '12 at 13:00
  • Well the question is titled *VBA, remove duplicates from array*. You make a good point that your code does go bottom up the way you use `.Count` although in this case the asker may need to specify if the first occurrence should be kept at the top or botttom. – brettdj Aug 10 '12 at 00:39
  • I've edited my answer slightly to reflect the array request. Inside the question it gets ambiguous and the user states the desire to do what clicking the Remove Duplicates button does but in VBA (with an array). The built-in Remove Duplicates also works from bottom up. If the user comes back and wants a more thorough answer or finds this one doesn't help them figure one out, then I can provide more. Feel free to edit mine or add your own, as well. Thanks for the critique! :) – Zairja Aug 10 '12 at 12:56
  • Is this Scripting.Dictionary method works in Mac as well? – Roshantha De Mel Aug 19 '17 at 11:51
8
Function eliminateDuplicate(poArr As Variant) As Variant
    Dim poArrNoDup()

    dupArrIndex = -1
    For i = LBound(poArr) To UBound(poArr)
        dupBool = False

        For j = LBound(poArr) To i
            If poArr(i) = poArr(j) And Not i = j Then
                dupBool = True
            End If
        Next j

        If dupBool = False Then
            dupArrIndex = dupArrIndex + 1
            ReDim Preserve poArrNoDup(dupArrIndex)
            poArrNoDup(dupArrIndex) = poArr(i)
        End If
    Next i

    eliminateDuplicate = poArrNoDup
End Function
Andriy Makukha
  • 7,580
  • 1
  • 38
  • 49
RBILLC
  • 170
  • 2
  • 6
  • 1
    Please do not edit other SO users answers as you have in [this revision](http://stackoverflow.com/revisions/19693896/2) if there is a problem with the code, comment and let them know. – user692942 Aug 14 '14 at 08:33
  • You need to set dupBool to false for each loop of i – 99moorem Jul 22 '15 at 09:31
5

Simple function to remove duplicates from a 1D array

Private Function DeDupeArray(vArray As Variant) As Variant
  Dim oDict As Object, i As Long
  Set oDict = CreateObject("Scripting.Dictionary")
  For i = LBound(vArray) To UBound(vArray)
    oDict(vArray(i)) = True
  Next
  DeDupeArray = oDict.keys()
End Function

Edit:

With stdVBA (a library largely maintained by myself) you can use:

uniqueValues = stdEnumerator.CreateFromArray(myArray).Unique().AsArray()
Sancarn
  • 2,575
  • 20
  • 45
4

An improvement on @RBILLC and @radoslav006 answers, this version searches the array with the duplicates removed for existing values so it searchs less values to find a duplicate.

Public Function RemoveDuplicatesFromArray(sourceArray As Variant)
    Dim duplicateFound As Boolean
    Dim arrayIndex As Integer, i As Integer, j As Integer
    Dim deduplicatedArray() As Variant
    
    arrayIndex = -1
    deduplicatedArray = Array(1)

    For i = LBound(sourceArray) To UBound(sourceArray)
        duplicateFound = False

        For j = LBound(deduplicatedArray) To UBound(deduplicatedArray)
            If sourceArray(i) = deduplicatedArray(j) Then
                duplicateFound = True
                Exit For
            End If
        Next j

        If duplicateFound = False Then
            arrayIndex = arrayIndex + 1
            ReDim Preserve deduplicatedArray(arrayIndex)
            deduplicatedArray(arrayIndex) = sourceArray(i)
        End If
    Next i

    RemoveDuplicatesFromArray = deduplicatedArray
End Function
Darryls99
  • 921
  • 6
  • 11
4

Here's another approach for working with an array:

Sub tester()

    Dim arr, arrout
    
    arr = Range("A1").CurrentRegion.Value   'collect the input array
     
    arrout = UniqueRows(arr)                'get only unique rows
    
    Range("H1").Resize(UBound(arrout, 1), UBound(arrout, 2)).Value = arrout
    
End Sub




Function UniqueRows(arrIn As Variant) As Variant
    Dim keys, rw As Long, col As Long, k, sep, arrout
    Dim dict As Object, lbr As Long, lbc As Long, ubr As Long, ubc As Long, rwOut As Long
    Set dict = CreateObject("scripting.dictionary")
    'input array bounds
    lbr = LBound(arrIn, 1)
    ubr = UBound(arrIn, 1)
    lbc = LBound(arrIn, 2)
    ubc = UBound(arrIn, 2)
    ReDim keys(lbr To ubr)
    'First pass:collect all the row "keys" in an array 
    '    and unique keys in a dictionary
    For rw = lbr To ubr
        k = "": sep = ""
        For col = lbc To ubc
            k = k & sep & arrIn(rw, col)
            sep = Chr(0)
        Next col
        keys(rw) = k     'collect key for this row
        dict(k) = True   'just collecting unique keys
    Next rw

    'Resize output array to # of unique rows
    ReDim arrout(lbr To dict.Count + (lbr - 1), lbc To ubc)
    rwOut = lbr
    'Second pass: copy each unique row to the output array
    For rw = lbr To ubr
        If dict(keys(rw)) Then      'not yet output?
            For col = lbc To ubc    'copying this row over to output...
                arrout(rwOut, col) = arrIn(rw, col)
            Next col
            rwOut = rwOut + 1      'increment output "row"
            dict(keys(rw)) = False 'flag this key as copied
        End If
    Next rw
    UniqueRows = arrout
End Function
Tim Williams
  • 154,628
  • 8
  • 97
  • 125
1

Answer from @RBILLC could be easily improved by adding an Exit For inside internal loop:

Function eliminateDuplicate(poArr As Variant) As Variant
    Dim poArrNoDup()

    dupArrIndex = -1
    For i = LBound(poArr) To UBound(poArr)
        dupBool = False

        For j = LBound(poArr) To i
            If poArr(i) = poArr(j) And Not i = j Then
                dupBool = True
                Exit For
            End If
        Next j

        If dupBool = False Then
            dupArrIndex = dupArrIndex + 1
            ReDim Preserve poArrNoDup(dupArrIndex)
            poArrNoDup(dupArrIndex) = poArr(i)
        End If
    Next i

    eliminateDuplicate = poArrNoDup
End Function
1

I think this is really a case for using excel's native functions, at least for the initial array acquisition, and I don't think there's any simpler way to do it. This sub will output the unique values starting in column 5. I assumed that the target range was empty, so if it's not, change r and c.

Sub testUniques()
    
    Dim arr, r As Long, c As Long, h As Long, w As Long
    Dim this As Worksheet: Set this = ActiveSheet
    arr = Application.Unique(this.Cells(1, 1).CurrentRegion)
    
    r = 1
    c = 5
    h = UBound(arr, 1) - 1
    w = UBound(arr, 2) - 1
    
    this.Range(this.Cells(r, c), this.Cells(r + h, c + w)) = arr
    
End Sub
Chris Strickland
  • 3,388
  • 1
  • 16
  • 18
0

I know this is old, but here's something I used to copy duplicate values to another range so that I could see them quickly to establish data integrity for a database I was standing up from various spreadsheets. To make the procedure delete the duplicates it would be as simple as replacing the dupRng lines with Cell.Delete Shift:=xlToLeft or something to that effect.

I haven't tested that personally, but it should work.

Sub PartCompare()
    Dim partRng As Range, partArr() As Variant, i As Integer
    Dim Cell As Range, lrow As Integer

    lrow = ThisWorkbook.Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
    i = 0

    Set partRng = ThisWorkbook.Worksheets("Sheet1").Range(Cells(1, 1), Cells(lrow, 1))

    For Each Cell In partRng.Cells
        ReDim Preserve partArr(i)
        partArr(i) = Cell.Value
        i = i + 1
    Next

    Dim dupRng As Range, j As Integer, x As Integer, c As Integer

    Set dupRng = ThisWorkbook.Worksheets("Sheet1").Range("D1")

    x = 0
    c = 1
    For Each Cell In partRng.Cells
        For j = c To UBound(partArr)
            If partArr(j) = Cell.Value Then
                dupRng.Offset(x, 0).Value = Cell.Value
                dupRng.Offset(x, 1).Value = Cell.Address()
                x = x + 1
                Exit For
            End If
        Next j
        c = c + 1
    Next Cell
End Sub
TOTM
  • 107
  • 7
0

Remove duplicates (plus related row items) from array

As OP wanted a VBA solution close to RemoveDuplicates, I demonstrate an array approach using a ►dictionary to get not the unique items per se (dict.keys), but the related row indices of first occurrencies (dict.items).

These are used to retain the whole row data via procedure LeaveUniques profiting from the advanced possibilities of the ►Application.Index() function - c.f. Some peculiarities of the the Application.Index function

Example Call

Sub ExampleCall()
'[0]define range and assign data to 1-based 2-dim datafield
    With Sheet1                   ' << reference to your project's sheet Code(Name)
        Dim lastRow: lastRow = .Cells(.Rows.Count, "C").End(xlUp).Row
        Dim rng:  Set rng = .Range("C2:E" & lastRow)
    End With
    Dim data: data = rng        ' assign data to 2-dim datafield
'[1]get uniques (column 1) and remove duplicate rows
    LeaveUniques data           ' << call procedure LeaveUniques (c.f. RemoveDuplicates)
'[2]overwrite original range
    rng.Clear
    rng.Resize(UBound(data), UBound(data, 2)) = data
End Sub

Procedure LeaveUniques

Sub LeaveUniques(ByRef data As Variant, Optional ByVal colNum As Long = 1)
'Purpose: procedure removes duplicates of given column number in entire array
    data = Application.Index(data, uniqueRowIndices(data, colNum), nColIndices(UBound(data, 2)))
End Sub

Help functions to LeaveUniques

Function uniqueRowIndices(data, Optional ByVal colNum As Long = 1)
'Purpose: return data index numbers referring to uniques
'a) set late bound dictionary to memory
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")
'b) slice e.g. first data column (colNum = 1)
    Dim colData
    colData = Application.Index(data, 0, colNum)
'c) fill dictionary with uniques referring to first occurencies
    Dim i As Long
    For i = 1 To UBound(colData)
        If Not dict.exists(dict(colData(i, 1))) Then dict(colData(i, 1)) = i
    Next
'd) return 2-dim array of valid unique 1-based index numbers
    uniqueRowIndices = Application.Transpose(dict.items)
End Function

Function nColIndices(ByVal n As Long)
'Purpose: return "flat" array of n column indices, e.g. for n = 3 ~> Array(1, 2, 3)
    nColIndices = Application.Transpose(Evaluate("row(1:" & n & ")"))
End Function

T.M.
  • 9,436
  • 3
  • 33
  • 57