5

Imagine you have two sets of data and the number of rows and columns are the same. Now you want check if data in cells in one set is equal to data in cells with the same relative address in the other set. If thats true for all cells of a row, remove the row from both sets. I can code this very easily by comparing each cell and that's not good for large data sets. See code below for two columns where the two sets of data happen to be in the same sheet side by side with 300 in column offset between them.

Dim RngOb As Range
Dim c As Range

Range("A1", "B1").Select
set RngOb = Range(Selection, Selection.End(xlDown))

For Each c In RngOb.Rows
    If c.Cells(1,1).Value = c.Offset(0, 300).Cells(1,1).Value Then
        If c.Cells(1,2).Value = c.Offset(0, 300).Cells(1,2).Value Then    
            c.EntireRow.Delete
        End If
    End If
Next

My actual data has more than 100 columns and different number of columns from day to day. I'm looking for a smart, fast way to do this for large data sets. I highly appriciate answers, feedback and criticism. :D

steinbitur
  • 331
  • 4
  • 9
  • 18
  • I would add temporary column with formula `=AND(A1=L1,B1=M1)` and then filter this column on TRUE values and remove all visible rows. See @brettdj's answer for details: http://stackoverflow.com/questions/21774624/delete-specifics-rows-in-a-sheet-with-more-than-60000-rows/21775027#21775027 – Dmitry Pavliv Mar 08 '14 at 14:50
  • Are you sure you want to remove the entire row? What if the data sets overlap? If you have 100 rows, but DS 2 starts at row 11, and row 20 in DS1 is the same as row 31 in DS2, then you will be removing row 9 of DS2 when you remove row 20 of DS1. – Ron Rosenfeld Mar 08 '14 at 14:51
  • the column offset would have to be larger than the number of columns if the data is side by side like in my example – steinbitur Mar 08 '14 at 14:53
  • @steinbitur Is there data in the rows between the columns you are testing? and is the offset between columns the same each day? – Ron Rosenfeld Mar 08 '14 at 15:20
  • no, its just two data sets, the same size side by side, with some spacing between them – steinbitur Mar 08 '14 at 15:23
  • What about the other hundred columns? – Ron Rosenfeld Mar 08 '14 at 15:36
  • In the example each data set has two columns. Between them are 288 empty columns. If the datasets were 120 columns. Between them would be 180 empty columns. I'm looking for a way to do this without looping and without checking cell by cell, for example. The position of the data doesn't matter if the code is modified accordingly. – steinbitur Mar 08 '14 at 15:43
  • I just wan't to find out what's the fastest way to check if two row ranges are equal for a bunch of pairs of rows, that's it :D – steinbitur Mar 08 '14 at 15:50
  • I don't think you can avoid looping. I would suggest using VBA arrays. Much faster than going back and forth to the worksheet. You could keep track of the rows to delete in a collection; then create a results array; populate it; and write it back to the worksheet. – Ron Rosenfeld Mar 08 '14 at 15:53
  • 5
    you can use [THIS](http://stackoverflow.com/questions/19395633/how-to-compare-two-entire-rows-in-a-sheet/19396257#19396257) beautiful piece of code by @TimWilliams to compare ranges – Siddharth Rout Mar 08 '14 at 15:56
  • This question is about comparing range _contents_ and not ranges. – Lummo Feb 16 '20 at 22:40

2 Answers2

13

Here is a simple way to compare two rows in isomorphic ranges.............in this example row #5 of each range:

Sub RowCompare()
    Dim ary1() As Variant
    Dim Range1 As Range, Range2 As Range, rr1 As Range, rr2 As Range
    Set Range1 = Range("B9:F20")
    Set Range2 = Range("I16:M27")
    Set rr1 = Range1.Rows(5)
    Set rr2 = Range2.Rows(5)
    ary1 = Application.Transpose(Application.Transpose(rr1))
    ary2 = Application.Transpose(Application.Transpose(rr2))
    st1 = Join(ary1, ",")
    st2 = Join(ary2, ",")
    If st1 = st2 Then
        MsgBox "the same"
    Else
        MsgBox "different"
    End If
End Sub

If you have embedded commas in the cells, then choose another character in the JOIN

Gary's Student
  • 95,722
  • 10
  • 59
  • 99
1

If I understand your problem correctly, the following code should allow you to do what you want. Within the code, you select the range you wish to process; the first column of each data set, and the number of columns within each data set.

It does assume only two data sets, as you wrote, although that could be expanded. And there are ways of automatically determining the dataset columns, if there is no other data in between.

Option Explicit
Option Base 0
Sub RemoveDups()
    Dim I As Long, J As Long
    Dim rRng As Range
    Dim vRng As Variant, vRes() As Variant
    Dim bRng() As Boolean
    Dim aColumns, lColumns As Long
    Dim colRowsDelete As Collection

'vRng to include from first to last column to be tested
Set rRng = Range("f1", Cells(Rows.Count, "F").End(xlUp)).Resize(columnsize:=100)
vRng = rRng
ReDim bRng(1 To UBound(vRng))

'columns to be tested
'Specify First column of each data set
aColumns = Array(1, 13)

'num columns in each data set
lColumns = 3

For I = 1 To UBound(vRng)
    bRng(I) = vRng(I, aColumns(0)) = vRng(I, aColumns(1))
    For J = 1 To lColumns - 1
        bRng(I) = bRng(I) And (vRng(I, aColumns(0) + J) = vRng(I, aColumns(1) + J))
    Next J
Next I

'Rows to Delete
Set colRowsDelete = New Collection
For I = 1 To UBound(bRng)
    If bRng(I) = True Then colRowsDelete.Add Item:=I
Next I

'Delete the rows
If colRowsDelete.Count > 0 Then
Application.ScreenUpdating = False
    For I = colRowsDelete.Count To 1 Step -1
        rRng.Rows(colRowsDelete.Item(I)).EntireRow.Delete
    Next I
End If
Application.ScreenUpdating = True
End Sub
Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60