1

I am trying to do the following in Excel:

I have a sheet with some data (400k rows, which is why I used long for the variables instead of integers) and I want to check Column R (which contains ID's) and need to check then against Columns S and T. If R is the same and S and T is different, the code should copy the entire row and paste it in another sheet. The code runs and pastes something but not the correct rows. Thanks in advance, any help would be highly appreciated.

Sample Data

R           S       T
1234    Kevin   Smith
2345    John    Miller
1234    Carl    Jones
1234    Kevin   Smith
4567    Mike    Redwood
2058    William Wales

Code

Sub mySub1()
    Set wb = ThisWorkbook
    Set tbl = wb.Sheets("sheet1")
    Dim lrow As Long
    Dim i As Long
    Dim x As Long
    Dim y As Long
    Dim cell As Range

    i = 1
    x = 0
    y = 1

    Sheets("sheet1").Activate

    lrow = tbl.Cells(Rows.Count, "A").End(xlUp).Row

    For Each cell In Range("R2:R" & lrow)
        If cell.Offset(x, 0).Value = cell.Offset(i, 0).Value And _
        cell.Offset(0, 1) <> cell.Offset(i, 1).Value And _
        cell.Offset(0, 2).Value <> cell.Offset(i, 2).Value Then
            ActiveSheet.Range(Cells(i + 1, 1), Cells(i + 1, 26)).Select
            Selection.Copy
            Sheets("sheet2").Select
            ActiveSheet.Cells(y, 1).PasteSpecial
            y = y + 1
        End If
        Sheets("sheet1").Activate
        i = i + 1
        x = x + 1
    Next
End Sub
Siddharth Rout
  • 147,039
  • 17
  • 206
  • 250
DEFCON123
  • 37
  • 2
  • 2
  • 9

2 Answers2

0

Ok I tried different methods on 400k rows. This is the one which I found the fastest.

Logic:

  1. Copy the data to a temp sheet and then remove duplicates.
  2. Sort the data
  3. Store the resulting range in an array
  4. Loop and do the match and finally copy

I am assuming that the data in Sheet1 doesn't have headers. If it does then change Header:=xlNo to Header:=xlYes and modify the for loops.

IMP: Can't use Autofilter or worksheet functions like Countif because of the number of rows.

Code:

Sub Sample()
    Dim wsI As Worksheet, wsO As Worksheet, wsTemp As Worksheet
    Dim wsILRow As Long, wsOLRow As Long
    Dim rng As Range
    Dim itm As String
    Dim Myar

    Set wsI = ThisWorkbook.Sheets("Sheet1")
    Set wsO = ThisWorkbook.Sheets("Sheet2")
    Set wsTemp = ThisWorkbook.Sheets.Add

    wsOLRow = wsO.Range("A" & wsO.Rows.Count).End(xlUp).Row + 1

    wsI.Cells.Copy wsTemp.Cells

    With wsTemp
        wsILRow = .Range("R" & .Rows.Count).End(xlUp).Row

        .Range("$R$1:$T$" & wsILRow).RemoveDuplicates Columns:=Array(1, 2, 3), _
        Header:=xlNo

        .Columns("A:Z").Sort Key1:=.Range("R1"), Order1:=xlAscending, Header:=xlNo, _
        OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
        DataOption1:=xlSortNormal

        wsILRow = .Range("R" & .Rows.Count).End(xlUp).Row

        Set rng = .Range("R1:T" & wsILRow)
    End With

    Myar = rng.Value

    For i = 1 To UBound(Myar)

        If i > 1 Then If Myar(i, 1) = Myar(i - 1, 1) Then GoTo NextRec

        itm = Myar(i, 1)
        For j = i + 1 To UBound(Myar)
            If Myar(j, 1) = itm Then
                If Myar(i, 2) & Myar(j, 2) <> Myar(i, 3) & Myar(j, 3) Then
                    wsTemp.Rows(j).Copy wsO.Rows(wsOLRow)
                    wsOLRow = wsOLRow + 1
                End If
            End If
        Next j
NextRec:
    Next i

    Application.DisplayAlerts = False
    wsTemp.Delete
    Application.DisplayAlerts = True
End Sub
Siddharth Rout
  • 147,039
  • 17
  • 206
  • 250
  • Sorry for the late response. Thanks Siddarth, it works like a charm! I assume that you haven't run the code on 400k lines?! Can you estimate how long it should take to loop through the whole Sheet? If the data in sheet1 has headers, what would I need to change in the For loop? Setting i to 0? – DEFCON123 Dec 13 '13 at 08:44
  • I did run it on 400k rows and hence I took such a long time to post. While Autofilter/Countif/Union took ages, this was relatively faster. I don't have the sample file anymore. But as I mentioned this was the fastest. – Siddharth Rout Dec 13 '13 at 13:14
  • I ran it aswell and Excel didn't react. About 20 minutes later I just quit Excel and split the original data into smaller pieces of about 30-50k rows which worked. Did it took more then 20 minutes for you to run the whole code? – DEFCON123 Dec 13 '13 at 14:53
  • The above code... no. But then maybe my data was not that complex. I have just created a dummy data of 400k rows – Siddharth Rout Dec 13 '13 at 14:55
0

If you don't have to use VBA, you can do this with simple worksheet manipulation.

Take the worksheet:

  • Append a column containing an increasing row number,
  • Sort by ID (column R), and the row number,
  • Append the formula =AND(R2=R1,OR(S2<>S1,T2<>T1)) to row 2 and copy this down the worksheet,
  • Filter to show all the rows that are true, and
  • Copy visible rows to new worksheet.

This should give you much better performance and be easier to maintain.

Pekka
  • 3,529
  • 27
  • 45