I'm trying to make comparison between two columns. column A has about 85,000 list and column B has about 270,000. I tried using vlookup and other functions like countif. however, anytime I try any of these functions my excel workbook could take about 45 minutes to an hour to even complete a basic task. please, any suggestion why my excel workbook keeps freezing and slowing down? thanks in advance.
Asked
Active
Viewed 104 times
-2
-
Excel isn't faster than that. You need another solution for that amount of data. – idstam Dec 17 '14 at 14:37
-
1Can you provide an example? – RafaSashi Dec 17 '14 at 14:37
-
1Excel is *much* faster than that. Using Power Query/Power Pivot it's able to use column compression and an OLAP analytic engine that allows it to process several millions of rows. The only (somewhat artificial) limit is that the (compressed) file can't get over 2GB in size, to allow uploading to SharePoint and Office Online document libraries. Without specific formula examples though, it's hard to say what causes the delay – Panagiotis Kanavos Dec 17 '14 at 14:38
-
@pnuts..I am trying to match those two columns. this is one of the formulas I used; IF(COUNTIF(B:B,A1),A1,0)..the formulae is working perfectly just that my workbook keeps slowing down – Bright Dec 17 '14 at 14:45
-
@PanagiotisKanavos.this is one of the formulas I used; IF(COUNTIF(B:B,A1),A1,0) and VlOOKUP(A1,B:B,0,1). any of these formulas work perfectly just that the workbook keeps slowing down. I was thinking may be the RAM could be a factor or – Bright Dec 17 '14 at 14:48
-
@Bright What exactly are you trying to accomplish? It is obvious your current solution is too inefficient for your specific needs. What exactly are you doing to the two columns and you are outputting your result to a third column? – CodeCamper Dec 17 '14 at 15:02
-
@CodeCamper..the goal of my comparison is to find the names of vendors in Column A that can also be found in Column B. but my main problem is why my workbook keeps slowing down. for instance, it could take about 40 minutes to an hour when I'm trying to even save it – Bright Dec 17 '14 at 15:07
-
A macro that reads data into arrays and does the functionality and then spits out the results in one go would be massively faster. I am pretty certain your workbook is taking a long time because of the volatile formulas and the large amount of data you are applying them to – Mark Moore Dec 17 '14 at 15:18
-
Just to add to my above comment, I do this type of Duplicate checking with quite a few worksheets along with many other validations, they are very fast. Need more info for a specific solution, but rough outline. Read Column A data into Array Read Column B Data into an Array Sort Arrays Loop arrays looking for Dupes Depending on what you want to, you can store location of dupes and report them, colour their cells etc – Mark Moore Dec 17 '14 at 15:39
-
@MarkMoore. thanks for your help. I'm trying to write a code as u said. but the VB editor keeps telling me "out of memory" when I press enter after the sub statement. – Bright Dec 17 '14 at 15:44
-
I think your out of memory error is related to your performance issues due to the number of validations. If you remove those it should go away. Based on your criteria listed here, I have added a detailed solution below. It is an example that's quite simplistic, but hopefully it will give you some ideas. – Mark Moore Dec 17 '14 at 22:19
1 Answers
0
The code below is one method of highlighting duplicate data that can be expanded on to make it more complex if required. I have also included the QuickSortArray procedure I use to sort my arrays for this kind of work, the sort proc is not my work, and the credits are in the comments. Code is pretty well documented, so hope it helps
'---------------------------------------------------------------------------------------
' Procedure : DuplicateCheck
' Author : Mark Moore
' Date : 17/12/2014
' Purpose : This procedure is an simple example of duplicate checking multiple columns of data
' What is checked can simply be 2 columns of data, or can also be as complex as the developer
' wants to code it i.e. You could use this same method to duplicate compound values, simply by
' populating the array with concatenated data. Likewise the data that is reported can easily be
' made richer by adding data you may wish to report on. eg, Sheet name could be added for each value
' if you wanted to compare data across sheets.
'---------------------------------------------------------------------------------------
'
Sub DuplicateCheck()
Dim ColAArray, ColBArray, CombinedArray, DupesArray As Variant
Dim ArrayRange As String
Dim FirstRowInArray, LastRowInArray, loopcounter, DataRowNumber, DupesCounter As Long
'define the range of list A by going from A1 to last populated cell in col A
ArrayRange = "A1" & ":A" & Sheets("Sheet1").Range("A" & Rows.Count).End(xlUp).Row
'Assign values to array
ColAArray = Worksheets("Sheet1").Range(ArrayRange).Value ' read all the values at once from the Excel grid, put into an array
'define the range of list B by going from B1 to last populated cell in col B
ArrayRange = "B1" & ":B" & Sheets("Sheet1").Range("B" & Rows.Count).End(xlUp).Row
'Assign values to array
ColBArray = Worksheets("Sheet1").Range(ArrayRange).Value ' read all the values at once from the Excel grid, put into an array
'We are going to combine the 2 arrays and add cell addresses, so redefine the combined array to size of both arrays together
ReDim CombinedArray(LBound(ColAArray, 1) To UBound(ColAArray) + UBound(ColBArray, 1), 1 To 2)
'In theory every value could be duplicated, so size the duplicate array to combined size also
ReDim DupesArray(LBound(CombinedArray, 1) To UBound(CombinedArray, 1), 1 To 2)
'This is the row number of the data, in our case its 1 because our data range started at row 1 - change if you change the data range
DataRowNumber = 1
'First populate the ColA array into the combined array
For loopcounter = LBound(ColAArray, 1) To UBound(ColAArray, 1)
CombinedArray(loopcounter, 1) = ColAArray(loopcounter, 1) 'Actual value from Col A
CombinedArray(loopcounter, 2) = "A" & DataRowNumber 'Cell address of that value
DataRowNumber = DataRowNumber + 1 ' Increment the row number
Next loopcounter
'reset the data row back to 1
DataRowNumber = 1
'Now add the colb array to the combined array, starting from the element after the end of the first array
For loopcounter = UBound(ColAArray, 1) + 1 To UBound(CombinedArray, 1)
CombinedArray(loopcounter, 1) = ColBArray(DataRowNumber, 1) 'Actual value from Col A
CombinedArray(loopcounter, 2) = "B" & DataRowNumber 'Cell address of that value
DataRowNumber = DataRowNumber + 1 ' Increment the row number
Next loopcounter
Call QuickSortArray(CombinedArray, , , 1) 'Sort the array by cell values, so any dupes will be next to each other
'Using variables for these values just for readability
FirstRowInArray = LBound(CombinedArray, 1)
LastRowInArray = UBound(CombinedArray, 1)
'Initialise the duplicate counter
DupesCounter = 0
'The array is sorted, so loop through the array looking for duplicate values next to each other
For loopcounter = LBound(CombinedArray, 1) To UBound(CombinedArray, 1)
Select Case loopcounter
Case LastRowInArray 'Deal with the last row - We cant check the next row as would be out of bound
If CombinedArray(loopcounter, 1) = CombinedArray(loopcounter - 1, 1) Then
DupesCounter = DupesCounter + 1
'Add an entry to the error array
DupesArray(DupesCounter, 1) = CombinedArray(loopcounter, 1) 'Duplicated value
DupesArray(DupesCounter, 2) = CombinedArray(loopcounter, 2) 'Cell address
Exit For 'last element so exit
End If
'Deal with first - we cant check the preceeding rows
Case FirstRowInArray
If CombinedArray(loopcounter, 1) = CombinedArray(loopcounter + 1, 1) Then
DupesCounter = DupesCounter + 1
'Add an entry to the error array
DupesArray(DupesCounter, 1) = CombinedArray(loopcounter, 1) 'Duplicated value
DupesArray(DupesCounter, 2) = CombinedArray(loopcounter, 2) 'Cell address
End If
Case Else 'Deal with all other rows - we need to check if the preceeding or the following row is a duplicate
If CombinedArray(loopcounter, 1) = CombinedArray(loopcounter - 1, 1) Or _
CombinedArray(loopcounter, 1) = CombinedArray(loopcounter + 1, 1) Then
DupesCounter = DupesCounter + 1
'Add an entry to the error array
DupesArray(DupesCounter, 1) = CombinedArray(loopcounter, 1) 'Duplicated value
DupesArray(DupesCounter, 2) = CombinedArray(loopcounter, 2) 'Cell address
End If
End Select
Next loopcounter
'Now process the duplicates. You could do anythin here with this data, report to text file etc
'I have just chosen to colour the cells
If DupesCounter > 0 Then
For loopcounter = 1 To DupesCounter
ActiveSheet.Range(DupesArray(loopcounter, 2)).Interior.Color = RGB(255, 255, 0)
Next loopcounter
MsgBox DupesCounter & " duplicates were found and highlighted when comparing values between column A and B", vbInformation, "Results of comparing Column A values with Column B"
Else
MsgBox "No Duplicated values between column A and B", vbInformation, "Results of comparing Column A values with Column B"
End If
End Sub
'---------------------------------------------------------------------------------------
' Procedure : QuickSortArray
' Purpose : This procedure is used to sort multi dimensional arrays - It is primarily used to sort arrays
' from sheets where the original sheet sort order wants to be left "as is", so the data is read into
' an array and then sorted. This code was taken from the internet, with comments/credits as shown below
' source url: http://stackoverflow.com/questions/152319/vba-array-sort-function
' Sort a 2-Dimensional array
' SampleUsage: sort arrData by the contents of column 3 ' ' QuickSortArray arrData, , , 3
' Posted by Jim Rech 10/20/98 Excel.Programming
' Modifications, Nigel Heffernan:
' Escape failed comparison with empty variant
' Defensive coding: check inputs
' Version : 06/01/2014 : Mark Moore - Initial Version
'---------------------------------------------------------------------------------------
Public Sub QuickSortArray(ByRef SortArray As Variant, _
Optional lngMin As Long = -1, _
Optional lngMax As Long = -1, _
Optional lngColumn As Long = 0)
Dim i As Long
Dim j As Long
Dim varMid As Variant
Dim arrRowTemp As Variant
Dim lngColTemp As Long
On Error Resume Next
If IsEmpty(SortArray) Then
Exit Sub
End If
If InStr(TypeName(SortArray), "()") < 1 Then 'IsArray() is somewhat broken: Look for brackets in the type name
Exit Sub
End If
If lngMin = -1 Then
lngMin = LBound(SortArray, 1)
End If
If lngMax = -1 Then
lngMax = UBound(SortArray, 1)
End If
If lngMin >= lngMax Then ' no sorting required
Exit Sub
End If
i = lngMin
j = lngMax
varMid = Empty
varMid = SortArray((lngMin + lngMax) \ 2, lngColumn)
' We send 'Empty' and invalid data items to the end of the list:
If IsObject(varMid) Then ' note that we don't check isObject(SortArray(n)) - varMid might pick up a valid default member or property
i = lngMax
j = lngMin
ElseIf IsEmpty(varMid) Then
i = lngMax
j = lngMin
ElseIf IsNull(varMid) Then
i = lngMax
j = lngMin
ElseIf varMid = "" Then
i = lngMax
j = lngMin
ElseIf VarType(varMid) = vbError Then
i = lngMax
j = lngMin
ElseIf VarType(varMid) > 17 Then
i = lngMax
j = lngMin
End If
While i <= j
While SortArray(i, lngColumn) < varMid And i < lngMax
i = i + 1
Wend
While varMid < SortArray(j, lngColumn) And j > lngMin
j = j - 1
Wend
If i <= j Then
' Swap the rows
ReDim arrRowTemp(LBound(SortArray, 2) To UBound(SortArray, 2))
For lngColTemp = LBound(SortArray, 2) To UBound(SortArray, 2)
arrRowTemp(lngColTemp) = SortArray(i, lngColTemp)
SortArray(i, lngColTemp) = SortArray(j, lngColTemp)
SortArray(j, lngColTemp) = arrRowTemp(lngColTemp)
Next lngColTemp
Erase arrRowTemp
i = i + 1
j = j - 1
End If
Wend
If (lngMin < j) Then Call QuickSortArray(SortArray, lngMin, j, lngColumn)
If (i < lngMax) Then Call QuickSortArray(SortArray, i, lngMax, lngColumn)
End Sub

Mark Moore
- 520
- 4
- 13