-1

I need to make a macro to compare two columns looking for duplicate cells.

I'm currently using this simple double for loop algorithm

for i = 0 To ColumnASize
   Cell1 = Sheet.getCellByPosition(0,i)

   for j = 0 to ColumnBSi
   Cell2 = Sheet.getCellByPosition(1,j)

   ' Comparison happens here

   Next j
Next i

However, as I have 1000+ items in each column this algorithm is quite slow and inefficient. Does anyone here know/have an idea for a more efficient way to do this?

pnuts
  • 58,317
  • 11
  • 87
  • 139
  • Are the rows ordered in some way? – MrTux Sep 07 '14 at 22:37
  • Need a bit more context around what's happening in the comparison. Can you post a bit more code? – Gareth Sep 07 '14 at 23:10
  • No they are not. I am just comparing two strings (can be anything) – user3299369 Sep 07 '14 at 23:11
  • Take a string from column A, then search all of Column B to see if there is an identical string – user3299369 Sep 07 '14 at 23:12
  • Use VBA arrays instead of referencing the worksheet cells: arrColA = "ColA range"; arrColB = "ColB range". Then loop through the first array and use the Match worksheet function on the 2nd (or loop through the 2nd doing the comparison). – Ron Rosenfeld Sep 08 '14 at 00:18

1 Answers1

0

If you want to ensure that no string in col A is equal to any string in col B, then your existing algorithm is order n^2. You may be able to improve that by the following:

1) Sort col A or a copy of it (order nlogn)

2)Sort col B or a copy of it (order nlogn)

3) Look for duplicates by list traversal, see this previous answer (order n).

That should give you an order nlogn solution and I don't think you can do much better than that.

Community
  • 1
  • 1
Penguino
  • 2,136
  • 1
  • 14
  • 21