-2

Well the question is, I have got a column, for example column Y has many entries in it, nearly 40,000 and It increases everyweek. The thing is I have to check for duplicates in Y column and delete the entire row. Thus, Y column should have only unique entries.

Suppose I have 3,000 entries and after 1 week, i'll have about 3,500 entries. Now I have to check these newly added 500 columnn values not the 3,500 with the old + the new i.e 3,500 entries and delete the duplicated row. The old 3,000 shouldn't be deleted or changed. I have found macros but they do the trick for the entire column. I would like to filter the new 500 values.

 Cells(2, "Q").Formula = "=COUNTIF(P$1:P1,P2)=0"   'I have used these formula 
 Range("Q2").Copy Destination:=Range("Q3:Q40109")  'it gives false for the duplicate values

I know we have to use countif for the duplicate entries. But what Iam doing is applying the formula and then search for false entries and then delete it. I belive applying formula and finding false and then deleting its little bit time consuming.

Sub DeleteDups() 
Dim x               As Long 
Dim LastRow         As Long 
LastRow = Range("A65536").End(xlUp).Row 
For x = LastRow To 1 Step -1 
    If Application.WorksheetFunction.CountIf(Range("A1:A" & x), Range("A" & x).Text) > 1 Then 
        Range("A" & x).EntireRow.Delete 
    End If 
Next x   
End Sub 

This is what I found on google but i dont know where the error is. It is deleting all the columns if i set

For x = LastRow To 1 Step -1 
For x = LastRow to step 3000 ' It is deleting all 500 columns but if it is -1 working fine

Any modifications need to be done for these function? or sugest me any good function that helps me. Check for the duplicate values of a selected column range from the entire column. I mean check 500 entires column values with the 3500 column entry values and delete the duplicates in 500 entries

Thanks in advance

niko
  • 9,285
  • 27
  • 84
  • 131
  • 4
    Ever wonder why everyone gives you irrelevant answers? Is it possible that they are being misled by your question because it is unclear and confusing? Instead of threatening everyone with "-1", why don't you make your question clearer and more concise, and thereby help those trying to help you. – Jean-François Corbett Aug 24 '11 at 13:08

3 Answers3

3

This should be rather simple. You need to create 1 cell somewhere in your file that you will write the cell count for column Y each week after removing all dupes.

For example, say week1 you remove dupes and you are left with a range of Y1:Y100. Your function will put "100" somewhere in your file to reference.

Next week, your function will start looking from dupes from (cell with ref number) + 1, so Y:101 to end of column. After removing dupes, the function changes the ref cell to the new count.

Here is the code:

Sub RemoveNewDupes()

'Initialize for first time running this
If Len(Range("A1").Value) = 0 Then
    Range("A1").Value = Range("Y" & Rows.count).End(xlUp).row
End If

If Range("A1").Value = 1 Then Range("A1").Value = 0

'Goodbye dupes!
ActiveSheet.Range("Y" & Range("A1").Value + 1 & ":Y" & _
Range("Y" & Rows.count).End(xlUp).row).RemoveDuplicates Columns:=1, Header:=xlNo

'Re-initialize the count for next time
Range("A1").Value = Range("Y" & Rows.count).End(xlUp).row

End Sub

*sorry no idea why auto-syntax highlighting makes this hard to read

Update:

Here is a way to do it in Excel 2003. The trick is to loop backwards through the column so that the loop isn't destroyed when you delete a row. I use a dictionary (which I'm famous for over-using) since it allows you to check easily for dupes.

Sub RemoveNewDupes()

Dim lastRow As Long
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")

If Len(Range("A1").Value) = 0 Then
    Range("A1").Value = 1
End If

lastRow = Range("Y" & Rows.count).End(xlUp).row

On Error Resume Next
For i = lastRow To Range("A1").Value Step -1
    If dict.exists(Range("Y" & i).Value) = True Then
        Range("Y" & i).EntireRow.Delete
    End If
    dict.Add Range("Y" & i).Value, 1
Next

Range("A1").Value = Range("Y" & Rows.count).End(xlUp).row

End Sub
Gaijinhunter
  • 14,587
  • 4
  • 51
  • 57
  • My question was not that Please read it carefully. I didnt say storing the number I just want to delete the duplicate columns of the selected range from the entire range got it – niko Aug 24 '11 at 12:58
  • Perhaps you are mistaking your terms then, the question is quite clear. Check out my solution. It will not touch anything in column Y that was there the last time you ran the function, it will only look at the range of new cells since it was last run and remove duplicates found within that range. – Gaijinhunter Aug 24 '11 at 13:29
  • I guess u have shown me the excel 2007 vba I updated my Title – niko Aug 24 '11 at 13:33
2

How can Excel know that entries are "new"? (e.g. how can we know we only have to consider the 500 last rows)
Actually, if you already executed the macro last week, the first 3,000 rows won't have any duplicates so the current execution won't change these rows.

The code your described should nearly work. If we keep it and change it very slightly:

Sub DeleteDups() 
Dim x               As Long 
Dim LastRow         As Long 
LastRow = Range("Q65536").End(xlUp).Row 
For x = LastRow To 1 Step -1
    'parse every cell from the bottom to the top (to still count duplicates)
    '  and check if duplicates thanks to the formula 
    If Range("Q" & x).Value Then Range("Q" & x).EntireRow.Delete 
Next x   
End Sub

[EDIT] Another (probably faster) solution: filter first the values and then delete the visible rows:

Sub DeleteDups() 
ActiveSheet.UsedRange.AutoFilter Field:=17, Criteria1:="True" 'filter column Q for True values
ActiveSheet.Cells.SpecialCells(xlCellTypeVisible).EntireRow.Delete
End Sub

Couldn't test this last solution right here, sorry.

JMax
  • 26,109
  • 12
  • 69
  • 88
  • Guys i said not entire column only 500 column values You people posting to -1 – niko Aug 24 '11 at 11:49
  • 1
    @niko: How can Excel know that entries are "new"? (e.g. how can we know we only have to consider the 500 last rows) Actually, if you already executed the macro last week, the first 3,000 rows won't have any duplicates so the current execution won't change these rows. – JMax Aug 24 '11 at 11:59
0

Here's an idea:

Sub test
LastRow = Range("A65536").End(xlUp).Row
For i = LastRow To 1 Step -1
  If Not Range("a1:a" & whateverLastRowYouWantToUse ).Find(Range("a" & i).Value, , , , , xlPrevious) Is Nothing Then
    Rows(i).Delete
  End If
Next i
End Sub

It checks the entire range above the current cell for a single duplicate. If found, it the current row is deleted.

EDIT I just realized in your example, you said column Y, but in your code you are checking A. Not sure if the example was just a hypothetical, but wanted to make sure that wasn't the reason for the odd behavior.

Note, this is untested! Please save your workbook before trying this!

Justin Self
  • 6,137
  • 3
  • 33
  • 48
  • Not really the answer I needed – niko Aug 24 '11 at 11:42
  • Read my question again the code You have posted I already know it buddy I need to test only 500 entries not all the entries not to -1 – niko Aug 24 '11 at 11:46
  • I edited the answer so you can see how to modify it only to check in a certain range. So From LastRow to 1 (change 1 to the beginning of the new entires), check for duplicates in A1:A whatever (change whatever to the end of the older entries. – Justin Self Aug 24 '11 at 14:40