0

I currently have an Excel data set showing products sold at various times and how the sales person designated the product class of the item sold.

Product#    Product Class

10001       Hardware
20002       Software
30003       Misc.
10001       Hardware
10001       Software
20002       Software
10001       Hardware
30003       Misc.

You'll notice that for Product# 10001 (which should be classified as "Hardware"), a sales person mistakenly designated one of the sales as "Software."

Thus, if I filter by Product# 10001, I will get 2 different Product Classes. I am trying to find out which product# has more than 1 Product Classes for about 30,000 different products so that they can be fixed.

How can I go through the data to indicate which product numbers need to be fixed to only have 1 Product Class?

I am expecting only a handful of cases where this multiple class-designation problem to exist, so once I can find out which product # has the issue, it can be fixed manually.

Thanks!

Community
  • 1
  • 1
djkim0403
  • 3
  • 2

3 Answers3

1

You can use this formula =IF(COUNTIF($A$3:$A$1000,A3)-COUNTIFS($A$3:$A$1000,A3,$B$3:$B$1000,B3)>0,"Wrong","OK") in cell C3 . Adjust range to your needs and drag formula to the bottom.

enter image description here

Kresimir L.
  • 2,301
  • 2
  • 10
  • 22
1

You can use an object known as a dictionary. A dictionary has a key and a corresponding value. The way it stores the key is implemented so that it is quick to find out if a particular key exists in the dictionary, which will be helpful in checking for duplicate product codes.

For simplicity (and because I have little experience of VBA in Excel), I have assumed the data to check is in columns 1 and 2 of Sheet(1):

Option Explicit

' Add reference to get Dictionary: http://www.techbookreport.com/tutorials/vba_dictionary.html
' Excel VBA- Finding the last column with data: https://stackoverflow.com/a/11927387

Sub FindDuplicates()
    Dim dict As Dictionary
    Set dict = New Dictionary

    Dim ws As Worksheet
    Set ws = Sheets(1)

    Dim rLastCell As Range
    Set rLastCell = ws.Cells.Find(What:="*", After:=ws.Cells(1, 1), LookIn:=xlFormulas, LookAt:= _
                                  xlPart, SearchOrder:=xlByRows, SearchDirection:=xlPrevious, MatchCase:=False)

    Dim key As String
    Dim val As String
    Dim dupes As String
    Dim i As Long
    ' use maxFinds to limit the number of duplicates found
    Dim maxFinds As Integer
    maxFinds = 32
    Dim nFound As Integer
    nFound = 0

    For i = 1 To rLastCell.Row
        key = Cells(i, 1).Value
        val = Cells(i, 2).Value
        If dict.Exists(key) Then
            If dict(key) <> val Then
                dupes = dupes & "Row: " & i & " Class: " & val & vbCrLf
                nFound = nFound + 1
            End If
        Else
            dict.Add key, val
        End If
        If nFound = maxFinds Then
            Exit For
        End If
    Next

    If nFound = 0 Then
        MsgBox ("No duplicates found.")
    Else
        MsgBox (dupes)
    End If

End Sub

The dictionary object isn't built-in to Excel, so you need to add a reference to Microsoft Scripting Runtime through "Tools" menu -> "References...".

I had created a test file with 50,000 rows and quite a lot of duplicates which is why the code ended up with a facility to limit the number of duplicates found, so you could set maxFinds = 10 and go through those, then run the macro again to find another ten, and so on. Also, if there are many more than 32-ish then they don't fit in the message box anyway.

It assumes that the first occurence of a "Product Class" (or the value in column 2 in the above code example) is the correct one.

Example output:

enter image description here

Andrew Morton
  • 24,203
  • 9
  • 60
  • 84
0

You can use this array formula (click Ctrl + Shift + Enter together) to find out which one needs to be taken care of:

=IF(IFERROR(VLOOKUP(A2,$A$1:A1,1,0),0)>0,IF(ISTEXT(VLOOKUP(A2&B2,$A$1:A1&$B$1:B1,1,0)),"","Dup"),"")

Formula from cell C2 and drag down.

Then just do a filter with the wording Dup. This should work for you.

ian0411
  • 4,115
  • 3
  • 25
  • 33
  • Hey Thanks for the response! However, I am not trying to find duplicates, but rather only the instances where a product number has more than one product class designations, and then ignore all other cases. In your example, product 30003 has "2" duplicates, but it has only 1 product class so it should be ignored – djkim0403 Oct 06 '17 at 17:12
  • I just revised my answer and hopefully this is what you are looking for. – ian0411 Oct 06 '17 at 17:52