1

I have a large table with lots of data, but what I'm looking at is six columns of that table - the names of people who went on a particular job together. Something like this:

+-------+--------+--------+-------+--------+-------+
| Name1 | Name2  | Name3  | Name4 | Name5  | Name6 |
+-------+--------+--------+-------+--------+-------+
| Rod   | Jane   |        |       |        |       |
| Jane  | Freddy | Peter  | Paul  |        |       |
| Paul  |        |        |       |        |       |
| Mary  | Jane   | Rod    | Peter | Freddy | Paul  |
| Paul  | Rod    | Freddy |       |        |       |
+-------+--------+--------+-------+--------+-------+

And what I want to end up with is this (on a different sheet):

+--------+
|  Name  |
+--------+
| Rod    |
| Jane   |
| Freddy |
| Peter  |
| Paul   |
| Mary   |
+--------+

I want to be able to identify all the unique entries from those six columns, and then have them populate onto a different sheet. My first thought was to do it with formulae, and that worked (I used INDEX MATCH with a COUNTIF in the MATCH section), but there are 11000ish records in the table and 1200ish different names that could potentially be involved, and it was taking most of the day to process. I figured, hoped, that using VBA would make it run more quickly.

I've looked at a number of possible answers. First, I went here: Populate unique values into a VBA array from Excel , and looked at brettdj's answer (because I kind of understood where it was going), ending up with the following code:

Dim X
Dim objDict As Object
Dim lngRow As Long

Sheets("Data").Select
Set objDict = CreateObject("Scripting.Dictionary")
X = Application.Transpose(Range([be2], Cells(Rows.Count, "BE").End(xlUp)))

For lngRow = 1 To UBound(X, 1)
objDict(X(lngRow)) = 1
Next

Sheets("Crew").Select

Range("A2:A" & objDict.Count) = Application.Transpose(objDict.keys)
End Sub

which worked beautifully, for one column (BE is the Name1 column in the table above - Data is the sheet where the data is stored, Crew is the sheet where I want the unique values to go). But I couldn't for the life of me figure out how to make it take values from multiple columns (BE to BJ).

I then tried this, derived from Jeremy Thompson's answer in Quicker way to get all unique values of a column in VBA? :

Sheets("Data").Select

Range("BE:BJ").AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Sheets("Crew").Range("A:A"), Unique:=True

But again, I couldn't make it put the information from multiple columns into one. Third attempt, I looked at Gary's Student's answer from How to extract unique values from two columns Excel VBA and tried this:

Dim Na As Long, Nc As Long, Ne As Long
Dim i As Long
Na = Sheets("Stroke Data").Cells(Rows.Count, "BE").End(xlUp).Row
Nc = Sheets("Stroke Data").Cells(Rows.Count, "BF").End(xlUp).Row
Ne = 1

For i = 1 To Na
    Cells(Ne, "E").Value = Cells(i, "A").Value
    Ne = Ne + 1
Next i
For i = 1 To Na
    Cells(Ne, "E").Value = Cells(i, "C").Value
    Ne = Ne + 1
Next i

Sheets("Fail").Range("A:A").RemoveDuplicates Columns:=1, Header:=xlNo

(only tried two columns in that one, to see if I could figure it out that way, but no)

I'm really at a loss. As you can probably see from the above, I'm flailing around wildly, and have tried to approach this from three different angles and achieved absolutely nothing. I feel there must be a way to make the first one work, if nothing else, because it nearly worked. But I don't get it.

I suppose that I could run it for four separate columns, and then have a process which combined the four into one. But even then, I'm not sure how I'd remove the duplicates which would result (as you can see in the table above, names can appear in any column).

As long as I can end up with one column with a list of unique names, and it doesn't take hours to process, I suppose I don't really mind how I get there.

TylerH
  • 20,799
  • 66
  • 75
  • 101
RolloTreadway
  • 33
  • 1
  • 5
  • Taking a step back - in your data at the top, you have six columns. Are you basically trying to pull out all unique names from the columns? I can't quite tell how you end up with your expected results. – BruceWayne Jan 08 '19 at 16:34
  • Yes. I want to pull out all the unique values from those six columns. Apologies for not making that clear. – RolloTreadway Jan 08 '19 at 16:37

4 Answers4

1

This will prompt you to select a range (can select a non-contiguous range by holding CTRL), and then will extract unique values from the selected range and output the results on a new sheet:

Sub tgr()

    Dim wb As Workbook
    Dim wsDest As Worksheet
    Dim rData As Range
    Dim rArea As Range
    Dim aData As Variant
    Dim i As Long, j As Long
    Dim hUnq As Object

    'Prompt to select range.  Uniques will be extracted from the range selected.
    'Can select a non-contiguous range by holding CTRL
    On Error Resume Next
    Set rData = Application.InputBox("Select range of names where unique names will be extracted:", "Data Selection", Selection.Address, Type:=8)
    On Error GoTo 0
    If rData Is Nothing Then Exit Sub   'Pressed cancel

    Set hUnq = CreateObject("Scripting.Dictionary")
    For Each rArea In rData.Areas
        If rArea.Cells.Count = 1 Then
            ReDim aData(1 To 1, 1 To 1)
            aData(1, 1) = rArea.Value
        Else
            aData = rArea.Value
        End If

        For i = 1 To UBound(aData, 1)
            For j = 1 To UBound(aData, 2)
                If Not hUnq.Exists(aData(i, j)) And Len(Trim(aData(i, j))) > 0 Then hUnq(Trim(aData(i, j))) = Trim(aData(i, j))
            Next j
        Next i
    Next rArea

    Set wb = rData.Parent.Parent    'First parent is the range's worksheet, second parent is the worksheet's workbook
    Set wsDest = wb.Sheets.Add(After:=wb.Sheets(wb.Sheets.Count))
    wsDest.Range("A1").Resize(hUnq.Count).Value = Application.Transpose(hUnq.Items)

End Sub
tigeravatar
  • 26,199
  • 5
  • 30
  • 38
0

This is a little verbose, but worked for me with your sample data. (May need to tweak how the initial rng is set).

Sub unique_names()
Dim rng As Range
Set rng = ActiveSheet.UsedRange

Dim col As Range, cel As Range
Dim names() As Variant
ReDim names(rng.Cells.Count)

Dim i As Long
i = 0
'First, let's add all the names to the array
For Each col In rng.Columns
    For Each cel In col.Cells
        If cel.Value <> "" Then
            names(i) = cel.Value
            i = i + 1
        End If
    Next cel
Next col

' Now, extract unique names from the array
Dim arr As New Collection, a
Set arr = unique_values(names)
For i = 1 To arr.Count
   Worksheets("Sheet1").Cells(i, 10) = arr(i)
Next

End Sub
Private Function unique_values(iArr As Variant) As Collection
' https://stackoverflow.com/a/3017973/4650297
Dim arr As New Collection, a
On Error Resume Next
  For Each a In iArr
     arr.Add a, a
  Next

Set unique_values = arr

End Function
BruceWayne
  • 22,923
  • 15
  • 65
  • 110
  • This worked very well - thank you!. For future reference - in case I try to do this again, which I'll probably have to - is there anything I can add to limit it to certain columns within the spreadsheet (for instance, if I just wanted to use two of the columns)? (I feel certain I should know the answer to this but I've been at work a loooong time.) – RolloTreadway Jan 08 '19 at 17:05
  • @RolloTreadway you can do this a bunch of ways. A quick one is if you know which two you could do `Set rng = Range("D1:E100")`. Or let the user pick, look up "VBA let user select range" or something like that. – BruceWayne Jan 08 '19 at 17:31
  • Thanks! (And apologies for slow response.) – RolloTreadway Jan 11 '19 at 11:35
0

Here's an approach using a dictionary. Just specify the range you want to search, and the RangeToDictionary function should do the rest. I'm assuming you don't want to include blanks, so I removed those.

Private Function RangeToDictionary(MyRange As Range) As Object
    If MyRange Is Nothing Then Exit Function
    If MyRange.Cells.Count < 1 Then Exit Function

    Dim cell  As Range
    Dim dict  As Object: Set dict = CreateObject("Scripting.Dictionary")

    For Each cell In MyRange
        If Not dict.exists(Trim$(cell.Value2)) And Trim$(cell.Value2) <> vbNullString Then dict.Add cell.Value2, cell.Value2
    Next

    Set RangeToDictionary = dict
End Function

Sub Example()
    Dim dict       As Object
    Dim rng        As Range:Set rng = ThisWorkbook.Sheets("Sheet1").Range("A1:f5")
    Dim outsheet   As Worksheet:Set outsheet = ThisWorkbook.Sheets("Sheet2")

    Set dict = RangeToDictionary(rng)

    outsheet.Range(outsheet.Cells(1, 1), outsheet.Cells(dict.Count, 1)) = Application.Transpose(dict.items())
End Sub
Ryan Wildry
  • 5,612
  • 1
  • 15
  • 35
  • This is also very useful, thanks. Something confuses me though - I blame being tired, but I put in the wrong range, and when I went back and tried to run it again I get an error: 'the key is already associated with an element of this collection' on the 'Then dict.Add cell.Value2, cell.Value2' part. Do you have any idea why? I'm sure it's me getting something wrong. – RolloTreadway Jan 08 '19 at 17:10
  • Shouldn't happen with the code posted. The `Not dict.Exists` portion should handle that. – Ryan Wildry Jan 08 '19 at 17:19
0

Assuming you have Excel 2016 and up, you can do this with Power Query. Convert your data range to a table, select a cell within the table, select "From Table" in Data > Get & Transform and then paste the following code in the Advanced Editor of the Power Query Editor (changing Table3 to whatever your table name ends up being).

let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name1", type text}, {"Name2", type text}, {"Name3", type text}, {"Name4", type text}, {"Name5", type text}, {"Name6", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type"," ","",Replacer.ReplaceText,{"Name1", "Name2", "Name3", "Name4", "Name5", "Name6"}),
    #"Added Custom" = Table.AddColumn(#"Replaced Value", "Text.Combine", each Text.Combine({[#"Name1"],[#"Name2"],[#"Name3"],[#"Name4"],[#"Name5"],[#"Name6"]},";")),
    #"Replaced Value1" = Table.ReplaceValue(#"Added Custom",";;","",Replacer.ReplaceText,{"Text.Combine"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value1", {{"Text.Combine", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Text.Combine"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Text.Combine", type text}}),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type1", {"Text.Combine"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each ([Text.Combine] <> "")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Name1", "Name2", "Name3", "Name4", "Name5", "Name6"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Text.Combine", "UniqueList"}})
in
    #"Renamed Columns"
MBB70
  • 375
  • 2
  • 16
  • I just realized that removing the consecutive semi-colons is not necessary as the blanks are filtered out later...just right-click delete in the PQE and the code will auto-adjust. – MBB70 Jan 08 '19 at 22:12
  • Unfortunately, I'm stuck with 2013, but thanks anyway! – RolloTreadway Jan 11 '19 at 12:39
  • It's an add-in for 2013. If you do a lot of work like this, I would give Power Query a serious look. Oftentimes, it is much more efficient than setting up worksheet formulae and looping through your data with VBA. You can also automate the creation of your queries with VBA. The only drawback I can think of is that some IT managers detest Excel add-ins because they sometimes create snafus. You'll have a serious argument for upgrading Office once you demonstrate how useful this tool can be though. – MBB70 Jan 11 '19 at 17:37