12

I have an Excel database and I'm trying avoid doing some manual combining of duplicate data. I've got a bunch of listings that are essentially the same aside from the tags column. What I'd like to have it do is combine these 5 listings into 1 listing, making the categories a comma separated list in a single cell.

Turn this

Data Example

into this

Desired result

Is there any way of achieving this? My document has a couple thousand listings, so I'm obviously trying to avoid the manual edit route. I'm an Excel novice, so any hand holding or tutorials you could point me to would be appreciated.

Community
  • 1
  • 1
jbwharris
  • 710
  • 1
  • 10
  • 30
  • super easy if your list is already sorted by the first column (e.g., so all "Canadian Tire Associate Stores" are grouped together) and only marginally more difficult if they're not sorted. Store the listings in a 2d array, and just build out the comma-separated list in the second column of the array. When finished, write out to a new sheet/etc. – David Zemens Apr 02 '13 at 01:41
  • @DavidZemens I'm not sure how to go about doing that? Can that be done through Excel, or is that done programatically? I'm a designer trying to fudge my way through the data management issues I'm running into, so I'm really not sure how to go about doing that. – jbwharris Apr 02 '13 at 02:35
  • Yes, this can be done using VBA (Visual Basic for Applications) which is probably the easiest way to programmatically "communicate" with Excel. Nick's suggested answer, below, should be more or less functionally equivalent to what I described and is another good method for this type of problem (earlier today I suggested [this sort of solution](http://stackoverflow.com/questions/15748138/convert-list-of-items-in-an-excel-table-to-comma-separated-string/15749038#15749038) on another question). – David Zemens Apr 02 '13 at 02:51

2 Answers2

31

This can also be done using formulas. For my example to work, the data would need to be sorted by the first column and there would need to be a header row.

You would need two more columns (C & D). First, add a formula that essentially says to concatenate the data in column B if data in column A is the same as the row above it, otherwise reset the concatenation. The next column would contain a formula to identify the final concatenations so you can sort later.

This is how I would do it with listings and categories in columns A & B (again, the data would need to be sorted by column A and there would need to be a header row): enter image description here

Here's the results. Now I would copy the entire range and paste values into another sheet. The rows with zero for column D is what I'd want to use. Sorting by column D would float them to the top. enter image description here

Dave Thornton
  • 581
  • 3
  • 7
  • Okay, this looks closest to what I'm trying to do and aligns with my knowledge of Excel. I tried modifying the formula to =IF(A2=A1,L1&", "&I2,I2+L3) and it's telling me I have a circular reference. I have bunch more rows of data, but they are all duplicate data except the category column. Would that mess things up? – jbwharris Apr 02 '13 at 15:04
  • Let me try to break down part of the if statement better. For true: concatenate the cell directly above (the cumulative concatenation) with the category in the current row (I assume I2). For false: no concatenation, just reference the category in the current row (I2). I think L3 is causing the circular reference. Does this formula work? =IF(A2=A1,L1&", "&I2,I2) – Dave Thornton Apr 02 '13 at 15:40
  • Perfect, that's doing exactly what I need it to do. Thanks so much! – jbwharris Apr 02 '13 at 17:12
3

This will (should) generate a new sheet from your source sheet with the duplicates concatenated.

To use the following code you need to add it to a new module in the VBA Editor

A Shortcut to open the VBA Editor is Alt+F11 (for Windows) and Alt+Fn+F11 (for Mac)

Once the Editor is open add a new module by selecting it from the "insert" menu in the main menu bar. It should automatically open the module ready to accept code, If not you need to select it (will be named "ModuleN" where N is the next available number) from the project explorer.

I'm not sure if the "Scripting.Dictionary" is available in osx, but it cant hurt to try.

Option Explicit

Sub Main()
Dim Source As Worksheet: Set Source = ThisWorkbook.Worksheets("Sheet1")
Dim Destination As Worksheet: Set Destination = ThisWorkbook.Worksheets("Sheet2")

Dim Records As Object: Set Records = CreateObject("Scripting.Dictionary")

Dim Data As Variant
Dim Index As Long
Dim Row As Integer: Row = 1

Data = Source.Range("A1", "B" & Source.Rows(Source.UsedRange.Rows.Count).Row).Value2

For Index = LBound(Data, 1) To UBound(Data, 1)
    If Records.Exists(Data(Index, 1)) Then
        Destination.Cells(Records(Data(Index, 1)), 2).Value2 = Destination.Cells(Records(Data(Index, 1)), 2).Value2 & ", " & Data(Index, 2)
    Else
        Records.Add Data(Index, 1), Row
        Destination.Cells(Row, 1).Value2 = Data(Index, 1)
        Destination.Cells(Row, 2).Value2 = Data(Index, 2)
        Row = Row + 1
    End If
Next Index

Set Records = Nothing

End Sub
NickSlash
  • 4,758
  • 3
  • 21
  • 38
  • As I mentioned, I'm no ace with Excel. Where do I go to run that code? – jbwharris Apr 02 '13 at 01:56
  • I just checked and there wasn't a Module option under Option. I'll give it a try at work tomorrow where I have Office 2007. Thanks. – jbwharris Apr 02 '13 at 02:38
  • Found this while doing a search and confirm that this works for what I needed. Much easier to use then a formula if you know how to insert into a module and adjust the sheet name. – Peter Nguyen Jan 05 '21 at 17:30