1

Complicated problem... let me explain first, maybe there is a better solution rather than using iterative calculations:

(Link to Workbook)

Image showing example (to show what I'm working with)

PROBLEM:

Have 4,000+ Strings and would like to categorize them into pre-determined Groups (based on the String's content).

  1. Each String should only be assigned to one Group. (ie. "55 gallon drum faucet" will be listed under "faucet" column, since it contains the word "faucet".)

  2. Once categorized into a Group, the string won't be categorized under any other Groups. (ie. "55 gallon drum faucet" won't be categorized under "drum" once it's already been categorized under "faucet").

  3. It really doesn't matter which Group each String does under, as long as it's categorized.

Note: (I've almost found a solution using iterative calculation, but it doesn't quite work).

SOLUTION:

The way I approached the problem was:

  1. Count the number of times the String (Column A) was duplicated in the worksheet using the formula:

     Formula: =COUNTIF($E$2:$IA$10000,A3)
    
    • This formula goes in Column C.
  2. Created a formula that will categorize a String underneath a Group based on whether the String contains the Group word (ie. "faucet", "beer", "gallon", "kitchen", etc)... AND has not been used before (ie. Column C, which contains the formula from above).

      Formula: =IF(C3<1,IF(IFERROR(SEARCH("faucet",A3),0)>0,A3,""),"")
    
  3. Drag formula down for all 4,000 strings in Column C, and for each individual "Group" column.

The problem with this approach is that it will do an iterative calculation which will either:

  1. Categorize the String under a Group (but WON'T increase the Times Dup'd field from 0 to 1)...

OR

  1. Increase the "Times Dup'd" field from 0 to 1 BUT will keep the String from being categorized under the Groups column.

Any suggestions on how to address the iterative calculation issue? (I know it keeps calculating back and forth since it's dependent, so will have to settle with 1 "right" solution... I'm wondering if there's any way to create some sort of 'block' so it can only be calculated one way...)

Any help would be greatly appreciated!

sikorloa
  • 101
  • 13

2 Answers2

2

Run this procedure through your data. It performs all processing within a pair of variant arrays.

Sub byGroup()
    Dim g As Long, s As Long, aSTRs As Variant, aGRPs As Variant

    appTGGL bTGGL:=False

    With Worksheets("Sheet1")
        aSTRs = .Range(.Cells(2, 1), .Cells(Rows.Count, 1).End(xlUp)).Value2
        With .Range(.Cells(1, 5), .Cells(Rows.Count, 1).End(xlUp).Offset(0, Application.Match("zzz", .Rows(1)) - 1))
            .Resize(.Rows.Count, .Columns.Count).Offset(1, 0).ClearContents
            aGRPs = .Cells.Value2
        End With

        For s = LBound(aSTRs, 1) To UBound(aSTRs, 1)
            For g = LBound(aGRPs, 2) To UBound(aGRPs, 2)
                If CBool(InStr(1, aSTRs(s, 1), aGRPs(1, g), vbTextCompare)) Then
                    aGRPs(s + 1, g) = aSTRs(s, 1)
                    Exit For
                End If
            Next g
        Next s

        .Cells(1, 5).Resize(UBound(aGRPs, 1), UBound(aGRPs, 2)) = aGRPs

    End With

    appTGGL
End Sub

Public Sub appTGGL(Optional bTGGL As Boolean = True)
    Debug.Print Timer
    Application.ScreenUpdating = bTGGL
    Application.EnableEvents = bTGGL
    Application.DisplayAlerts = bTGGL
    Application.Calculation = IIf(bTGGL, xlCalculationAutomatic, xlCalculationManual)
End Sub

Elapsed time (excluding your worksheet formula recalculation should be in the 1-2 second range.

Priority of the matching groups is left to right. If you think '55 gallon drum' should be grouped as drum and not gallon then make sure that drum comes before gallon in row 1.

Saving the new macro-enabled workbook as an Excel Binary Workbook (.XLSB) reduces the workbook file size by roughly half.

  • Thank you Jeeped, that worked amazingly!!! I really appreciate the help, and will be studying this code to understand it! – sikorloa Jan 20 '16 at 03:25
  • do you know how I could adjust the code to make the script run for whatever worksheet it's in? I believe I have to add something like this, but I haven't gotten it to work: 'Dim wksData As Worksheet Set wksData = ActiveSheet (Line 8) With Worksheets = wksData ' – sikorloa Jan 20 '16 at 16:34
  • The phrase **' whatever worksheet it's in'** is a bad thing. I don't like to write code that relies on the ActiveSheet property. With that said, I wrote the code to examine the worksheet it is processing for scope; there are no hard-coded number of rows or columns. I'd suggest renaming the `With Worksheets("Sheet1")` and testing on another worksheet with a similar layout (doesn't have to be identical). If it doesn't produce satisfactory results, post a new question lest this one become a [Russian Doll Q&A](http://meta.stackexchange.com/questions/188625/etiquette-for-russian-doll-questions). –  Jan 20 '16 at 16:42
  • Thank you Jeeped, I will play around with renaming 'With Worksheets("Sheet1")' and if I can't figure it out will post a new question. – sikorloa Jan 20 '16 at 17:00
2

I was working on something and Jeeped beat me to the answer. I tried Jeeped's code, but was getting multiple group entries for some of the strings. Here is the code I was working on if it's of any value at this point:

Sub sikorloa()

Dim r As Integer
Dim c As Integer
Dim LastRow As Integer
Dim LastCol As Integer
Dim strng As String
Dim grp As String

Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False

LastRow = Range("A" & Rows.Count).End(xlUp).Row
LastCol = Cells(1, Cells.Columns.Count).End(xlToLeft).Column

For r = 3 To LastRow
    If Cells(r, 1).Value <> "" Then
        strng = Cells(r, 1).Value
        For c = 5 To LastCol
            grp = Cells(1, c).Value
            If InStr(strng, grp) > 0 Then
                Cells(r, c).Value = Cells(r, 1).Value
                Exit For
            End If
        Next c
    End If
Next r

Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True

End Sub
SincereApathy
  • 685
  • 1
  • 7
  • 19
  • Thank you! Your narrative got me to re-examine my logic and I realized I had the order of he nested loops incorrect. Edited now. –  Jan 18 '16 at 04:00
  • Thank you SincereApathy for the input and catching the multiple group entries for some of the strings, much appreciated! – sikorloa Jan 20 '16 at 03:26