0

In a first Excel File multiple Cells in Column C contains the address and the name of a company; I want to keep only the company name. For that, I have another Excel file (I'll call it "Dictionary"), which has a particular structure like the following:

Column B : Name that I want to keep.
Column C : Various Patterns of the name, delimited with ";".
Example : B1 = "Sony", C1="Sony Entertainement;Sony Pictures;Playstation"

I need VBA macro reading Dictionary File, then for each pattern (surrounded with anything) replace it with the word I want to keep.

My macro would look like :

Sub MacroClear()

   <For each line of my dictionnary>
        arrayC = split(<cell C of my line>, ";")
        <For i in range arrayC>
           Cells.Replace What:="*"&Trim(arrayC(i))&"*", Replacement:=Trim(<cell B of my line>), LookAt:= _
              xlPart, SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
              ReplaceFormat:=False
End Sub

EDIT - UPDATE : I made a capture of my first Dictionary, it'll be easier to understand the structure :

dictionnary http://img11.hostingpics.net/pics/403257dictionnary.png

EDIT - UPDATE 2 : I made a screen cap of a "non-cleaned" file, then the result I want at the end.

Not cleaned : noclean http://img11.hostingpics.net/pics/418501notcleaned.png

Cleaned :     clean http://img11.hostingpics.net/pics/221530cleaned.png

PS : I know my macro as it is would analyze all the cells of my worksheet, is it possible "easily" to tell her to ignore column A ?

EDIT - UPDATE 3 : My macro runs well with small dictionaries, but when it grows bigger, my macro doesn't stop running and I have to close excel with Ctrl + Alt + Suppr. :x Is there a way to tell her to stop when reaching a point ?

For example, using xlByRows and writing "END" at the first cell after my last row.

R3uK
  • 14,417
  • 7
  • 43
  • 77
Malik
  • 207
  • 1
  • 2
  • 14
  • 1
    Please clarify your business logic: do you want the B2 content replacement based on the match found in B1 in the same Worksheet, like "Sony Entertainement;Sony Pictures;Playstation" to become just "Sony"? Thanks and regards, – Alexander Bell Jun 10 '15 at 14:36
  • @AlexBell I've made a mistake, the "B2" I wrote = C1 actually ^^' I'm gonna edit my post to correct this. Yes, I want "Sony Entertainement", "Sony Pictures" and "Playstation" (C1) to be replaced by "Sony" (B1). Here is the [start of my first dictionnary](http://img11.hostingpics.net/pics/403257dictionnary.png). Hope it helped you ! Regards, – Malik Jun 10 '15 at 14:47
  • please show us an example of your expected output after the macro is run – Brino Jun 10 '15 at 15:05
  • @Brino I edited my question for that. ;) – Malik Jun 11 '15 at 08:50

2 Answers2

1

Based on your clarification, you can complete this task using Excel Formula like, for example =IF(ISERROR(SEARCH(B1,C1)),C1,B1) entered in cell D1 (returns "Sony" as per your sample data):

B           C                                               D
Sony        Sony Entertainement;Sony Pictures;Playstation   Sony
Panasonic   Panasonic Corporation; Matsushita               Panasonic
Samsung     Samsung Group;SamsungGalaxy;SamsungApps         Samsung

You can extend the Formula to entire Range, so column D will display the "clean" trimmed data. Also, you can automate this procedure via Excel VBA upon necessity.

NOTE: Pertinent to the 2nd answer posted, which include VBA iteration, you can use similar VBA formula using VBA InStr() function instead of Split() and Replace(), like:

For i = 1 To wsD.Range("C" & wsD.Rows.Count).End(xlUp).Row
    For k = 1 To wsC.Range("C" & wsC.Rows.Count).End(xlUp).Row
      If (InStr(wsC.Cells(k,3).Value, wsD.Cells(i,2).Value)>0 Then 
          'you can assign the value to the Cell in Column C: wsC.Cells(k,3) 
          wsC.Cells(k,4) = wsD.Cells(i,2)  
      End If
    Next k
Next i

Hope this may help.

Alexander Bell
  • 7,842
  • 3
  • 26
  • 42
1

This is the literal translation of what you shown :

Sub MacroClear()

Dim wbD As Workbook, _
    wbC As Workbook, _
    wsD As Worksheet, _
    wsC As Worksheet, _
    Dic() As String
'Replace the names in here with yours
Set wbD = Workbooks("Dictionnary")
Set wbC = Workbooks("FileToClean")
Set wsD = wbD.Worksheets("Name1")
Set wsC = wbC.Worksheets("Name2")

For i = 1 To wsD.Range("C" & wsD.Rows.Count).End(xlUp).Row
    Dic = Split(wsD.Cells(i, 3), ";")
    For k = 1 To wsC.Range("C" & wsC.Rows.Count).End(xlUp).Row
       Cells.Replace What:=Trim(Dic(i)), _
            Replacement:=Trim(wsD.Cells(i, 2)), _
            LookAt:=xlPart, _
            SearchOrder:=xlByRows, _
            MatchCase:=False, _
            SearchFormat:=False, _
            ReplaceFormat:=False
    Next k
Next i

Set wbD = Nothing
Set wbC = Nothing
Set wsD = Nothing
Set wsC = Nothing

End Sub

And the updated version :

Sub MacroClear()

Dim wbD As Workbook, _
    wbC As Workbook, _
    wsD As Worksheet, _
    wsC As Worksheet, _
    DicC() As Variant, _
    Dic() As String, _
    ValToReplace As String, _
    IsInDic As Boolean, _
    rCell As Range

'Replace the names in here with yours
Set wbD = Workbooks.Open("D:\Users\maw\Documents\resources\Dict.xlsx", ReadOnly:=True)
Set wbC = Workbooks("TestVBA")
Set wsD = wbD.Worksheets("Name1")
Set wsC = wbC.Worksheets("Name2")
'Set global dictionnary dimension
ReDim DicC(1, 0)

For i = 1 To wsD.Range("C" & wsD.Rows.Count).End(xlUp).Row
    Dic = Split(wsD.Cells(i, 3), ";")
    ValToReplace = Trim(wsD.Cells(i, 2))
    For k = LBound(Dic) To UBound(Dic)
        IsInDic = False
        For l = LBound(DicC, 2) To UBound(DicC, 2)
            If LCase(DicC(1, l)) <> Trim(LCase(Dic(k))) Then
                'No match
            Else
                'Match
                IsInDic = True
                Exit For
            End If
        Next l
        If IsInDic Then
            'Don't add to DicC
        Else
            DicC(0, UBound(DicC, 2)) = Trim(Dic(k))
            DicC(1, UBound(DicC, 2)) = ValToReplace
            ReDim Preserve DicC(UBound(DicC, 1), UBound(DicC, 2) + 1)
        End If
    Next k
Next i

ReDim Preserve DicC(UBound(DicC, 1), UBound(DicC, 2) - 1)
wbD.Close
Erase Dic


For Each rCell In wsC.Range("C2:C" & wsC.Range("C" & wsC.Rows.Count).End(xlUp).Row).End(xlUp).Row
    For l = LBound(DicC, 2) To UBound(DicC, 2)
        If InStr(1, rCell.Value2, DicC(0, l)) <> 0 Then
            rCell.Value2 = DicC(1, l)
        Else
            'Not found
        End If
    Next l
Next rCell


Erase DicC
Set wbD = Nothing
Set wbC = Nothing
Set wsD = Nothing
Set wsC = Nothing

End Sub
R3uK
  • 14,417
  • 7
  • 43
  • 77
  • Hi, thank you for your answer ! However it doesn't work, I've got an "error '9' ". I've opened my file `TestVBA.xlsm`, which is in `D:\Users\maw\Desktop\excel\` and saved your macro. My dictionnary was in the same folder and named `Dict.xlsx`, so I just renamed the 2 "Workbooks" by the name of the file, but still doesn't work. Ideally my dictionnary would be closed, and the path would be : `D:\Users\maw\Documents\resources\Dict.xlsx`. Thank you ! – Malik Jun 11 '15 at 08:34
  • Ok try the updated version, (don't forget to rename the Sheets as I don't know what their names are (here `Name1` and `Name2`)) which should open your Dictionnary and close it after (so don't open it or it may throw an error) – R3uK Jun 11 '15 at 08:59
  • When I try it I have an error 9 at the line : `Set wbC = Workbooks("TestVBA")`, so I replaced "TestVBA" with "TestVBA.xslm", and now I have an error 9 at : `If LCase(DicC(1, l)) <> Trim(LCase(Dic(k))) Then`. I have made a [screen capture](http://img11.hostingpics.net/pics/559233error9.png), maybe it'll be clearer. ^^ – Malik Jun 11 '15 at 09:12
  • Ok my bad, I forgot to specify which dimension in the L_Ubound of DicC which has 2 dim, so it should work now! :) – R3uK Jun 11 '15 at 09:16
  • Now it works quite good ! :D The problem is that it doesn't replace the entire content of the cell when it matches, but only the expression itself. Example : `The new Sony Pictures movie` is now replaced by `The new Sony movie`, but I just want to hold `Sony` in the cell. Is that possible ? :D – Malik Jun 11 '15 at 09:26
  • Here you go! ;) It was just the Worksheet function replace that wasn't adapted ;) – R3uK Jun 11 '15 at 10:21
  • OK I solved the problem, I replaced the line `Cells.Replace What:=Trim(DicC(0, l)), _` by `Cells.Replace What:="*" & Trim(DicC(0, l)) & "*", _`. Now I have 2 minor things : 1- Can I tell the macro just to "ignore" the A column of my "TestVBA" file ? 2- I have some patterns like "EU" ==> European Union, but the problem is that when EU is in a word (example : "deuce"), it counts it as a match. Basically in Java I'd use a regex like : "[^a-zA-Z]"+Trim(word)+"[^a-zA-Z]", what is the equivalent in VBA ? – Malik Jun 11 '15 at 10:27
  • If you use my code, it'll work properly for your column C. For regex VBA : http://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops . For your other issue : `LookAt:=xlWhole` and `SearchOrder:=xlByColumns` in the replace that you posted first! – R3uK Jun 11 '15 at 12:25
  • I tried your solution with xlWhole and xlColumn, but it still takes into account my A column, is there a simple way to tell my macro "Don't touch to my column A", or should I re-write all the Replace field ? – Malik Jun 12 '15 at 14:02
  • To increase efficiency, use a Scripting Dictionary instead of the array DicC(), start with `Dim DicC` then `DicC = CreateObject("Scripting.Dictionnary")` and the for the rest, you'll find details all over the web, give it a try. If you don't succeed, post another question with a link to this one and my code with your replacements using the dictionnary so that people will take a look at the code to help you make it work! – R3uK Jun 29 '15 at 13:53
  • Haha I had just forgotten to remove the first "For" loop, so it treated the whole document for every line of my excel file >< Thank you very much anyway :p – Malik Jun 30 '15 at 12:06