1

I need to do many Regex replacements (~ 100 currently, but the list will grow) on a range of cells (varies, but up to 4 or 5 digit cell count).
Currently, my working draft is to loop through all cells repeatedly for each pattern, but obviously that's many loops.

Ideally, I'd call something like (pseudocode):

Sheet.Range("A1:G1000").RegexReplace(pattern, replacement)

However, the nearest thing is Range.Replace which only mentions "The string you want Microsoft Excel to search for".

The list of Regex.Replace overloads does not mention anything related to cells or ranges.

So, since Range.RegexReplace seems to be out - is there a more efficient way to replace many patterns in many cells than to loop through each pattern, row and column?

Imanuel
  • 3,596
  • 4
  • 24
  • 46
  • Short answer, Nope. RegExp and Excel are two different worlds, the former was not designed for the latter, unfortunately. – A.S.H May 08 '17 at 15:57
  • Thought so. Thanks nonetheless. – Imanuel May 08 '17 at 15:57
  • 1
    Long answer - Convert range to string and replace there ... http://stackoverflow.com/questions/41777996/how-can-i-convert-a-range-to-a-string-vba – Vityata May 08 '17 at 15:58
  • 5
    I would personally dump both (range & your collections of regex patterns) to arrays and loop through them both. INF faster than going through the sheets. IF you work in arrays even though your iterating x by y itll be super fast – Doug Coats May 08 '17 at 16:00
  • And exactly what do you think `Range("A1:G1000").RegexReplace` would be doing with every single cell in that range? – Mathieu Guindon May 08 '17 at 16:10
  • Sure it would have to run the regex on each cell, but I would assume the function to be better optimized than a simple loop through the cells (of course, @DougCoats has given a hint to improve this). The same thing applies to `Range.Replace`, which exists. – Imanuel May 08 '17 at 16:17

1 Answers1

3

Don't iterate cells. Whether you're writing VBA, C#, or VB.NET, if you're working against Range objects in nested loops you're doing the single slowest thing you could possibly do with the Excel object model.

Work against an array instead - you need a function like this in your toolbox:

Public Function ToArray(ByVal target As Range) As Variant
    Select Case True
        Case target.Count = 1
            'singe cell
            ToArray = Array(target.Value)
        Case target.Rows.Count = 1
            'horizontal 1D range
            ToArray = Application.WorksheetFunction.Transpose(Application.WorksheetFunction.Transpose(target.Value))
        Case target.columns.Count = 1
            'vertical 1D range
            ToArray = Application.WorksheetFunction.Transpose(target.Value)
        Case Else
            '2D array: let Excel to the conversion itself
            ToArray = target.Value
    End Select
End Function

Now you iterate an in-memory array of values (with For loops) and for each value you iterate a number of Regex.Replace calls - cache and reuse the Regex objects as much as possible, so you're not re-creating the same objects over and over and over for thousands of values.

Once you've traversed the entire array, dump it into the worksheet (resize and transpose as needed), and voilà - you've instantly rewritten thousands of cells in a single operation.

Mathieu Guindon
  • 69,817
  • 8
  • 107
  • 235
  • @MatsMug isnt that just a prettier looking chunk of text? lol also obliguhtory typo just for you – Doug Coats May 08 '17 at 16:40
  • Thank you! I didn't use your function, though. I used [RangeToStringArray by Simon Cowen](http://stackoverflow.com/questions/6773232/excel-vba-range-to-string-array-in-1-step) since I only do string operations. – Imanuel May 08 '17 at 17:10
  • @Pharaoh nested loops to convert the `Variant/String` to a `String` kinda defeat the purpose, but hey your call. – Mathieu Guindon May 08 '17 at 17:15
  • My assumption is that I have to do many loops (once for each pattern) to iterate the array anyways. If the string processing is faster with plain strings (assumption, too), I would bet my money on preprocessing the array and work with a string array. Please correct me if I'm wrong, though. – Imanuel May 08 '17 at 17:18
  • You'll want an explicit `CStr` conversion at one point, indeed. But I'd rather have that conversion happen as late as possible, when I'm ready to regex-replace the would-be-String value, so that I only need to iterate the entire array *once*. Iterating the array *twice* to save on implicit Variant-to-String conversion doesn't seem right, especially since you're paying the conversion price *anyway* – Mathieu Guindon May 08 '17 at 17:21
  • Sounds reasonable, yes. Your remark *"iterate the entire array once"* gave me the idea to prepare all regex objects in an array first and iterate this array for each cell instead of iterating the whole cell array once per pattern. – Imanuel May 08 '17 at 17:40
  • That's the spirit! – Mathieu Guindon May 08 '17 at 17:41