I'm working on a VBA routine to import stuff from Excel into Sparx Enterprise Architect.
One of the challenges is to get the formatted text (bold, italic and underline) translated. EA uses some kind of html like formatting tags for it's text. So this:
this text has bold and italic
has to be translated to this:
this text has <b>bold</b> and <i>italic</i>
If found this routine on in another question that I slightly modified to fit my need. It does exactly what I need, but it is excruciating slow
'-------------------------------------------------------------
' Author: Geert Bellekens (copied from stackoverflow: https://stackoverflow.com/questions/29916992/extract-text-content-from-cell-with-bold-italic-etc)
' Date: 02/09/2019
' Description: Returns a html formatted string for the (formatted) text in a cell
'-------------------------------------------------------------
Public Function getHTMLFormattedString(r As range) As String
Dim startTimeStamp As Double
startTimeStamp = Timer
Dim isBold As Boolean
Dim isItalic As Boolean
Dim isUnderlined As Boolean
isBold = False
isItalic = False
isUnderlined = False
Dim text As String
text = ""
Dim cCount As Integer
cCount = 0
Dim modifiers As New Collection
On Error Resume Next
cCount = r.Characters.Count
On Error GoTo 0
If cCount > 0 Then
For i = 1 To cCount
Set c = r.Characters(i, 1)
If isBold And Not c.Font.Bold Then
isBold = False
text = removeModifier("b", text, modifiers)
End If
If isItalic And Not c.Font.Italic Then
isItalic = False
text = removeModifier("i", text, modifiers)
End If
If isUnderlined And c.Font.Underline = xlUnderlineStyleNone Then
isUnderlined = False
text = removeModifier("u", text, modifiers)
End If
If c.Font.Bold And Not isBold Then
isBold = True
text = addModifier("b", text, modifiers)
End If
If c.Font.Italic And Not isItalic Then
isItalic = True
text = addModifier("i", text, modifiers)
End If
If Not (c.Font.Underline = xlUnderlineStyleNone) And Not isUnderlined Then
isUnderlined = True
text = addModifier("u", text, modifiers)
End If
text = text & c.text
If i = cCount Then
text = closeAllModifiers(text, modifiers)
End If
Next i
Else
text = r.text
If r.Font.Bold Then
text = "<b>" & text & "</b>"
End If
If r.Font.Italic Then
text = "<i>" & text & "</i>"
End If
If Not (r.Font.Underline = xlUnderlineStyleNone) Then
text = "<u>" & text & "</u>"
End If
End If
'replace newline with CRLF
text = Replace(text, Chr(10), vbNewLine)
'return
getHTMLFormattedString = text
'get processingtime
MsgBox "processed " & Len(text) & " characters in " & Format(Timer - startTimeStamp, "00.00") & " seconds"
End Function
I tested this code with a lorem ipsum string of 1000 characters, without any formatting, and that processes in 4.89 seconds.
Question: What can I do to improve the performance?
- Is there a better way to loop all characters?
- Can I somehow detect if a cell has no formatting at all (and thus skip the whole routine)?