1

I am trying to write a pretty big macro to pull data from one sheet and parse and transform it into different sheets.

Going pretty well in everything else. However, I am stuck with 1 part.

I have multiline text in a cell (11 lines in this example, but could be dynamic)

The sample text I am trying to build my code on is below (the lines are separated by a space and Alt+Enter after the last character in each line):

Flat where a figure is quoted but theshold based on the following lanes
IL STD  $5.00 above $75.00 USD 
SG STD  $11 above SGD400 
TR STD  $3 above €30 EUR 
AU EXP $2 for value>AUD 1000
IL EXP $2 for value>USD 75
JP EXP $2 for value>USD 65
NZ EXP $2 for value>NZD 400
PH EXP $2 for value>USD 165
SG EXP $2 for value>SGD 400
TW EXP $2 for value>TWD 2000

My requirement is to parse this text into a table as below:

enter image description here

I started with the logic that I will count the number of line breaks/carriage returns and then create arrays using loop for the number of line count.

But I haven't been able to achieve this.

I have tried the below variation of codes to get no. of lines but keep getting 0 for the output.

Sub tresholds()

Dim strTest As String
Dim NewLines As Long

strTest = ThisWorkbook.Sheets("Rate Card").Range("D10").Text
NewLines = UBound(Split(strTest, Chr(32) & vbCrLf))
'NewLines = UBound(Split(strTest, " " & vbCrLf))
'NewLines = UBound(Split(strTest, vbCrLf))
'NewLines = UBound(Split(strTest, vbLf))
'NewLines = UBound(Split(strTest, Chr(32) & vbLf))

Debug.Print NewLines

End Sub

Also, I'm not too good at regex. And there are multiple patterns in the string. Some lines have currency symbols and others dont. Some have 'value above' and others have value >. Couldn't figure out how to account for these variations.

Would really be grateful if you guys could help.

Achal Desai
  • 93
  • 1
  • 8
  • 1
    What specific problem did you run into getting your attempt to work? – braX May 05 '23 at 19:32
  • 1st I tried to get the number of lines in the string. But my code keeps returning 0 lines. This is what I have tried. Sub tresholds() Dim strTest As String Dim NewLines As Long strTest = ThisWorkbook.Sheets("Rate Card").Range("D10").Text NewLines = UBound(Split(strTest, Chr(32) & vbCrLf)) Debug.Print NewLines End Sub Also, I'm not too good at regex. And there are multiple patterns in the string line some lines have currency symbols and others dont. Some have 'value above' and others have value >. Couldn't figure out how to account for these variations. – Achal Desai May 05 '23 at 19:40
  • The data you pasted does not have a space at the end of each line, only the first 3 lines have a space. It seems unnecessary to include the space in your split. Alt+Enter is `vbLf` not `vbCrLf`. The last 2 commented out lines should return `NewLines = UBound(Split(strTest, vbLf))` = 10 and `NewLines = UBound(Split(strTest, Chr(32) & vbLf))` = 3. – kevin May 05 '23 at 22:56

2 Answers2

4

This solution uses several regex to match various patterns. You could easily pick out the individual country lines without regex, but some of the fields within each line would be difficult without it. You might as well get better with regex. You'll find it useful in the future.

Using regex with VBA requires setting a reference (Tools/References) to Microsoft VBScript Regular Expressions 5.5.

This code has no error checking; that's up to you. Also, I'm sure you can write the output to a worksheet, so I'll leave that up to you also. I've noted in the sub where the output code belongs.

If you want help with the regex, ask specific questions in a comment. Also, SO has a great regex summary: How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops

Sub ParseThreshold()
    Dim rngInput As Range, regexMatch As RegExp, strPattern As String, strInput As String, vntRc As Variant, strPartLine As String
    Dim nCount As Integer, nIndex As Integer, vntMatches As Variant, arstrItems() As String, strCountry As String, strSE As String
    Dim dblDollars As Double, dblThreshold As Double
    
    Set regexMatch = New RegExp
    Set rngInput = ActiveSheet.Range("a1")
    strInput = rngInput.Value
    
    ' Define the regex pattern to match each country's line
    strPattern = "^[A-Z]{2} *(STD|EXP) *\$[0-9.]* (above|for value>[A-Z]{3}).*$"
    
    With regexMatch
        .Pattern = strPattern
        .Global = True
        .MultiLine = True
        nCount = .Execute(strInput).Count
    End With
    
    Set vntMatches = regexMatch.Execute(strInput)
    
    If nCount > 0 Then
        ReDim arstrItems(0 To nCount - 1)
        
        For nIndex = 0 To nCount - 1 '
            arstrItems(nIndex) = vntMatches(nIndex)
            strCountry = Left(arstrItems(nIndex), 2)
            strSE = Mid(arstrItems(nIndex), 4, 3)
            
            dblDollars = DollarAmount(arstrItems(nIndex))
            dblThreshold = Threshold(arstrItems(nIndex))
            
            ' Output the values to your table here
        Next nIndex
    End If
    
End Sub  ' ParseThreshold


Function DollarAmount(strInput As String)
    Dim regexMatch As RegExp, strPattern As String, vntRc As Variant
    
    strPattern = "(STD|EXP) *\$[0-9.]*"
    Set regexMatch = New RegExp
    
    With regexMatch
        .Pattern = strPattern
        .Global = False
        .MultiLine = False
        nCount = .Execute(strInput).Count
    End With
    
    Set vntMatches = regexMatch.Execute(strInput)
    If nCount > 0 Then
        vntRc = vntMatches(0)
        vntRc = Replace(vntRc, "STD ", "")
        vntRc = Replace(vntRc, "EXP ", "")
        vntRc = Replace(vntRc, "$", "")
        vntRc = Trim(vntRc)
    End If
    
    DollarAmount = vntRc
    
End Function  ' DollarAmount


Function Threshold(strInput As String)
    Dim regexMatch As RegExp, strPattern As String, vntRc As Variant, nLength As Integer, vntMatches As Variant
    
    nLength = Len(strInput)
    Set regexMatch = New RegExp
    
    strPattern = "[0-9.]+"
    strTarget = "above"
    nPosition = InStr(strInput, strTarget)
    If nPosition > 0 Then
        strPartLine = Mid(strInput, nPosition + Len(strTarget) + 1, nLength)
    Else
        strTarget = "value>"
        nPosition = InStr(strInput, strTarget)
        If nPosition > 0 Then
            strPartLine = Mid(strInput, nPosition + Len(strTarget) + 1, nLength)
        End If
    End If
    
    With regexMatch
        .Pattern = strPattern
        .Global = False
        .MultiLine = False
        nCount = .Execute(strPartLine).Count
        Set vntMatches = .Execute(strPartLine)
    End With
    
    If nCount > 0 Then
        vntRc = vntMatches(0)
    End If
    
    Threshold = vntRc
    
End Function  ' Threshold
RichardCook
  • 846
  • 2
  • 10
  • Hi @Richard Cook, Thank you so much! This worked perfectly. I added the piece for the output to be written in the required cell addresses. So far I have only tested with the string I put into the question. Will try with more test strings to see if the regex holds. But I am hopeful that it will. Once again, Thank you so much! – Achal Desai May 07 '23 at 19:12
0

Another way....

Sheet1 cell A1 value is just like your sample data

Sub test()
Dim rslt As Range, cnt As Integer, splitVal as String
Dim i As Integer, j As Integer, k As Integer, arr

Set rslt = Sheet2.Range("A" & Rows.Count).End(xlUp).Offset(1, 0)
arr = Split(Sheet1.Range("A1").Value, Chr(10))

For i = 1 To UBound(arr)
    rslt.Value = application.trim(Split(arr(i), " ")(0))
    rslt.Offset(0, 1).Value = application.trim(Split(arr(i), " ")(1))
    Set rslt = rslt.Offset(0, 2)
    For j = 2 To UBound(Split(arr(i), " "))
        splitVal = Split(arr(i), " ")(j)
        If splitVal Like "*#*" Then
            For k = 1 To Len(splitVal)
                If Mid(splitVal, k, 1) = "." Then Exit For
                If Mid(splitVal, k, 1) >= "0" And Mid(splitVal, k, 1) <= "9" Then rslt.Value = rslt.Value & Mid(splitVal, k, 1)
            Next k
            Set rslt = rslt.Offset(0, 1)
        End If
        If rslt.Column = 5 Then Exit For
    Next j
    Set rslt = Sheet2.Range("A" & Rows.Count).End(xlUp).Offset(1, 0)
Next i

End Sub

rslt is a range variable where the result will be put. In this case example the result will be put in sheet2 starting from cell A2.

arr variable has value from sheet1 cell A1 value splitted with line break.

There are three loops in the sub.

  1. Loop for each line
  2. Loop for each "word" in the looped line splitted with space
  3. Loop for each character in the looped "word"

The first loop is to put the first word to sheet2 column A and put the 2nd word to sheet2 column B from the looped line string as i variable. Then it set the rslt to offset(0,2).

In the 2nd loop, it loop to each word in the looped line. Since the first word and the second word not needed anymore, it starts from 2 as j variable. And it put the looped word into splitVal variable, and check if the splitVal value contains number then it go to the third loop.

In the third loop, it check each character of the splitVal (the looped word) value as k variable. if the looped char is "." then it exit the loop ---> this will ignore the decimal value (since I assume that you don't need a decimal for the result).

Then it check if the looped char is >= 0 and <=9, it put the result to sheet2 column C (the rslt variable). and set rslt offset(0,1) so the next result will be put in column D.

Please note, the code will fail if the looped word is something like this SGD.400 (there is a dot which is located in front of the number)

enter image description here

karma
  • 1,999
  • 1
  • 10
  • 14
  • 1
    "Code will fail ... if the first word and the second word is separated with more than one space on each line". Also "if the first word started with space". Both of those can be avoided by calling WorksheetFunction.Trim() before parsing the line. That replaces multiple spaces with a single space, and removes leading and trailing spaces. Note that it is NOT the same as the VBA Trim() statement, which ignores multiple spaces within the line. – RichardCook May 06 '23 at 13:07
  • @RichardCook, thank you very much for your help . I will never know it if you don't tell me. I've edited the code as your suggestion. – karma May 06 '23 at 15:03