1

I have a list in Excel like the following:

1 / 6 / 45
123
1546
123 456 
1247 /% 456 /

I want to create a new column with all sequences of consecutive non digits replaced by a character. In Google Sheets, this is easy using =REGEXREPLACE(A1&"/","\D+",","), resulting in:

1,6,45,
123,
1546,
123,456 
1247,456,

In that formula A1&"/" is needed in order for REGEXREPLACE to work with numbers. No big deal, just adds a comma at the end.

How can we do this in Excel? Pure Power Query (not R, not Python, just M) is very much encouraged. VBA and other clickable Excel features are unacceptable (like find and replace).

ZygD
  • 22,092
  • 39
  • 79
  • 102
Ivan
  • 159
  • 1
  • 9

4 Answers4

3

If you have Excel 365:

enter image description here

In B1:

=LET(X,MID(A1,SEQUENCE(LEN(A1)),1),SUBSTITUTE(TRIM(CONCAT(IF(ISNUMBER(--X),X," ")))," ",","))

Or if streaks of digits are always delimited by at least a space:

=TEXTJOIN(",",,FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.*0=0]"))

Another option, if you have got access to it, is LAMBDA(). Make a function to replace all kind of characters, something along the lines of this. Without LAMBDA() and TEXTJOIN() I think your best bet would be to start nesting SUBSTITUTE() functions.

JvdV
  • 70,606
  • 8
  • 39
  • 70
  • You're a lifesaver. This should have 20000 upvotes! Why it is so difficult to do the most basic things in Excel, the world may never know -_- – Ruslan Mar 06 '22 at 22:26
  • @Ruslan, glad this old post has helped you out! – JvdV Mar 07 '22 at 07:32
2

Here is a Power Query solution. It makes use of the List.Accumulate function to determine whether to add a digit, or a comma, to the string:

Note that the code replicates what you show for results. If you prefer to avoid trailing (and/or leading) commas, it can be easily modified.

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "textToList", each List.Combine({Text.ToList([Column1]),{","}})),
    
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "commaTerminators", each List.Accumulate(
      [textToList],"", (state,current) => 
            if List.Contains({"0".."9"},current)
                then state & current
            else if Text.EndsWith(state,",") 
                then state  
            else state & ",")),
        
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"textToList"})
in
    #"Removed Columns"

enter image description here

Edit To eliminate leading/trailing commas, we add the Text.Trim function which, in Power Query, allows defining a specific text to Trim from the start/end:

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "textToList", each List.Combine({Text.ToList([Column1]),{","}})),
    
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "commaTerminators", each 
    Text.Trim(
        List.Accumulate(
            [textToList],"", (state,current) => 
                if List.Contains({"0".."9"},current)
                    then state & current
                else if Text.EndsWith(state,",") 
                    then state  
                else state & ","),
        ",")),
        
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"textToList"})
in
    #"Removed Columns"

enter image description here

VBA UDF You mentioned you did not want VBA, but not clear if you were restricting that to a "clickable". Here is a user defined function that you can use on a worksheet directly. It uses the VBA regex engine which allows easy extraction of multiple matches

You can enter a formula on the worksheet such as =commaSep(cell_ref) to get the same results as shown above in my second PQ example

Option Explicit
Function commaSep(S As String) As String
    Dim RE As Object, MC As Object, M As Object
    Dim sTemp As String
    
Set RE = CreateObject("vbscript.regexp")
With RE
    .Global = True
    .Pattern = "\d+"
    If .test(S) Then
        Set MC = .Execute(S)
        sTemp = ""
        For Each M In MC
            sTemp = sTemp & "," & M
        Next M
        commaSep = Mid(sTemp, 2)
    Else
        commaSep = "no digits"
    End If
End With
Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60
1

This is another variation if you have TEXTJOIN function available.

=SUBSTITUTE(TRIM(TEXTJOIN("",TRUE,IFERROR(MID(A2,ROW($A$1:INDEX(A:A,LEN(A2))),1)+0," ")))," ",",")

shrivallabha.redij
  • 5,832
  • 1
  • 12
  • 27
1

And another option in Power Query.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTQVzADYhNTpVgdINfIGEKbmpjBBIByZgpQjom5gr4qWEBfKTYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    x1 = Table.AddColumn(#"Changed Type", "x1", each Text.ToList([Column1])),
    x2 = Table.AddColumn(x1, "x2", each List.Transform([x1], each if Text.Contains("0123456789", _) then _ else " "  )),
    x3 = Table.AddColumn(x2, "x3", each Text.Split(Text.Combine([x2])," ")),
    x4 = Table.AddColumn(x3, "x4", each List.Transform([x3], each if Text.Contains("0123456789", try Text.At(_,0) otherwise " ") then _&"," else "" )),
    x5 = Table.AddColumn(x4, "x5", each Text.Combine([x4])),
    #"Removed Columns" = Table.RemoveColumns(x5,{"x1", "x2", "x3", "x4"})
in
    #"Removed Columns"

enter image description here

ZygD
  • 22,092
  • 39
  • 79
  • 102