1

So here's the problem, I have a table with equipment names where I want to split text into columns until capital letter or number is met.

Here's how table looks like:

Table as-is

Here's how it should like after splitting:

Table to-be

I was thinking using left function but dont know how to include capital letter condition! Here's what I tried:

 =LEFT(A2,SMALL(FIND(CHAR(ROW(INDIRECT("65:90"))),A2&"ABCDEFGHIJKLMNOPQRSTUVWXYZ"&"АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"&"1234567890" ),2)-1)

It doesn't work unfortunately... Important note: capital Letters can be written both in cyrillic and latin

Also, is it possible possible to accomplish this goal easier in Power Query?

Thanks!

Edit: added my attempts

Adren
  • 83
  • 7
  • Please include your current attempts. – QHarr Apr 29 '19 at 08:09
  • You can use `EXACT` and `UPPER` to check if some text is in Upper Case: e.g. `=EXACT(A1,UPPER(A1))` will tell you if the text of cell A1 is all in upper case or not – Chronocidal Apr 29 '19 at 08:11
  • I recomend to use [Regular Expressions](https://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops) and for example a pattern like this: `([A-Z]?[a-z ]+) (.*)` https://regex101.com/r/aH7P9t/1 – Pᴇʜ Apr 29 '19 at 08:13
  • How do I work around cyrillic letters? – Adren Apr 29 '19 at 08:19
  • if that is an option, it would be easier to copy-paste into Google Sheets or Notepad++ as they support regular expressions – Slai Apr 29 '19 at 08:20
  • @Slai VBA supports RegEx too if you use the *Microsoft VBScript Regular Expressions* library (see the link in my comment above). – Pᴇʜ Apr 29 '19 at 08:22
  • @MDoskarin have a look here: [How to match Cyrillic characters with a regular expression](https://stackoverflow.com/questions/1716609/how-to-match-cyrillic-characters-with-a-regular-expression) – Pᴇʜ Apr 29 '19 at 08:24
  • What would code in VBA look like using this library? – Adren Apr 29 '19 at 08:30
  • @MDoskarin I already posted the link with an example code in my first comment: [Regular Expressions](https://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops) – Pᴇʜ Apr 29 '19 at 09:15

2 Answers2

1

Here is a way without VBA/RegEx:

enter image description here

Formula in B1:

{=LEFT(A1,SMALL(IFERROR(MATCH(FullList,UNICODE(MID(A1,ROW(INDIRECT("1:"&LEN(A1)))+1,1)),0),0),SUMPRODUCT(--(IFERROR(MATCH(FullList,UNICODE(MID(A1,ROW(INDIRECT("1:"&LEN(A1)))+1,1)),0),0)=0))+1))}

Notice to enter as array through CtrlShiftEnter

Formula in C1:

=RIGHT(A1,LEN(A1)-LEN(B1))
  • Notice that there is a trailing space in column B.

  • FullList is a named range in another sheet refering to a list of UNICODES that stand for all capital Latin and Cyrillic letters plus numbers. So a list from 48-57, 65-90 and 1040-1071.

JvdV
  • 70,606
  • 8
  • 39
  • 70
  • Thank you! It works but only if capital letter is in latin, how do I add cyrillic? – Adren Apr 29 '19 at 09:19
  • Ahh right, I didn't see that part :) sorry. I'll see if I can work around this later. Should I leave this answer? – JvdV Apr 29 '19 at 09:31
  • Definitely! I think it just needs few adjustments to be working. So far I have it like this: https://i.imgur.com/i8ymQPf.jpg The capital letters in cyrillic (like cells B2: АВГ part) shouldn't be included and need to be in column C with other characters – Adren Apr 29 '19 at 09:33
  • What version of Excel are you using, I believe the answer can be found through UNICODE which was introduced in Excel 2013. I can only test this hypotheses when I have a later version of Excel myself, later today. So the hypotheses would be that `{=LEFT(A1,MATCH(1,(UNICODE(MID(A1,ROW($A$1:$A$1071)+1,1))<=1071)*(UNICODE(MID(A1,ROW($A$1:$A$1071)+1,1))>=1040),FALSE))}` gives you the answer in case your string has Cyrillic capitals. It will have to be mixed with Unicodes for Latin capitals (65-90) and Numbers (48-57). – JvdV Apr 29 '19 at 11:44
  • Works great for cyrillic capitals now! I need to take latin capitals into account too! How do I add them into existing formula?:) – Adren Apr 29 '19 at 11:50
1

You can do this with Power Query. I'm sure there is a cleaner way to do this, but SplitOnAnyDelimiter does not seem to have an option to split on just the first instance, and I don't have time to write everything into concise M-code, so for now, I added some custom columns:

  • Strip off the first letter so as to avoid finding a Capital letter at the beginning.
  • Then find the position of the first occurrence any digit, capital Latin, capital Cyrillic letter
  • Return the part of the string before that position.
  • Return the part of the string after that position.
  • Delete the unwanted columns

The M-Code:

let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "2nd", each Text.Middle([Name],1)),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Position", each Text.PositionOfAny([2nd],{"0".."9","A".."Z","А".."Я"},Occurrence.First)),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Equipment", each Text.Start([Name],[Position])),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Custom", each Text.Middle([2nd],[Position])),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom3",{"Name", "2nd", "Position"})
in
    #"Removed Columns"

Source

enter image description here

Results

enter image description here

EDIT I suspect the following code might be more efficient, as the only columns added are the results columns:

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],

#"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Equipment", each Text.Start([Name],
        Text.PositionOfAny(Text.Middle([Name],1),{"0".."9","A".."Z","А".."Я"},Occurrence.First))),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Part No", each Text.Middle([Name],
    Text.PositionOfAny(Text.Middle([Name],1),{"0".."9","A".."Z","А".."Я"},Occurrence.First)+1)),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"Name"})
in
    #"Removed Columns"
Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60