0

I have an excel spreadsheet that has a column, with each cell filled with pages of data stripped from a database. In each cell, 1 or 2 serial numbers may be defined. There are three known formats of serial numbers: Starting with YV and are 10 digits long. Starting with VNA and are 8 digits long. Starting with SVNA and are 9 digits long. I need to extract every copy of every serial number in that column out to a separate cell. There may be 0, 1, or 2 serial numbers in each cell, and the copied information needs to be left blank, left with a single serial number, or left with 2 serial numbers separated by a comma and a space ("VNA1234A, VNAB4321").

I don't have any code to suggest because all code I've seen in this vein is to extract an entire known word, rather than a whole word or phrase based on a few letters.

Is there some way to do this?

Long text in work notes field

I'd like the serial number VNA3FB00 from the text in the picture to be copied into the F cell to the left. And if there was another serial number, to have that copied as well.

shrivallabha.redij
  • 5,832
  • 1
  • 12
  • 27
takieda
  • 3
  • 2
  • Please do share a screenshot (or rather some sample markdown data) which includes your input and desired output. For help, see [ask] a question with a [mcve]. Also, what version of Excel have you got? – JvdV Jul 06 '20 at 11:33
  • added the picture and a brief description to the original question. – takieda Jul 06 '20 at 12:11

1 Answers1

2

This could be done with a formula and access to TEXTJOIN (available in O365 and Excel 2019). In F2 the formula:

=TEXTJOIN(", ",,FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(G1,":"," "),","," "),"."," "),";"," ")," ","</s><s>")&"</s></t>","//s[starts-with(., 'YV') and string-length(.) =10 or starts-with(., 'VNA') and string-length(.) =8 or starts-with(., 'SVNA') and string-length(.) =9]"))

enter image description here

You may need to add more substitutions in depending on how your data looks.


You could also decide to go with an UDF through VBA. Maybe something along these lines:

Function REGEX(str As String, pat As String) As String

With CreateObject("vbscript.regexp")
    .Global = True
    .Pattern = pat
    If .Test(str) = True Then
        For Each Mtch In .Execute(str)
            If REGEX = "" Then
                REGEX = Mtch
            Else
                REGEX = REGEX & ", " & Mtch
            End If
        Next
    End If
End With

End Function

You can call this function in F2, through:

=REGEX(G2,"\b(?:YV[^\W_]{8}|S?VNA[^\W_]{5})\b")

Where you may want to sway [^\W_] for [A-Z0-9] if you want to exclude lowercase letters to.

If you are unfamiliar with regular expressions then have a look at the online demo and this introductary post on regex within VBA.


enter image description here

JvdV
  • 70,606
  • 8
  • 39
  • 70
  • That seems to work for some of the values, but most of them are being missed, and instead of a blank cell, it's showing #VALUE! I'd do another screenshot, but I think that wouldn't show the reason. I suspect there are some odd formatting issues with the cells in column G. "2020-03-24 15:47:46 - (Work notes) Monitor, wired keyboard/mouse SVNA3L3A0 picked up equipment 2020-03-16 08:51:04 - (Work notes) Hello, I see that the monitor part of your ticket is incomplete. Update me with this information." – takieda Jul 06 '20 at 12:31
  • For you own sanity I'd use the UDF =) @takieda. – JvdV Jul 06 '20 at 12:33
  • The above contains a straight copy and paste of a G cell that ended with "#VALUE!" in the F cell. I don't know if that's going to have the same effect on your end though. In fact, having copy/pasted it back into the excel spreadsheet suddenly allowed it to work. – takieda Jul 06 '20 at 12:33
  • If I copy that into `G2` both the formula *and* UDF show the correct corresponding serialnumber @takieda – JvdV Jul 06 '20 at 12:37
  • I'm unfamiliar with UDF - definitions online range from United Dairy Farmers to Universal Disk Format. – takieda Jul 06 '20 at 12:37
  • Excel would let you create your own *user defined* functions (or short UDF). You can then use these functions along with the build-in functions. Also `#Value` is the error when the suggested formula won't retrieve any valid import through `FILTERXML`. A simple `IFERROR` will deal with that. – JvdV Jul 06 '20 at 12:39
  • Thank you. I will work with this and see what I can figure out. I'm just starting to learn this stuff. – takieda Jul 06 '20 at 12:48
  • As a [starting point](https://www.dummies.com/software/microsoft-office/excel/how-to-create-custom-excel-functions/) @takieda. Instead of the code they tell you to copy, copy the code I gave you. Btw, you have chosen a rather tough usecase to start of with =) – JvdV Jul 06 '20 at 12:51
  • using VBA and the REGEX call worked perfectly. I think there are formatting issues with Column G that disallow the original code you posted to work. I will test some stuff later to determine if that's true. Thank you very much for your help, JvdV! – takieda Jul 06 '20 at 12:57