I have a list of ID's that I'm trying to clean and compare with another list. The ID's have variable formatting (e.g. RFP322343BA, PPL232334, RFP32334A-00). I'm trying to standardize the data on the front-end (e.g. RFP322343, PPL232334, and RFP32234) to allow for comparison. How can I remove these end text/symbol strings of varying length?
Asked
Active
Viewed 543 times
0
-
How about just taking a substring of the first 10 characters? – Raj More Jun 16 '15 at 21:39
-
The ID's are of variable length, as is the end-positioning of the letters. Meaning that many of the ID's might be below 10 characters, and keep the letter additions intact (e.g. RFP3324AB). So a consistent sample of 10 wouldn't work, I'm trying to pull off the end letters no matter where they fall or the number of them. – NemoSum Jun 16 '15 at 21:46
-
In that case, you are going to have to use `Regular Expressions` to solve this. Here is a good primer on Regex in Excel https://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops – Raj More Jun 16 '15 at 22:01
2 Answers
0
With RFP32334A-00 in cell A1, then =IF(RIGHT(LEFT(A1,9),1)="A",LEFT(A1,8),LEFT(A1,9)) works, assuming 1) only the first 9 chars are of interest and 2) it is only ever "A" in the 9th place for the "odd" number you provided above. If there are only a few of these odd ones, then just left(a1,9) will be simpler.

Solar Mike
- 7,156
- 4
- 17
- 32
-
As mentioned above, sometimes it's the first 9 characters, sometimes it is fewer. I'd like the ability to remove the end letters/symbols regardless of their length/position. Would it be possible to count the number of them from the right and then remove them? – NemoSum Jun 16 '15 at 21:49
-
well, you could start with =LEFT(A1,LEN(A1)-4) but you still have the same problem (that I did not see when I first looked at the original submission) of defining the number of chars to drop - notice I set it to 4 in this example. – Solar Mike Jun 16 '15 at 21:56
-
The only other possibility I can think of at the moment is to do an edit&replace for the suffixes - but depends how many of them you have... – Solar Mike Jun 16 '15 at 22:08
-
There are quite a few suffixes, and some contain the same letters as the prefixes. Plus, I'd like a formula to have records of the transformation. – NemoSum Jun 16 '15 at 22:10
-
Then it looks like you will need the information that Raj More provided the link to above about Regular Expressions, also, you could then adapt the code to keep in another cell the removed chars for the records you want. – Solar Mike Jun 16 '15 at 22:18
0
Consider the following User Defined Function (UDF):
Public Function FirstPart(sIn As String) As String
Dim i As Long, L As Long, Armed As Boolean, CH As String
FirstPart = ""
Armed = False
L = Len(sIn)
For i = 1 To L
CH = Mid(sIn, i, 1)
If IsNumeric(CH) Then
Armed = True
End If
If Not Armed Then
FirstPart = FirstPart & CH
Else
If Not IsNumeric(CH) Then
Exit Function
Else
FirstPart = FirstPart & CH
End If
End If
Next i
End Function
It locates the first non-numerical character after the first numerical character and clips the string at that point.

Gary's Student
- 95,722
- 10
- 59
- 99