0

I tried almost all the methods (CLEAN,TRIM,SUBSTITUTE) trying to remove the character hiding in the beginning and the end of a text. In my case, I downloaded the bill of material report from oracle ERP and found that the item codes are a victim of hidden characters.

After so many findings, I was able to trace which character is hidden and found out that it's a question mark'?' (via VBA code in another thread) both at the front and the end. You can take this item code‭: ‭11301-21‬

If you paste the above into your excel and see its length =LEN(), you can understand my problem much better.

I need a good solution for this problem. Therefore please help!

Thank you very much in advance.

Community
  • 1
  • 1
Umar Ahmad
  • 23
  • 1
  • 4
  • are your data always like 11301-21? I mean, it's always 5 digit number - 2 digit number? – Foxfire And Burns And Burns Mar 21 '18 at 11:34
  • No. There are item codes of various lengths. But every item code is accompanied by a strange character. If you paste the item code in my question thread above and see its length in excel it may give 9 or 10 characters when it should be 8. – Umar Ahmad Mar 21 '18 at 11:39
  • Yes, I noticed that. Just asking to provide you a solution, because there is a way to extract only the numbers of this item code, getting the value `1130121`. Next step would be how to assemble again the numbers as item code again, with the same estructure. Are all your item codes like number - number? I mean, is there always a - in the item code and nothing else? – Foxfire And Burns And Burns Mar 21 '18 at 11:42
  • Can you post several examples of item codes? Maybe we can try to fix something – Foxfire And Burns And Burns Mar 21 '18 at 11:54
  • ‭12-92003-53‬ ‭12-..1051A607‬ ‭12-1074F045‬ ‭12-1070F089‬ Hope it helps – Umar Ahmad Mar 21 '18 at 11:58

2 Answers2

2

Thanks to Gary's Student, because his answer inspired me.

Also, I used this answer for this code.

This function will clean every single char of your data, so it should work for you. You need 2 functions: 1 to clean the Unicode chars, and other one to clean your item codes_

        Public Function CLEAN_ITEM_CODE(ByRef ThisCell As Range) As String
    If ThisCell.Count > 1 Or ThisCell.Count < 1 Then
        CLEAN_ITEM_CODE = "Only single cells allowed"
        Exit Function
    End If


    Dim ZZ As Byte

    For ZZ = 1 To Len(ThisCell.Value) Step 1
        CLEAN_ITEM_CODE = CLEAN_ITEM_CODE & GetStrippedText(Mid(ThisCell.Value, ZZ, 1))
    Next ZZ


    End Function

    Private Function GetStrippedText(txt As String) As String
If txt = "–" Then
    GetStrippedText = "–"
Else
    Dim regEx As Object

    Set regEx = CreateObject("vbscript.regexp")
    regEx.Pattern = "[^\u0000-\u007F]"
    GetStrippedText = regEx.Replace(txt, "")
End If

End Function

And this is what i get using it as formula in Excel. Note the difference in the Len of strings:

enter image description here

Hope this helps

1

You have characters that look like a space character, but are not. They are UniCode 8236 & 8237.

Just replace them with a space character (ASCII 32).

EDIT#1:

Based on the string in your post, the following VBA macro will replace UniCode characters 8236 amd 8237 with simple space characters:

Sub Kleanup()
Dim N1 As Long, N2 As Long
Dim Bad1 As String, Bad2 As String

N1 = 8237
Bad1 = ChrW(N1)
N2 = 8236
Bad2 = ChrW(N2)

Cells.Replace what:=Bad1, replacement:=" ", lookat:=xlPart
Cells.Replace what:=Bad2, replacement:=" ", lookat:=xlPart

End Sub
Gary's Student
  • 95,722
  • 10
  • 59
  • 99