2

I take this beautiful formula from JvdV answer:

=TRIM(CONCAT(IF(ISNUMBER(SEARCH(MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1),"-./ 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")),MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1)," ")))

This formula replace any non-alphanumeric character (&^%]#$) with simple space " ". I put in formula some exception (-./ ), but this is not all exceptions. How about wildcards? How to filter wildcards (~*?) with this formula?

I think: Ok, I will use FIND instead of SEARCH and all will be right, just put lowercase and uppercase alphabet in the FIND index, like this: *"-./ 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"*

Then I think: But, what if I want to keep not only numeric and regular alphabet? What if I want to keep all diacritics, like this: "ÁÀȦÄǍĀÃÅĄȺẤẦẮẰǠǺǞẪẴẢȀȂẨẲẠḀẬẶĂÂḂɃƁḄḆĆĊĈČÇȻḈƇƆḊĎḐĐƊḌḒḎÐƉÉÈĖÊËĚĔĒẼĘȨɆẾỀḖḔỄḜẺȄȆỂẸḘḚỆÉÈÊËḞƑǴĠĜǦĞḠĢǤƓḢĤḦȞḨĦḤḪⱧÍÌİÏǏĬĪĨĮƗḮỈȈȊỊḬÍÌÏÎȷĴǰḰǨĶƘᶄḲḴⱩꝀꝂꝄĹĿĽⱢⱠĻȽŁḶḼḺḸꝈḾṀṂŃǸṄŇÑŅƝṆṊṈÑŊÓÒȮÔÖǑŎŌÕǪŐỐỒƟØṒṐṌȪỖṎǾȬǬỎȌȎƠỔỌỚỜỠỘỞỢÓÒÔÖÕØṔṖⱣƤƦŔṘŘŖɌⱤȐȒṚṞṜŚṠŜŠṤṦṢṨŞṪŤƬṬƮṰṮȾŢŦÚÙÛÜǓŬŪŨŮŲŰɄǗǛṸṺỦȔȖƯỤṲỨỪṶṴỮỬỰÚÙÛÜṼṾẂẀẆŴẄẈẊẌÝỲẎŶŸȲỸɎỶƳỴÝŹŻẐŽƵẒẔ"

Then lowercase and uppercase alphabet is too much for FIND index.

Ok, for SEARCH index is also too much, because function accept max. 255 length, but lets say we have only 200 characters in index (numbers, alphabet and some diacritics)

So, the question is available:

How to filter (replace with space) wildcards (~*?) with this kind of formula?

JvdV
  • 70,606
  • 8
  • 39
  • 70
Losai
  • 329
  • 2
  • 9

1 Answers1

2

As I read this question there are a few problems:

  • How to include over 255 characters in the 2nd parameter of SEARCH();
  • How to exclude literal wildcard characters in the 2nd parameter of SEARCH();

One way around the length limit is to feed SEARCH() an array of options, in this case an array of two elements of a lenght of <255:

enter image description here

Formula in C1:

=TRIM(CONCAT(IF(MMULT(IFERROR(SEARCH("~"&MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1),{"ÁÀȦÄǍĀÃÅĄȺẤẦẮẰǠǺǞẪẴẢȀȂẨẲẠḀẬẶĂÂḂɃƁḄḆĆĊĈČÇȻḈƇƆḊĎḐĐƊḌḒḎÐƉÉÈĖÊËĚĔĒẼĘȨɆẾỀḖḔỄḜẺȄȆỂẸḘḚỆÉÈÊËḞƑǴĠĜǦĞḠĢǤƓḢĤḦȞḨĦḤḪⱧÍÌİÏǏĬĪĨĮƗḮỈȈȊỊḬÍÌÏÎȷĴǰḰǨĶƘᶄḲḴⱩꝀꝂꝄĹĿĽⱢⱠĻȽŁḶḼḺḸꝈḾṀṂŃǸṄŇÑŅƝṆṊṈÑŊÓÒȮÔÖǑŎŌÕǪŐỐỒƟØṒṐṌȪỖṎǾȬǬỎȌȎƠỔỌỚỜỠỘỞỢÓÒÔÖÕØṔṖⱣƤƦŔṘŘŖɌⱤ";"ȐȒṚṞṜŚṠŜŠṤṦṢṨŞṪŤƬṬƮṰṮȾŢŦÚÙÛÜǓŬŪŨŮŲŰɄǗǛṸṺỦȔȖƯỤṲỨỪṶṴỮỬỰÚÙÛÜṼṾẂẀẆŴẄẈẊẌÝỲẎŶŸȲỸɎỶƳỴÝŹŻẐŽƵẒẔ-./*? 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"}),0),{1,1}),MID(A1,ROW(A$1:INDEX(A:A,LEN(A1))),1)," ")))

What we did here is:

  • Use an horizontal array {abc;xyz} to check against our characters which was an vertical array {a,b,c}. Note the difference between semi-column and comma.
  • The result will be a 2D-array which MMULT() can sum. Meaning if the character was found in any of the two elements of the array it will return that same character. Otherwise, a space.
  • The special wildcard characters are now also included with an extra tilde to escape them as with actually all characters.
  • If Excel doesn't recognize all lowercase diacritics as their uppercase counterparts, just add them to one of the two elements. If need be, add a 3rd. But know that you'd need to extend on the 2nd parameter in MMULT() too then.

To visualize the above:

enter image description here

Remember, you are using Excel 2019 which means you need to CSE-enter this formula. Needles to say that all will be much easier in ms365 using its dynamic array functionality.

JvdV
  • 70,606
  • 8
  • 39
  • 70
  • Yes you right about 365, but not all the people have this benefits. Also my version of excel use another punctuation ; and \ . But its ok. And you show the perfect way to keep in SEARCH a long string, using MMULT. But about the wildcards, i want to filter (trim) them, also replace with space, not keep them :) – Losai Mar 10 '22 at 09:00
  • 1
    @Losai, edited the answer. One would then need to make all characters literals with an concatenated tilde. – JvdV Mar 10 '22 at 09:06
  • 1
    Just a big Thanks to You, my friend and your knowledge share. Its working perfect like all yours answers. Thank you. – Losai Mar 10 '22 at 09:14