3

I am working off this answer here: Regex for removing special characters on a multilingual string:

/\P{Xan}+/u

but this appears to be PHP-specific. I am not good with regex, so what would the JavaScript equivalent be?

When I use the regex from that answer, I get an invalid expression error saying there is an invalid escape.

search(event) {
    const length = (string) => {
        if (string.length > 1) {
            return true;
        }
        return false;
    };
    const trim = (string) => {
        if (string.trim() !== '') {
            return true;
        }
        return false;
    };
    const keyType = (string) => {
        const regex = /\P{Xan}+/u;
        if (!regex.exec(string)) {
            return true;
        }
        return false;
    };
    const text = this.searchListParams.searchText;
    if (length(text) && trim(text) && keyType(text)) {
        this.searchSubject.next(this.searchListParams);
    } else {
        this.mediaListParams.startRow = 0;
        this.listSubject.next(this.mediaListParams);
    }
}
Jonathan Lam
Sandra Willford

2 Answers

2

The /\P{Xan}+/u pattern in PHP matches one or more characters that are not Unicode letters or digits.

If you need to support any browser or JS implementation, use XRegExp with the [^\pL\pN]+ pattern, which matches one or more characters other than Unicode letters (\pL) and digits (\pN):

var rx = XRegExp("[^\\pL\\pN]+", "g");
var s = "8੦৪----Łąka!!!!Вася, *** ,Café";
var res = XRegExp.replace(s, rx, ' ')
console.log("'"+s+"'", "=>", "'"+res+"'");
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/3.2.0/xregexp-all.min.js"></script>

If you plan to only support ECMAScript 2018 compatible implementations, you can use this native regex:

const rx = /[^\p{L}\p{N}]+/gu;
const s = "8੦৪----Łąka!!!!Вася, *** ,Café";
let res = s.replace(rx, " ");
console.log(`'${s}' => '${res}'`)

The u flag is required: it enables Unicode property escapes (\p{...}) in ES2018 regex.
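Applied to the validation in the question, the same native pattern could replace the keyType check. This is only a sketch: isAlphanumeric is a hypothetical name, and it assumes an ES2018-capable engine. Note that it rejects spaces too, just as \P{Xan} would, so a multi-word search text fails the check.

```javascript
// Hypothetical replacement for the question's keyType helper.
// Returns true when the string contains no character other than
// Unicode letters (\p{L}) and digits (\p{N}).
const isAlphanumeric = (string) => !/[^\p{L}\p{N}]/u.test(string);

console.log(isAlphanumeric("Łąka"));      // true
console.log(isAlphanumeric("8੦৪"));       // true
console.log(isAlphanumeric("Вася!"));     // false
console.log(isAlphanumeric("two words")); // false (space is not a letter or digit)
```

If spaces should be allowed, adding \s inside the negated class, i.e. /[^\p{L}\p{N}\s]/u, is one way to do it.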

Wiktor Stribiżew
  • 1
    and `console.log( XRegExp("[^\\pL\\pN]+") )` to get the long regular expression that doesn't need the `XRegExp` library – Slai Jun 11 '18 at 01:45
0

I'm not familiar with PHP syntax, but in JavaScript, the curly brackets {} are used as quantifiers. This is probably causing your error.

That being said, the PHP regex does not have the same meaning in JavaScript as it does in PHP. Unfortunately, AFAIK there is no predefined character class equivalent to the PHP regex you provided in JavaScript, so I don't think I can provide a regular expression to solve your question explicitly.


However, one creative potential solution that does not employ regular expressions in JS is suggested in this answer, but it will only work for Latin-based alphabets (languages with capitalization) and only for word characters (not numbers). Here is a basic implementation (modified from linked answer):

function removeSpecials(str) {
    var lower = str.toLowerCase();
    var upper = str.toUpperCase();

    var res = "";
    for(var i=0; i<lower.length; ++i) {

        // test if character or numeric using capitalization test
        if(lower[i] != upper[i] || /\d/.exec(lower[i]))
            res += str[i];

    }
    return res;
}
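One caveat with the index-based comparison above: it assumes the lowercased and uppercased strings have the same length, but "ß".toUpperCase() is "SS", so the indexes can drift. A per-character variant avoids that. This is a sketch rather than the linked answer's code, and removeSpecialsPerChar is a hypothetical name:

```javascript
// Per-character variant of the capitalization test.
// Keeps a character if it has distinct lower/upper forms or is an ASCII digit.
function removeSpecialsPerChar(str) {
    let res = "";
    for (const ch of str) { // for...of iterates by code point
        const hasCase = ch.toLowerCase() !== ch.toUpperCase();
        if (hasCase || /\d/.test(ch)) {
            res += ch;
        }
    }
    return res;
}

console.log(removeSpecialsPerChar("Вася, 123!")); // "Вася123"
console.log(removeSpecialsPerChar("Straße 9"));   // "Straße9"
console.log(removeSpecialsPerChar("漢字 abc"));    // "abc"
```

The last call shows the limitation: characters from scripts without letter case are dropped, and \d in JavaScript only matches the ASCII digits 0-9.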
Jonathan Lam
  • 1
    On most major platforms, except JavaScript, curly brackets are used for quantifiers *and* for POSIX classes, either as `\p{...}` for classes or `\P{...}` for *negated* classes – Bohemian May 16 '18 at 23:53
  • @Bohemian Does that syntax appear in JS as well? I haven't seen that before, nor do I see it now in the MDN docs. – Jonathan Lam May 16 '18 at 23:55
  • 1
    I edited my comment before you answered to add *except javascript*. It seems JS isn't coming to that party. – Bohemian May 16 '18 at 23:56
  • @Bohemian I saw that you converted my answer to a comment (before I edited it to include a potential solution). I realized at the same time that it was sort of a skimpy answer that did not provide a full solution to the OP's needs, but it did at least answer part of the question -- why the error was occurring -- right? Or should I be more careful to fully answer the question in the future? – Jonathan Lam May 17 '18 at 00:00
  • Your initial answer did not answer OP's question, which was (quoting) *what would the javascript equivelent be?*. Your answer didn't give a regex OP could use. It was more a general comment about the situation. Even now you haven't answered OP's question but have written code instead, and I don't have the heart to delete it. – Bohemian May 17 '18 at 00:57
  • @Bohemian I would have thought that saying that I don't think that there is a JS equivalent and proposing an alternative solution would be an answer. But thanks for the feedback. I'll definitely keep it in mind for the future. I've made my answer more clear to reflect this. – Jonathan Lam May 17 '18 at 01:18