Hello, I'm wondering how to filter out these custom Facebook symbols:

CULTURE CLUB PRESENTS JENNIFER CARDINI
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

We proudly presents JENNIFER CARDINI
Support by Monsieur Moustache & Thang



▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
LINE-UP
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

#TOWNHALL

▮ Monsieur Moustache

▮ Thang

▮ JENNIFER CARDINI


▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
JENNIFER CARDI ...

The expected result is the same text without the thick hyphens and music symbols.

I have very little knowledge about regex. I'm using this simple function but I think I need another replacement pattern.

 function zonderRareSymbolen(str) {
   new_string = str.replace(/^([ A-Za-z0-9_@#+-.'"]+(\r)?(\n)?)*$/g, ''); 
   return new_string.toLowerCase();
 }

Is it possible? I already tried literally pasting the symbols into the replace pattern. Can't I use a pattern that only allows letters and numbers?

UPDATE This function works if I literally put the text in:

zonderRareSymbolen('... Kisses ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Come ...');
function zonderRareSymbolen(str) {
  new_string = str.replace(/▬/g, ''); 
  return new_string;
 }

But not on my detail.description value from an object:

zonderRareSymbolen(String(detail.desciption));
function zonderRareSymbolen(str) {
  new_string = str.replace(/▬/g, ''); 
  return new_string;
 }

Any ideas?

UPDATE The replace function doesn't seem to work with the custom symbol. All I can do now is work around it and only allow the desired characters like this:

function zonderRareSymbolen(str) { return (str).replace(/[^a-zA-Z0-9- ,=?&:.'"@/]/g, ' '); }


SOLVED by Mariano. After adding <meta charset="UTF-8"> to my page, I am able to literally paste the symbols into my replace function and replace them.
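For cases where the file encoding can't be controlled, the `\uNNNN` notation keeps the pattern independent of it. A minimal sketch of a helper that produces those escapes for any pasted symbols, so they don't have to be looked up one by one. The names `toUnicodeEscapes` and `symbolPattern` are hypothetical and not part of the original post:

```javascript
// Hypothetical helper: turn every character of a string into its \uNNNN escape,
// so the result can be used in a regex regardless of the source file's encoding.
function toUnicodeEscapes(text) {
    var out = '';
    for (var i = 0; i < text.length; i++) {
        // charCodeAt returns the UTF-16 code unit at index i
        var hex = text.charCodeAt(i).toString(16).toUpperCase();
        out += '\\u' + ('0000' + hex).slice(-4);
    }
    return out;
}

// Build a character class from the escapes instead of from the raw symbols.
var escaped = toUnicodeEscapes('▬▮');
var symbolPattern = new RegExp('[' + escaped + ']+', 'g');
```

Running the helper once (for example in the browser console) and pasting its output into the script means the script itself only contains ASCII.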

  • Please provide subject string and expected result. – Mariano Nov 20 '15 at 00:55
  • The subject string is the text in the image ... the expected result is the same text without the thick hyphens and music symbols – Auguste Van Nieuwenhuyzen Nov 21 '15 at 02:18
  • Exactly, so if someone reading your post wants to copy that thick hyphen to include in a regex, he should just guess what's in the image? What if there are html tags you're not showing? What about non-breaking spaces? Where are the newlines? Think about it. You're asking people to help you parse text and you don't even give them the original text. – Mariano Nov 21 '15 at 02:24
  • Thanks for adding the subject. Notice I **[edited](http://stackoverflow.com/posts/33789665/revisions)** the post to format the text as code... You mentioned you already tried using the char as literal in the pattern. Did you try `str.replace( /▬+\r?\n?/g, '').toLowerCase();` ? – Mariano Nov 21 '15 at 02:51

1 Answer

You're not assigning the replaced value to detail.description. Use one of these options:

  1. Assign the returned value to your object property:

        var detail = {
            id: 'some id',
            description: ".... Kisses\n▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬\nCome ...."
        };
        function zonderRareSymbolen(str) {
            return str.replace(/\u25AC+\r?\n?/g, '');
        }
    
        // assign the returned value
        detail.description = zonderRareSymbolen(detail.description);
    
        // print the object; the page's HTML is <body><pre id="result"></pre></body>
        result.innerText = JSON.stringify(detail);
  2. Or call by sharing, referencing your object property in the function context:

        var detail = {
            id: 'some id',
            description: ".... Kisses\n▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬\nCome ...."
        };
        function zonderRareSymbolen(obj) {
            // obj holds a reference to the caller's object; the reference itself is passed by value
            obj.description = obj.description.replace(/\u25AC+\r?\n?/g, '');
        }
    
        // call by sharing
        zonderRareSymbolen(detail);
    
        // print the object; the page's HTML is <body><pre id="result"></pre></body>
        result.innerText = JSON.stringify(detail);

JavaScript passes arguments by value. But when the argument is an object, that value is a reference, so a change to one of its members inside the function persists after the call. For details, see "Is JavaScript a pass-by-reference or pass-by-value language?".
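A small sketch of the difference, using made-up names (`reassign`, `mutate`, `party`) purely for illustration: rebinding a string parameter leaves the caller's variable alone, while writing to a property of a passed object is visible outside the function.

```javascript
// Reassigning the parameter only rebinds the local name;
// the caller's variable keeps its original value.
function reassign(str) {
    str = str.replace(/\u25AC+/g, '');
    return str;
}

// Writing to a property changes the object the caller also refers to.
function mutate(obj) {
    obj.description = obj.description.replace(/\u25AC+/g, '');
}

var text = 'Kisses \u25AC\u25AC Come';
reassign(text);                    // return value discarded, text is unchanged

var party = { description: text };
mutate(party);                     // party.description no longer contains \u25AC
```

This is why option 1 needs the explicit assignment and option 2 does not.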

Mariano
  • I appreciate your effort a lot! Still, I literally tested every possibility with your code, but it does not affect my object. Strange that it doesn't work on my fetched object but does on a static object. I figured out that replace doesn't want to work with the unknown symbols ... so I'd rather use an approach like this `(detail.description).replace(/[^a-zA-Z-:. ]/g, ' ')` – Auguste Van Nieuwenhuyzen Nov 24 '15 at 07:24
  • @Auguste This is not related to `replace` or the regex. If it works for you here in the code snippet, it works in your script. Probably, `detail.description` has not been set yet when you call this function (is it being set with an `asynchronous` call?). I suggest checking for `if (detail.description === undefined)` before calling this function and debug it from there. But there isn't much I can do, since I don't have your code to understand what's going on... And if you can't debug it, I'd ask a new question, showing the relevant part of your code. Concentrate on `detail`, not the `replace` – Mariano Nov 24 '15 at 07:41
  • I checked it with a semaphore and I execute if the calls are completed, I know it's strange that it perfectly works here. If you really want I can send you the code, I tried everything. Whenever I change the replace pattern, it works, just not for the stripes. So I'm going to use the work around and only allow certain characters. – Auguste Van Nieuwenhuyzen Nov 24 '15 at 08:08
  • I also noticed that it only works for the detail object, not for its properties? I don't need the whole object to be stringified. – Auguste Van Nieuwenhuyzen Nov 24 '15 at 08:11
  • @Auguste I can imagine 2 possibilities: **1.** The *stripes* are a different char than what you pasted in the question (unlikely), or **2.** `detail.description` is undefined when you call it or `.description` is not a property, but a method... Since my solution is working for the particular case, I suggest asking a new question and letting the community check what's wrong, not just me. – Mariano Nov 24 '15 at 08:17
  • @Auguste `JSON.stringify` was an easy way to print the whole object here, just for testing purposes... Of course, in your script, you won't use that, you'd just refer to `detail.description` where needed – Mariano Nov 24 '15 at 08:19
  • I added a picture with the console output, the replace function doesn't seem to work with that symbol. If I put this `.replace(/ab/]/g, '**')`, it perfectly executes. – Auguste Van Nieuwenhuyzen Nov 24 '15 at 08:38
  • I don't know either but I have a work around so that's great for me, I appreciate all the time you put in it and gave me some valuable knowledge, thank you – Auguste Van Nieuwenhuyzen Nov 24 '15 at 09:11
  • @Auguste One last test: try `.replace(/\u25AC+\r?\n?/g, '')` – Mariano Nov 24 '15 at 09:22
  • That fixes it, thanks but there are so many characters I have to get out and it seems to take time to find the escaped values. At the moment I'm using this, `.replace(/[^a-zA-Z0-9- ,=?&:.'"@\u00C1\u00C0\u00C2\u00C4\u00C7\u00C8\u00C9\u00CA\u00CB\u00CC\u00CD\u00CE\u00CF\u00D1\u00D2\u00D3\u00D4\u00D5\u00D6\u00D9\u00DA\u00DB\u00DC\u00E1\u00E0\u00E2\u00E4\u00E7\u00E8\u00E9\u00EA\u00EB\u00EC\u00ED\u00EE\u00EF\u00F1\u00F2\u00F3\u00F4\u00F5\u00F6\u00F9\u00FA\u00FB\u00FC\/+\r?\n?]/g, '');` It works fine except for the `\r\n` chars. What you posted, filters that specific character. Thank you for that! – Auguste Van Nieuwenhuyzen Nov 30 '15 at 14:29
  • @AugusteVanNieuwenhuyzen aha! you couldn't paste the character because of your source encoding. Make sure you save your JS file as UTF-8 and define the same charset in HTTP headers (or wherever you're running this). You could simplify it using ranges in regex, e.g. `[\u0080-\u00FF]`... Or another approach: define the chars you want to keep, e.g. `[^\u0020-\u007E]` – Mariano Dec 01 '15 at 11:55
  • Yes my bad so I just put in `<meta charset="UTF-8">`?? :p Thank you for everything, helped me a lot for my end terms! – Auguste Van Nieuwenhuyzen Dec 01 '15 at 12:33
  • @Auguste Yes, and set UTF-8 in your text editor too... And test it... If it doesn't work, you can always go back to the `\uNNNN` notation :) – Mariano Dec 01 '15 at 12:42
  • It works, I put it in my html pages. So now I can do stuff like this `.replace(/─/g, '').replace(/▔/g, '').replace(/◙/g, '');` however it doesn't seem to work if I do this `.replace(/▬─▔◙/g, '');`. How do I put that in my text editor? Using Sublime Text – Auguste Van Nieuwenhuyzen Dec 01 '15 at 12:49
  • `/▬─▔◙/g` matches that specific sequence. Use a **[character class](http://www.regular-expressions.info/charclass.html)** `/[▬─▔◙]+/g` – Mariano Dec 01 '15 at 12:52
  • Thank you very much! There's one last problem, this character is for new lines: `↵` I can't seem to replace it for `\r\n` Maybe because my editor doesn't recognize it or because it is invisible in html? Would you know a solution to that? You really saved me already, so no problem if you don't feel like it. – Auguste Van Nieuwenhuyzen Dec 01 '15 at 14:08
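
The two approaches from the comments above (a character class of known symbols, or a whitelist of ranges) can be sketched as follows. This is only an illustration: the function names `stripSymbols` and `keepLatin` are made up, and the exact ranges to keep are an assumption that depends on the text being cleaned.

```javascript
// Option A: remove a known set of decorative symbols with one character class.
// \u25AC = ▬, \u25AE = ▮, \u2500 = ─, \u2594 = ▔, \u25D9 = ◙
function stripSymbols(str) {
    return str.replace(/[\u25AC\u25AE\u2500\u2594\u25D9]+/g, '');
}

// Option B: keep printable ASCII (\u0020-\u007E), the Latin-1 accented range
// (\u00C0-\u00FF, which also contains the × and ÷ signs) and line breaks;
// drop everything else.
function keepLatin(str) {
    return str.replace(/[^\u0020-\u007E\u00C0-\u00FF\r\n]+/g, '');
}
```

Option B keeps `\r` and `\n` on purpose, so line breaks survive the cleanup. If the `↵` seen in the console is only how the console renders a line break (an assumption, it cannot be verified from the post), matching `\r?\n` rather than the arrow itself would be the thing to try.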