0

I'm trying to do the following:

  1. Split a sentence on whitespace (spaces, tabs, and newlines) and save the words to an array.
  2. Duplicate any word that is followed by a punctuation mark: remove the punctuation mark and put a second copy of the word in its place. If there are several punctuation marks in a row, only the last one should be removed.

For example:
arr1 = ["first" , "second," , "third"]
newArr = ["first" , "second" , "second" , "third"]

Here is the code:

<!DOCTYPE html>
<html>
<body>
<h2>JavaScript</h2>

<p id="demo"></p>

<script>
var txt = "this is my text, that i want to fix";
var result = txt.split(/[ \t\n]+/);
var word = result.forEach(RemoveComma);

document.getElementById("demo").innerHTML = word;

function RemoveComma(value) {
  var words=[];
  var texts=' ';

        if(value.endsWith(".*\\p{Punct}"))
        {
            texts= value.replace(.*\\p{Punct}, "");
            words.push(texts);
            words.push(texts);
        }
        else
            words.push(value);
  } 
  return words;
}
</script>
</body>
</html>
  • Based on your criteria, shouldn't the resulting array be: `newArr = ["first", "first" , "second" , "second" , "third"]` since "first" is followed by a comma? – Scott Marcus Apr 26 '21 at 23:44
  • Many thanks! It works great, but can you look at my edit for the question? I need the code to work on all punctuation not only comma – Esraa Ismail Apr 27 '21 at 00:46
  • see https://stackoverflow.com/questions/7576945/javascript-regular-expression-for-punctuation-international , see also my answer below – Mister Jojo Apr 27 '21 at 01:03

3 Answers

2

You have an extra closing } at the end of your function. More importantly, result.forEach() always returns undefined: whatever the callback returns on each iteration is discarded, so word never receives the array. Also, because words is declared inside the callback, a fresh empty array is created for every element. It's better to put the forEach() loop inside your function, let it iterate and build a single array, and then return that array once the loop has finished.
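To see the forEach() behaviour in isolation, here is a minimal sketch. The names `parts`, `fromForEach` and `collected` are made up for illustration and are not part of the question's code.

```javascript
const parts = ["text,", "that"];

// forEach ignores whatever the callback returns and itself returns undefined.
const fromForEach = parts.forEach(value => value.toUpperCase());
console.log(fromForEach); // undefined

// Pushing into an array declared outside the callback keeps the values.
const collected = [];
parts.forEach(value => collected.push(value.toUpperCase()));
console.log(collected); // ["TEXT,", "THAT"]
```

The corrected version of the code from the question follows the same pattern: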

var txt    = "this is my text, that i want to fix";
var result = txt.split(/[ \t\n]+/);
var word   = RemoveComma(result);

document.getElementById("demo").innerHTML = word;

function RemoveComma(arr) {
  var words = [];
  arr.forEach(value => {
    if (value.endsWith(",")) {
      // Strip the trailing comma, then add the word twice
      var texts = value.replace(/,\s*$/, "");
      words.push(texts);
      words.push(texts);
    } else {
      words.push(value);
    }
  });
  return words;
}
<p id="demo"></p>
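The comments on the question ask for this to work on any punctuation mark, not only a comma. One possible way to generalize it is sketched below. This is not part of the code above. The function name `duplicateBeforePunctuation` is hypothetical, and the sketch assumes that "punctuation" means the Unicode punctuation category (`\p{P}` with the `u` flag), which excludes symbols such as `$` or `+`. It also assumes that when several marks appear in a row, only the last one is stripped and the rest stay attached to the word.

```javascript
// Sketch: duplicate every token that ends in a punctuation mark.
// Assumption: punctuation = Unicode General_Category "P" (\p{P}).
function duplicateBeforePunctuation(text) {
  const trailingPunctuation = /\p{P}$/u;
  return text
    .split(/\s+/)
    .filter(token => token !== "") // drop empty strings from leading/trailing whitespace
    .flatMap(token => {
      if (!trailingPunctuation.test(token)) return [token];
      // Remove only the last punctuation mark, then emit the word twice.
      const word = token.replace(trailingPunctuation, "");
      return [word, word];
    });
}

console.log(duplicateBeforePunctuation("this is my text, that i want; to fix!!"));
// ["this","is","my","text","text","that","i","want","want","to","fix!","fix!"]
```

Note that a token consisting of a single punctuation mark (for example a lone "-") would produce two empty strings here; filter those out if that matters for your input.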
Kinglish
1

Here is another solution. I suggest that the function process the whole string and be named for what it does: make a list of words, duplicating those that precede a comma.

    const makeList = (txt) => {
        const result = [];
        // \s already matches spaces, tabs and newlines
        for (const word of txt.split(/\s+/)) {
            if (word.endsWith(',')) {
                // Drop the trailing comma and add the word twice
                const bare = word.slice(0, -1);
                result.push(bare, bare);
            } else {
                result.push(word);
            }
        }
        return result;
    };
    var txt = "this is my text, that I want to fix";
    var words = makeList(txt);
    console.log(words);

I used word.slice instead of a regular expression (or substring), just for the fun of trying another tool; I only learned it today.

Note the use of \s in the regular expression passed to split: it matches any whitespace character, which is clearer and more readable than listing each one.
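As a quick illustration of what \s covers compared with an explicit list such as [ \t\n], here is a small sketch. The `sample` and `tricky` strings are made up for the demonstration.

```javascript
const sample = "one two\tthree\nfour  five";

// For plain spaces, tabs and newlines both patterns split the same way.
console.log(sample.split(/\s+/));      // ["one", "two", "three", "four", "five"]
console.log(sample.split(/[ \t\n]+/)); // ["one", "two", "three", "four", "five"]

// \s also matches other whitespace, such as a carriage return or a no-break space.
const tricky = "a\r\nb\u00a0c";
console.log(tricky.split(/\s+/));      // ["a", "b", "c"]
console.log(tricky.split(/[ \t\n]+/)); // ["a\r", "b\u00a0c"]
```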

Daniel Faure
0

Simply do:

const
  txt = 'this is my text, that i want; to fix'
, res = txt
          .replaceAll(/[?.,;!¡¿。、·]/g, ',') // normalize all punctuation marks to a comma
          .match(/\,|\w+/g)
          .reduce((a,c,i,{[i-1]:p})=>[...a,(c==',')?p:c],[]) 

console.log( JSON.stringify(res) )
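The one-liner is dense, so here is a step-by-step sketch of the same idea. The variable names (`sample`, `normalized`, `tokens`, `out`) are introduced only for illustration, and `replace` with the `g` flag is used in place of `replaceAll`, which behaves the same for a global regular expression.

```javascript
const sample = 'this is my text, that i want; to fix';

// Step 1: turn every listed punctuation mark into a comma.
const normalized = sample.replace(/[?.,;!¡¿。、·]/g, ',');

// Step 2: tokenize into words (\w+) and standalone commas.
const tokens = normalized.match(/,|\w+/g) || [];

// Step 3: replace each comma token with the token just before it.
const out = [];
tokens.forEach((token, i) => {
  out.push(token === ',' ? tokens[i - 1] : token);
});

console.log(JSON.stringify(out));
// ["this","is","my","text","text","that","i","want","want","to","fix"]
```

Keep in mind that \w only matches ASCII letters, digits and the underscore, so a word such as "don't" or one with accented letters would be split into pieces by this approach.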
Mister Jojo