
I am using map to clean data that is sent to an API with a sentence parser. The problem is that the parser breaks when it finds a \n or \t, so for the time being I chose to split on those characters and filter them out:

const allBulletPoints = Array.from(document.querySelectorAll('ul,ol'));
const allBulletPointsText = allBulletPoints.map((element) =>
    element.textContent
        .split(/(\t)|(\n)/g)
        .filter((element) => element && !element.match(/(\t)|(\n)/gi))
);

console.log(allBulletPointsText);
/*
Result:
[   
    [
        "        Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
        "        Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
        "        Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
        "    "
    ],
    [
        "        Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
        "        Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
        "        Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
        "    "
    ]
]

I need it to be like this:
[
    "Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
    "Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
    "Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
    "Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
    "Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
    "Lorem ipsum dolor sit amet consectetur, adipisicing elit", 
]
*/
<ul>
    <li>Lorem ipsum dolor sit amet consectetur, adipisicing elit</li>
    <li>Lorem ipsum dolor sit amet consectetur, adipisicing elit</li>
    <li>Lorem ipsum dolor sit amet consectetur, adipisicing elit</li>
</ul>
<ol>
    <li>Lorem ipsum dolor sit amet consectetur, adipisicing elit</li>
    <li>Lorem ipsum dolor sit amet consectetur, adipisicing elit</li>
    <li>Lorem ipsum dolor sit amet consectetur, adipisicing elit</li>
</ol>

Now I want to clean this up a little more using substr or split, so that every item is a separate, trimmed string and all of them end up in the same flat array.

Any idea how to get a better solution using these functions?

2 Answers


As already mentioned in the comments, you could trim the string first, before doing the split.

element.innerText
       .trim()
       .split(/(\t)|(\n)/g)
       .filter((element) => element && !element.match(/(\t)|(\n)/gi))
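
To see what trimming before the split does and does not cover, here is a minimal sketch that runs without a DOM. The sampleText string is a hypothetical stand-in for the textContent of one list; innerText is layout-dependent and may already drop the indentation, so this only illustrates the textContent case.

```javascript
// Hypothetical stand-in for element.textContent of one <ul>.
const sampleText = '\n    Lorem ipsum\n    dolor sit\n    amet\n';

// Trimming once up front only removes the outer whitespace,
// so every line after the first keeps its indentation.
const trimmedOnce = sampleText
    .trim()
    .split(/[\t\n]/)
    .filter((line) => line);
console.log(trimmedOnce);
// [ 'Lorem ipsum', '    dolor sit', '    amet' ]

// Trimming each piece after the split removes the indentation too;
// the filter then drops the empty strings left at both ends.
const trimmedEach = sampleText
    .split(/[\t\n]/)
    .map((line) => line.trim())
    .filter((line) => line);
console.log(trimmedEach);
// [ 'Lorem ipsum', 'dolor sit', 'amet' ]
```

If the indentation still shows up in your results, trimming each piece is the safer of the two.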
Andrew Ymaz
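  1. The element.match(/(\t)|(\n)/gi) check is only needed because of the capture groups: split(/(\t)|(\n)/g) puts each captured \t or \n (and undefined for the group that did not match) back into the result. Split on a pattern without capture groups, such as /[\t\n]/, and the separators are removed, so the check becomes unnecessary.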
  2. Instead of trimming each line separately, you could build the trimming into the split pattern, e.g. split(/\s*?[\t\n]\s*/g). Or use match instead, to match only the parts you're interested in.
  3. Concatenating the arrays can be done with Array#flat().
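
A quick way to check how capture groups affect split, and what flat() does to nested arrays, is to try both on plain strings. This is only a sketch; the sample values are made up for illustration.

```javascript
const text = 'a\nb';

// With capture groups, the matched separator is included in the result,
// and the group that did not take part in the match yields undefined.
const withGroups = text.split(/(\t)|(\n)/g);
console.log(withGroups); // [ 'a', undefined, '\n', 'b' ]

// That is why the question needs an extra filter step.
const cleaned = withGroups.filter((part) => part && !/[\t\n]/.test(part));
console.log(cleaned); // [ 'a', 'b' ]

// Without capture groups, the separators are simply dropped.
const withoutGroups = text.split(/[\t\n]/g);
console.log(withoutGroups); // [ 'a', 'b' ]

// flat() merges one level of nested arrays into a single array.
const merged = [['a', 'b'], ['c']].flat();
console.log(merged); // [ 'a', 'b', 'c' ]
```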

So this might be what you're looking for:

const allBulletPointsText = Array.from(
       document.querySelectorAll('ul,ol'),
       e => e.textContent.match(/\S[^\t\n]*\S|\S/g) || []
// or: e => e.textContent.trim().split(/\s*?[\t\n]\s*/g).filter(e => e)
// or: e => e.textContent.split(/^\s+|\s*?[\t\n]\s*|\s+$/g).filter(e => e)
// or: e => e.textContent.split(/[\t\n]+/g).map(e => e.trim()).filter(e => e)
).flat();
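
To compare the four variants without a browser, the sketch below runs each one over hypothetical strings that stand in for the textContent of two lists. The names texts and variants are made up for this example.

```javascript
// Hypothetical textContent values of a <ul> and an <ol>.
const texts = ['\n    one two\n    three\n', '\n\tfour\n'];

// The four extraction strategies from the answer, keyed by a descriptive name.
const variants = {
    match: (t) => t.match(/\S[^\t\n]*\S|\S/g) || [],
    trimThenSplit: (t) => t.trim().split(/\s*?[\t\n]\s*/g).filter((e) => e),
    splitWithAnchors: (t) => t.split(/^\s+|\s*?[\t\n]\s*|\s+$/g).filter((e) => e),
    splitThenTrim: (t) => t.split(/[\t\n]+/g).map((e) => e.trim()).filter((e) => e),
};

// Each variant yields one array per list; flat() merges them.
for (const [name, extract] of Object.entries(variants)) {
    console.log(name, texts.map(extract).flat());
}
// Every variant prints: [ 'one two', 'three', 'four' ]
```

On this input all four agree, so the choice is mostly a matter of readability. texts.flatMap(extract) would be an equivalent way to write the map followed by flat().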
Robert