17

Let's say I have an array of strings and I need specific info from them. What would be an easy way to do that?

Suppose the array is this:

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

Let's say I want to extract the date from each string and save it into another array. I could write a function like this:

function extractDates(arr)
{
  let dateRegex = /(\d{1,2}\/){2}\d{4}/g, dates = "";
  let dateArr = [];

  for(let i = 0; i<arr.length; i++)
  {
    dates = /(\d{1,2}\/){2}\d{4}/g.exec(arr[i])
    dates.pop();
    dateArr.push(dates);
  }

  return dateArr.flat();
}

Although this works, it is clunky and requires pop() because it will return an array of arrays, i.e. ["12/16/1988", "16/"], plus I need to call flat() afterwards.

Another option would be to take a substring of each string between two positions, each of which I find with a regex pattern:

function extractDates2(arr)
{
  let dates = [];

  for(let i = 0; i<arr.length; i++)
  {
    let begin = regexIndexOf(arr[i], /(\d{1,2}\/){2}\d{4}/g);
    let end = regexIndexOf(arr[i], /[0-9] /g, begin) + 1;
    dates.push(arr[i].substring(begin, end));
  }

  return dates;
}

It relies on the following regexIndexOf() helper:

function regexIndexOf(str, regex, start = 0)
{
  let indexOf = str.substring(start).search(regex);
  indexOf = (indexOf >= 0) ? (indexOf + start) : -1;
  return indexOf;
}

Again, this works, but it seems like far too much code for such a simple extraction. Is there an easier way to extract data into an array?

Shidersz
Travis

4 Answers

21

One approach is to use map() over the array, applying match() to each element, and finally call flat() to get the desired result:

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

const result = infoArr.map(o => o.match(/(\d{1,2}\/){2}\d{4}/g)).flat();

console.log(result);

Alternatively, you could use flatMap():

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

const result = infoArr.flatMap(o => o.match(/(\d{1,2}\/){2}\d{4}/g));

console.log(result);

Also, if some strings contain no date, match() returns null for them. To remove those null values from the final array, you can apply filter(), like this:

// Using map() + flat():
const result = infoArr.map(o => o.match(/(\d{1,2}\/){2}\d{4}/g))
                      .flat()
                      .filter(date => date !== null);

// Or, equivalently, using flatMap():
const result = infoArr.flatMap(o => o.match(/(\d{1,2}\/){2}\d{4}/g))
                      .filter(date => date !== null);

An example with conflicting data:

let infoArr = [
  "1 Ben Howard 12/16/1988 apple 10/22/1922",
  "2 James Smith orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/19075 peach",
  "5 Doug Jones 11/10-1975 peach"
];

const result = infoArr.flatMap(o => o.match(/(\d{1,2}\/){2}\d{4}/g))
                      .filter(date => date !== null); /* or filter(date => date) */

console.log(result);
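
Note that with this sample the pattern still picks up 8/20/1907 from the malformed 8/20/19075, because \d{4} is satisfied by the first four digits. If that is not wanted, one possible tweak is to add word boundaries (\b) and use a non-capturing group. This is only a sketch under that assumption, not something the question asked for; sampleArr, strictDateRegex, looseDates and strictDates are illustrative names:

```javascript
// Illustrative copy of the "conflicting" sample above.
const sampleArr = [
  "1 Ben Howard 12/16/1988 apple 10/22/1922",
  "2 James Smith orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/19075 peach",
  "5 Doug Jones 11/10-1975 peach"
];

// Assumption: a date must not be glued to other digits or letters.
// \b enforces that; (?:...) groups without capturing.
const strictDateRegex = /\b(?:\d{1,2}\/){2}\d{4}\b/g;

// "|| []" turns a null (no match) into an empty array that flatMap() drops.
const looseDates = sampleArr.flatMap(s => s.match(/(\d{1,2}\/){2}\d{4}/g) || []);
const strictDates = sampleArr.flatMap(s => s.match(strictDateRegex) || []);

console.log(looseDates);  // contains "8/20/1907"
console.log(strictDates); // does not
```

Whether the stricter pattern is appropriate depends on how the real data is delimited.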

Alternative without flat():

When this answer was written, flat() and flatMap() were still "experimental" and subject to change (they have since been standardized in ES2019), and some older browsers don't support them. In that case you can use the following alternative, with the limitation that it only gets the first match in each string:

const infoArr = [
  "1 Ben Howard 12/16/1988 apple 10/22/1922",
  "2 James Smith orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/19075 peach",
  "5 Doug Jones 11/10-1975 peach"
];

const getData = (input, regexp, filterNulls) =>
{
    let res = input.map(o =>
    {
        let matches = o.match(regexp);
        // null when there is no match, otherwise the first matched date.
        return matches && matches[0];
    });

    return filterNulls ? res.filter(Boolean) : res;
}

console.log(getData(infoArr, /(\d{1,2}\/){2}\d{4}/g, false));
console.log(getData(infoArr, /(\d{1,2}\/){2}\d{4}/g, true));
Shidersz
19

One option would be to join the strings with a separator that won't be matched, like a comma (,), then just perform a global match on the result to get an array of dates:

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];
const result = infoArr
  .join(',')
  .match(/(\d{1,2}\/){2}\d{4}/g);
console.log(result);
CertainPerformance
  • 1
    Works great, and is very concise and easy to reason about. – Travis Jan 02 '19 at 03:21
  • 4
    This solution appears to be the quickest: http://jsben.ch/w9geK. It also has the advantage that it handles array elements without dates (it doesn't create null values in the array). Keep in mind, however, that if you are trying to get the date of a specific element by its index in the original array, the results may not line up if some elements don't have dates. – Henry Howeson Jan 02 '19 at 03:31
  • 1
    This doesn't work when the `infoArr` is empty or none of the strings contains a date, as then `match()` returns `null` instead of an array. There's no reason to use `join` here, `map` or `flatMap` are much more reasonable. – Bergi Jan 02 '19 at 12:49
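
The null case raised in the last comment can be covered with a fallback to an empty array. This is a minimal sketch, assuming an empty result is acceptable when nothing matches; extractAllDates is a hypothetical helper name, not part of the answer:

```javascript
// Hypothetical helper: join, match globally, and fall back to []
// because match() returns null when there are no matches at all.
function extractAllDates(arr) {
  return arr.join(',').match(/(\d{1,2}\/){2}\d{4}/g) || [];
}

console.log(extractAllDates([]));                                   // empty array
console.log(extractAllDates(["2 James Smith orange"]));             // empty array
console.log(extractAllDates(["3 Andy Bloss 10/25/1956 apple"]));    // one date
```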
3

Although this works, it is clunky and requires pop() because it will return an array of arrays, i.e. ["12/16/1988", "16/"], plus I need to call flat afterwards.

The result of the regex exec method always has the full match in its 0 property (assuming that it matches at all), so you can just access that and push it to your array:

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

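function extractDates(arr){
  // No g flag: a global regex keeps lastIndex between exec() calls,
  // which would break when the same regex is reused on different strings.
  const dateRegex = /(\d{1,2}\/){2}\d{4}/;
  const dateArr = [];
  for (const str of arr){
    const date = dateRegex.exec(str);
    // exec() returns null when nothing matches; index 0 is the full match.
    if (date) dateArr.push(date[0]);
  }
  return dateArr;
}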

console.log(extractDates(infoArr));

(of course you could also do the same in a map callback)
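
For illustration, the map variant might look like the sketch below. The names people and firstDates are made up for this example, and the non-capturing group (?:...) is an optional tweak that keeps the group out of the exec() result; strings without a date yield null here so the output stays index-aligned with the input:

```javascript
const people = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

const firstDates = people.map(str => {
  // Non-global regex, so exec() always searches from the start of str.
  const m = /(?:\d{1,2}\/){2}\d{4}/.exec(str);
  return m ? m[0] : null; // m[0] is the full match
});

console.log(firstDates);
```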

scraaappy
Bergi
1

You can use reduce() rather than the loops to pare down the code. Just be careful to keep null out of the array when there is no match.

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

let regex = /(\d{1,2}\/){2}\d{4}/g;
// "|| []" replaces a null (no match) with an empty array before concat().
let dates = infoArr.reduce((arr, s) => arr.concat(s.match(regex) || []), []);
console.log(dates);
Mark