1

I'm trying to extract information from Gmail into Google Spreadsheet. The information in the email has a table structure with the following columns List of Products, QTY Sold and the Subtotal for each product. These repeat N times.

When accesing the information using message.getPlainBody() I get the following text:


Product
Quantity
Price
Chocolate
1
$8.58
Apples
2
$40.40
Bananas
1
$95.99
Candy
1
$4.99
Subtotal:
$149.96

Progress

First I tried to use a regular expression to identify each row with all his elements:

  • Product name: Any amount of characters that don't include ':' (.*)[^:]
  • QTY Sold: Any number \d*
  • Anything that looks like a SubTotal [$]\d*.\d*

Wrapping everything up it looks like this

    function ExtractDetail(message){
      var mainbody = message.getPlainBody();

     //RegEx
     var itemListRegex = new RegExp(/(.*)[^:][\r\n]+(\d*[\r\n]+[$](\d*\.\d*)[\r\n]+/g);
     var itemList = mainbody.match(itemListRegex);
     Logger.log(itemList);
    }

And so far it works:

enter image description here

itemList: Chocolate 1 $8.58 ,Apples 2 $40.40 ,Bananas 1 $95.99 ,Candy 1 $4.99

However, I'm getting the following result:

  • [Chocolate 1 $8.58]
  • [Apples 2 $40.40]
  • [Bananas 1 $95.99]
  • [Candy 1 $4.99]

Instead of:

  • [Chocolate] [ 1 ] [$8.58]
  • [Apples] [ 2 ] [$40.40]
  • [Bananas] [ 1 ] [$95.99]
  • [Candy] [ 1 ] [$4.99]

Question

My question is, how can I append a new row in a way that it each row corresponds to each match found and that each column corresponds to each property?

How do I turn the result of each match into an array? Is it possible or should I change my approach?

Update:

Since the result of my current attemp is a large string I'm trying to find other options. This one poped up:

var array = Array.from(mainbody.matchAll(itemListRegex), m => m[1]);

Source: How do you access the matched groups in a JavaScript regular expression?

I'm still working on it. I still need to find how to add more columns and for some reason it starts on 'Apples' (following the examples), leaving 'Chocolates' behind.

Log:

Logger.log('array: ' + array);

enter image description here

TheMaster
  • 45,448
  • 6
  • 62
  • 85
  • If I misunderstood your question, I apologize. Can you confirm whether your script (especially, it's the regex.) in your question is correct, again? I thought that you might have not correctly copied it. – Tanaike Apr 04 '20 at 00:58
  • @Tanaike Hi Tanaike. It is correct, at least it finds what it is asked for. The main issue is to put the info found into a 2D array. I will post an update in a moment. – Luis Alberto Delgado de la Flo Apr 04 '20 at 01:15
  • Thank you for replying. Your script in your question is correct. Is my understanding correct? And, I think that when you provide the sample values of `itemList` as the text values, it will help users think of the solution. Because in your case, the values are in an image. – Tanaike Apr 04 '20 at 01:25
  • Updated! The script is partially correct. The objective is to take the results from the RegEx and write them down in Sheets as a table. At this point what I can get is just a long string and I'm trying to convert that into a 2D array somehow. – Luis Alberto Delgado de la Flo Apr 04 '20 at 01:28
  • 1
    What Tanaike said was that the regex is invalid – TheMaster Apr 04 '20 at 04:31

2 Answers2

2

Map and split the resulting array by \new lines:

const data = `Product
Quantity
Price
Chocolate
1
$8.58
Apples
2
$40.40
Bananas
1
$95.99
Candy
1
$4.99
Subtotal:
$149.96`;

const itemListRegex = /.*[^:][\r\n]+\d*[\r\n]+\$\d*\.\d*(?=[\r\n]+)/g;
const itemList = data.match(itemListRegex);

console.info(itemList.map(e => e.split(/\n/)));//map and split
TheMaster
  • 45,448
  • 6
  • 62
  • 85
2

If you want to use matchAll like Array.from(mainbody.matchAll(itemListRegex), m => m[1]), how about this modification?

In this case, /(.*[^:])[\r\n]+(\d*)[\r\n]+([$]\d*\.\d*)[\r\n]/g is used as the regex.

Modified script:

const itemListRegex = /(.*[^:])[\r\n]+(\d*)[\r\n]+([$]\d*\.\d*)[\r\n]/g;
var array = Array.from(mainbody.matchAll(itemListRegex), ([,b,c,d]) => [b,Number(c),d]);

Result:

[
  ["Chocolate",1,"$8.58"],
  ["Apples",2,"$40.40"],
  ["Bananas",1,"$95.99"],
  ["Candy",1,"$4.99"]
]

Test of script:

const mainbody = `
Product
Quantity
Price
Chocolate
1
$8.58
Apples
2
$40.40
Bananas
1
$95.99
Candy
1
$4.99
Subtotal:
$149.96
`;

const itemListRegex = /(.*[^:])[\r\n]+(\d*)[\r\n]+([$]\d*\.\d*)[\r\n]/g;
var array = Array.from(mainbody.matchAll(itemListRegex), ([,b,c,d]) => [b,Number(c),d]);
console.log(array)

Note:

  • About how can I append a new row in a way that it each row corresponds to each match found and that each column corresponds to each property?, this means for putting the values to Spreadsheet? If it's so, can you provide a sample result you expect?

References:

halfer
  • 19,824
  • 17
  • 99
  • 186
Tanaike
  • 181,128
  • 11
  • 97
  • 165