1

Goal:

  1. Capture only the purchase date, amount, and purchased item name(s).
  2. Ignore all blank lines
  3. Ignore Reference # & SHIPPING AND TAX string
  4. Then, repeat this on the next grouping of purchases.
  • I am using Google Sheets for this project.

Sample data showing 3 purchases (ie blocks of data)

Note: spacing & SHIPPING AND TAX string varies inbetween

01/12 P934200QXEHMBPNAD Acme MARKETPLACE SEATTLE WA $34.96    Date & Amount
                                                              Ignore (blank line)
                           435852496957                       Ignore
                           BOSCH CO2135 1/8 In. x 2-3/4 I     Purchased item name
                           BOSCH CO2131 1/16 In. x 1-7/8      Purchased item name
                           IZOD Men's Memory Foam Slipper     Purchased item name
                           SHIPPING AND TAX                   Ignore
                                                              Ignore (blank line)
01/12 P934200QXEHMB6MQ0 Acme MARKETPLACE SEATTLE WA $48.91
                           492577232349
                           LxTek Remanufactured Ink Cartr
                           SHEENGO Remanufactured Ink Car
02/02 P934200AEHMB7E12  Acme MARKETPLACE SEATTLE WA $21.60

                           659473773469
                           LHXEQR Tubing Adapter for ResM
                           SHIPPING AND TAX


My updated attempt

=index(if(len(C26:C33),REGEXREPLACE(C26:C33,"(?Ums)(\d{2}/\d{2}) .* (\$\d{1,}\.\d{1,2}).(?:^\s+\d+$)(.*)(?:\s+SHIPPING AND TAX)","$1,$2,$3"),))

Unsuccessful results

01/12 P934200QXEHMBPNAD Acme MARKETPLACE SEATTLE WA $34.96

#VALUE!
BOSCH CO2135 1/8 In. x 2-3/4 I
BOSCH CO2131 1/16 In. x 1-7/8
IZOD Men's Memory Foam Slipper
SHIPPING AND TAX

01/12 P934200QXEHMB6MQ0 Acme MARKETPLACE SEATTLE WA $48.91
#VALUE!
LxTek Remanufactured Ink Cartr
SHEENGO Remanufactured Ink Car
02/02 P934200DJEHMB7E12 AMAZON MARKETPLACE SEATTLE WA $21.60

#VALUE!
LHXEQR Tubing Adapter for ResM
SHIPPING AND TAX



Unsuccessful
- it did not ignore:
data btwn date & amount
blank lines
SHIPPING AND TAX
- Value issue - not handling Reference # well
Function REGEXREPLACE parameter 1 expects text values. But '435848996957' is a number and cannot be coerced to a text.
  • 1
    What language/tool are you using? – Bohemian Feb 16 '23 at 01:31
  • Doing this in google sheets. C&P thes data into cells, then hope to run Reyes’s on that cell range – ManniStudio Feb 16 '23 at 03:39
  • it would help if you could share a sample sheet or screenshot showcasing how the said data is arranged in cells and expected output (manually entered) – rockinfreakshow Feb 16 '23 at 06:00
  • @rockinfreakshow - Agreed! I just revamped the post. I hope it is better. Look forward to your advice on the regex – ManniStudio Feb 17 '23 at 19:53
  • As per the [ask], please [**do not** post images of code, data, error message, etc](https://meta.stackoverflow.com/questions/285551). Instead, copy or type the text into your question, [formatted as code](https://meta.stackoverflow.com/questions/251361#251362). Reserve the use of images for diagrams or demonstrating rendering bugs; things that are impossible to describe accurately via text. – Bohemian Feb 17 '23 at 22:48
  • @Bohemian - Ah, thank you for that. Apologies, still learning. I've removed the do nots and will replace using 3 backticks surrounding manual entries.. I believe that should be ok. – ManniStudio Feb 18 '23 at 01:21
  • So `#VALUE!` may appear instead of the big number, and the intervening blank line is optional? – Bohemian Feb 18 '23 at 02:15
  • I'm stumped on the value error. I thought it may be about proper format for that particular Gsheets cell. So I changed it from Number> automatic to plain text. but same error. As for blank lines, none are optional & all should be ignored. – ManniStudio Feb 18 '23 at 03:48

1 Answers1

0

This matches one "block" (note: turn DOTALL flag on):

(\d\d\/\d\d).*(\$[\d.]+).(?:^\s+\d+$)(.*)SHIPPING AND TAX\n\n

Capturing:

  • the date as group 1
  • the amount as group 2
  • the product lines as group 3

See live demo.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • thank you for the suggestion. I tried it out & it works well on one block, but stumbles on larger data set. Also, I took ur regex and mod'd it w/out losing your approach, and my stumbled too. Well, i just re-vamped my post to make it more organized. Look forward to your advice on the regex – ManniStudio Feb 17 '23 at 19:56