1

I am looking to extract the value inside any HTML format (e.g., value ).

Here's is the example of HTML: <payment><no_code>88888</no_code><signature>fbf2c1e</signature><customer_no>4150</customer_no><amount>80</amount><admin>7</admin><total>87</total><transaction_code>0000894</transaction_code></payment>

The output I am trying to fetch is the value inside the admin, which is 7.

Any help would be appreciated!

Stein
  • 33
  • 4

2 Answers2

-1

We can use REGEXP_EXTRACT() here:

SELECT REGEXP_EXTRACT(html_string, '<admin>(.*?)</admin>', 1) AS admin
FROM yourTable;
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • [do not parse HTML with regex](https://stackoverflow.com/a/1732454/1067003). You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. The use of regex will not allow you to consume HTML. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts. – hanshenrik Dec 23 '22 at 10:49
  • 1
    Thank you for you guys help. This one actually works. Since ExtractValue is not register in Presto – Stein Dec 23 '22 at 11:16
-1

use ExtractValue xpath, eg

SELECT ExtractValue(
'<payment><no_code>88888</no_code><signature>fbf2c1e</signature><customer_no>4150</customer_no><amount>80</amount><admin>7</admin><total>87</total><transaction_code>0000894</transaction_code></payment>',
 '/payment/admin'
)

=> 7

hanshenrik
  • 19,904
  • 4
  • 43
  • 89
  • Do not parse HTML with regex. – Tim Biegeleisen Dec 23 '22 at 10:54
  • @TimBiegeleisen this answer is parsing HTML with [XPath](https://en.wikipedia.org/wiki/XPath), not regex. XPath is literally designed to traverse XML. – hanshenrik Dec 23 '22 at 10:55
  • @TimBiegeleisen if you're unfamiliar with XPaths, you can think of XPath as an alternative to CSS selectors. (and at least MySQL supports XPath but not CSS selectors) – hanshenrik Dec 23 '22 at 10:59