7

I have a json table which was created by

CREATE TABLE `normaldata_source`(
  `column1` int, 
  `column2` string, 
  `column3` struct<column4:string>)

A sample data is:

{
  "column1": 9,
  "column2": "Z",
  "column3": {
    "column4": "Y"
  }
}

If I do

SELECT column3
FROM normaldata_source

it will produce a result {column4=y}. However, I want it to be in json form {"column4": "y"}

Is this possible?

*Edit This query gives me the following result:

SELECT CAST(column3 AS JSON) as column3_json
FROM normaldata_source

result

Manfred Moser
  • 29,539
  • 13
  • 92
  • 123
lclankyo
  • 221
  • 3
  • 10
  • 1
    There are some workaround (e.g. building string with format function), support for casting to json with field names is ongoing in https://github.com/trinodb/trino/pull/3613. – ebyhr Dec 28 '20 at 01:14
  • Thanks for the info. Unfortunately, my real object has many fields and is nested, so building the string with format function will be quite tedious. I was hoping there there would be an easier way. The link is for trino, but is that the same thing as presto? – lclankyo Dec 28 '20 at 03:25
  • 1
    Yes, https://trino.io/blog/2020/12/27/announcing-trino.html is the relevant blog post. – ebyhr Dec 28 '20 at 05:16

2 Answers2

4

As of Trino 357 (formerly known as Presto SQL), you can now cast a row to JSON and it will preserve the column names:

WITH normaldata_source(column1, column2, column3) AS (
    VALUES (9, 'Z', cast(row('Y') as row(column4 varchar)))
)
SELECT cast(column3 as json) 
FROM normaldata_source

=>

      _col0
-----------------
 {"column4":"Y"}
(1 row)
Martin Traverso
  • 4,731
  • 15
  • 24
1

I encountered this same problem and was thoroughly stumped on how to proceed in light of deep compositional nesting/structs. I'm using Athena (managed Presto w/ Hive Connector from AWS). In the end, I worked around it by doing a CTAS (create table as select) where I selected the complex column I wanted, under the conditions I wanted) and wrote it to an external table with an underlying SerDe format of JSON. Then, via the HiveConnector's $path magic column (or by listing the files under the external table location), I obtain the resulting files and streamed out of those.

I know this isn't a direct answer to the question at hand - I believe we have to wait for https://github.com/trinodb/trino/pull/3613 in order to support arbitrary struct/array compositions -> json. But maybe this will help someone else who kind of assumed they'd be able to do this.

Although I originally saw this as an annoying workaround, I'm now starting to think it was the right call for my application anyway

kylejmcintyre
  • 1,898
  • 2
  • 17
  • 19
  • Would it be possible to share a reproducible example of what you implemented? I'm totally stuck on this issue right now! – geotheory Apr 20 '21 at 22:43