5

I am trying out S3 Select from Presto using hive connector and Minio Object store. I am able to create an external table and run all the SQL queries. But, S3 Select does not seem to be working, even with the hive.s3select-pushdown.enabled=true set in the properties file in the catalog folder. I ran a packet trace on the Minio server, I only see GET/LIST calls being made, do not see any POST /{Key+}?select&select-type=2 HTTP/1.1 being made.

Below is the hive properties file.

hive.metastore.uri=thrift://hadoop-master:9083
hive.s3.path-style-access=true
hive.s3.endpoint=http://X.X.X.X:9000
hive.s3.aws-access-key=minioadmin
hive.s3.aws-secret-key=minioadmin
hive.non-managed-table-writes-enabled=true
hive.storage-format=ORC
hive.s3select-pushdown.enabled=true

I see that the same is set from the SESSION parameters in presto.

minio.s3_select_pushdown_enabled   | true                | true                
minio.projection_pushdown_enabled  | true                | true

This is how I am creating the external table from presto cli.

presto:default> CREATE TABLE nyc_9 ( vendorid VARCHAR, tpep_pickup_datetime VARCHAR, tpep_dropoff_datetime VARCHAR, passenger_count VARCHAR, trip_distance VARCHAR, ratecodeid VARCHAR, store_and_fwd_flag VARCHAR, pulocationid VARCHAR, dolocationid VARCHAR, payment_type VARCHAR, fare_amount VARCHAR, extra VARCHAR, mta_tax VARCHAR, tip_amount VARCHAR, tolls_amount VARCHAR, improvement_surcharge VARCHAR, total_amount VARCHAR) WITH (FORMAT = 'CSV', skip_header_line_count = 1, EXTERNAL_LOCATION = 's3a://test10gb5/');

Query being run

presto:default> SELECT * FROM nyc_9 WHERE trip_distance > '20' AND fare_amount > '10' AND tip_amount > '2' AND passenger_count = '2' LIMIT 10;
 vendorid |  tpep_pickup_datetime  | tpep_dropoff_datetime  | passenger_count | trip_distance | ratecodeid | store_and_fwd_flag | pulocationid | dolocationid | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_sur
----------+------------------------+------------------------+-----------------+---------------+------------+--------------------+--------------+--------------+--------------+-------------+-------+---------+------------+--------------+----------------
 2        | 04/26/2018 08:51:16 AM | 04/26/2018 09:42:03 AM | 2               | 5.06          | 1          | N                  | 236          | 170          | 1            | 31          | 0     | 0.5     | 6.36       | 0            | 0.3            
 2        | 04/26/2018 08:14:17 AM | 04/26/2018 08:35:08 AM | 2               | 6.88          | 1          | N                  | 263          | 45           | 1            | 22          | 0     | 0.5     | 6.84       | 0            | 0.3            
 1        | 04/26/2018 08:19:47 AM | 04/26/2018 09:17:45 AM | 2               | 9.7           | 1          | N                  | 138          | 144          | 1            | 39          | 0     | 0.5     | 8          | 0            | 0.3            
 2        | 04/26/2018 08:38:15 AM | 04/26/2018 09:09:58 AM | 2               | 4.73          | 1          | N                  | 142          | 144          | 1            | 22          | 0     | 0.5     | 4.56       | 0            | 0.3            
 2        | 04/26/2018 08:38:26 AM | 04/26/2018 09:22:12 AM | 2               | 5.95          | 1          | N                  | 239          | 13           | 1            | 29          | 0     | 0.5     | 2.98       | 0            | 0.3            
 2        | 04/26/2018 08:47:03 AM | 04/26/2018 09:17:02 AM | 2               | 3.27          | 1          | N                  | 158          | 162          | 1            | 19          | 0     | 0.5     | 3.96       | 0            | 0.3            
 2        | 04/26/2018 08:21:19 AM | 04/26/2018 08:46:55 AM | 2               | 3.89          | 1          | N                  | 262          | 107          | 1            | 18.5        | 0     | 0.5     | 3.86       | 0            | 0.3            
 2        | 04/26/2018 08:35:32 AM | 04/26/2018 09:01:54 AM | 2               | 4.09          | 1          | N                  | 236          | 137          | 1            | 17.5        | 0     | 0.5     | 3.66       | 0            | 0.3            
 1        | 04/26/2018 08:43:45 AM | 04/26/2018 09:03:41 AM | 2               | 3             | 1          | N                  | 163          | 145          | 1            | 15          | 0     | 0.5     | 6          | 0            | 0.3            
 1        | 04/26/2018 08:01:47 AM | 04/26/2018 08:13:08 AM | 2               | 3.1           | 1          | N                  | 264          | 137          | 1            | 12          | 0     | 0.5     | 2.55       | 0            | 0.3            
(10 rows)

Is there anything else that needs to be done for S3 Select to work?

0 Answers0