
hostA has MySQL (port 3306), Hive (port 10000), and the Hive metastore (port 9083) installed and running. hostB has Presto installed and running.

The goal is to run Presto on hostB and use it to query the tables registered in the Hive metastore on hostA.

I am getting the error below. /home/ec2-user/warehouse/contact does exist on the local filesystem of hostA (not HDFS or S3), and the table is partitioned, but the path does not exist on hostB. Why is Presto looking for the Hive partitions on the host where Presto runs (hostB) instead of on hostA, where the Hive metastore is? The metastore connection itself works, since Presto is able to list the tables in the metastore.

presto-cli --debug --catalog hive --schema default
presto:default> show tables;
           Table
----------------------------
 account
 contact
(2 rows)

Query 20171102_122934_00012_x6ppj, FINISHED, 2 nodes
http://localhost:8080/query.html?20171102_122934_00012_x6ppj
Splits: 18 total, 18 done (100.00%)
CPU Time: 0.0s total,   615 rows/s, 18.8KB/s, 5% active
Per Node: 0.0 parallelism,     8 rows/s,   280B/s
Parallelism: 0.0
0:00 [8 rows, 250B] [17 rows/s, 560B/s]

presto:default> select * from contact;
Query 20171102_122943_00013_x6ppj failed: Partition location does not exist: file:/home/ec2-user/warehouse/contact
com.facebook.presto.spi.PrestoException: Partition location does not exist: file:/home/ec2-user/warehouse/contact
        at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:102)
        at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:41)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:145)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:140)
        at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:243)
        at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:92)
        at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:195)
        at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:45)
        at com.facebook.presto.hive.util.ResumableTasks.lambda$submit$1(ResumableTasks.java:33)
        at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
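
One detail in the error above that may be relevant: the partition location is a `file:` URI with no host component, so it names a path only, not a machine. The following is a minimal Python sketch that just parses the URI from the error message; it does not reproduce Presto's own code path, and the `hdfs://hostA:8020/...` location is a hypothetical example (the port is an assumption) included only for contrast.

```python
from urllib.parse import urlparse

# Location string copied from the Presto error message.
location = "file:/home/ec2-user/warehouse/contact"
parsed = urlparse(location)

print("scheme:", parsed.scheme)        # the filesystem type
print("host:  ", repr(parsed.netloc))  # empty: no machine is named
print("path:  ", parsed.path)

# Hypothetical contrast: a location on a shared filesystem names a host.
# The hdfs URI and port 8020 are assumptions, not taken from this setup.
remote = urlparse("hdfs://hostA:8020/home/ec2-user/warehouse/contact")
print("host:  ", repr(remote.netloc))
```

Since the `file:` location carries no host, any process that opens it would presumably resolve the path against its own local filesystem, which is hostB here.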



cat config.properties
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
# discovery.uri=http://example.net:8080
discovery.uri=http://hostB:8080

cat hive.properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hostA:9083



2017-11-02T06:52:30.585Z        INFO    main    com.facebook.presto.metadata.StaticCatalogStore -- Loading catalog etc/catalog/hive.properties --
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       PROPERTY                                           DEFAULT     RUNTIME                        DESCRIPTION
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       hive.allow-corrupt-writes-for-testing              false       false                          Allow Hive connector to write data even when data will likely be corrupt
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       hive.assume-canonical-partition-keys               false       false
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       hive.bucket-execution                              true        true                           Enable bucket-aware execution: only use a single worker per bucket
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       hive.bucket-writing                                true        true                           Enable writing to bucketed tables
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs.connect.max-retries                       5           5
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs.connect.timeout                           500.00ms    500.00ms
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs-timeout                                   60.00s      60.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.domain-compaction-threshold                   100         100                            Maximum ranges to allow in a tuple domain without compacting it
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs.domain-socket-path                        null        null
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.fs.cache.max-size                             1000        1000                           Hadoop FileSystem cache size
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.force-local-scheduling                        false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.hdfs.authentication.type                      NONE        NONE                           HDFS authentication type
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.hdfs.impersonation.enabled                    false       false                          Should Presto user be impersonated when communicating with HDFS
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.compression-codec                             GZIP        GZIP
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore.authentication.type                 NONE        NONE                           Hive Metastore authentication type
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.storage-format                                RCBINARY    RCBINARY
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.immutable-partitions                          false       false                          Can new data be inserted into existing partitions or existing unpartitioned tables
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs.ipc-ping-interval                         10.00s      10.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-concurrent-file-renames                   20          20
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-initial-split-size                        32MB        32MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-initial-splits                            200         200
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-refresh-max-threads                 100         100
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-outstanding-splits                        1000        1000
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore.partition-batch-size.max            100         100
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-partitions-per-scan                       100000      100000                         Maximum allowed partitions for a single table scan
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-partitions-per-writers                    100         100                            Maximum number of partitions per writer
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-split-iterator-threads                    1000        1000
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-split-size                                64MB        64MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-cache-maximum-size                  10000       10000
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-cache-ttl                           0.00s       0.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-refresh-interval                    0.00s       0.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore.thrift.client.socks-proxy           null        null
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-timeout                             10.00s      10.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore.partition-batch-size.min            10          10
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.bloom-filters.enabled                     false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.default-bloom-filter-fpp                  0.05        0.05                           ORC Bloom filter false positive probability
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.max-buffer-size                           8MB         8MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.max-merge-distance                        1MB         1MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.max-read-block-size                       16MB        16MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.optimized-writer.enabled                  false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.stream-buffer-size                        8MB         8MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.parquet-optimized-reader.enabled              false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.parquet-predicate-pushdown.enabled            false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.per-transaction-metastore-cache-maximum-size  1000        1000
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.rcfile-optimized-writer.enabled               true        true
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.rcfile.writer.validate                        false       false                          Validate RCFile after write by re-reading the whole file
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.recursive-directories                         false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.config.resources                              null        null
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.respect-table-format                          true        true                           Should new partitions be written using the existing table format or the default Presto format
2017-11-02T06:52:31.310Z        INFO    main    Bootstrap       hive.skip-deletion-for-alter                       false       false                          Skip deletion of old partition data when a partition is deleted and then inserted in the same transaction
2017-11-02T06:52:31.310Z        INFO    main    Bootstrap       hive.table-statistics-enabled                      true        true                           Enable use of table statistics
2017-11-02T06:52:31.310Z        INFO    main    Bootstrap       hive.time-zone                                     Zulu        Zulu
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.orc.use-column-names                          false       false                          Access ORC columns using names from the file
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.parquet.use-column-names                      false       false                          Access Parquet columns using names from the file
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.dfs.verify-checksum                           true        true
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.write-validation-threads                      16          16                             Number of threads used for verifying data after a write
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.non-managed-table-writes-enabled              false       false                          Enable writes to non-managed (external) tables
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.pin-client-to-current-region               false       false                          Should the S3 client be pinned to the current EC2 region
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.aws-access-key                             null        null
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.aws-secret-key                             [REDACTED]  [REDACTED]
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.connect-timeout                            5.00s       5.00s
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.encryption-materials-provider              null        null                           Use a custom encryption materials provider for S3 data encryption
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.endpoint                                   null        null
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.kms-key-id                                 null        null                           Use an AWS KMS key for S3 data encryption
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-backoff-time                           10.00m      10.00m
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-client-retries                         5           5
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-connections                            500         500
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-error-retries                          10          10
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-retry-time                             10.00m      10.00m
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.multipart.min-file-size                    16MB        16MB                           Minimum file size for an S3 multipart upload
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.multipart.min-part-size                    5MB         5MB                            Minimum part size for an S3 multipart upload
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.signer-type                                null        null
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.socket-timeout                             5.00s       5.00s
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.sse.enabled                                false       false                          Enable S3 server side encryption
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.sse.kms-key-id                             null        null                           KMS Key ID to use for S3 server-side encryption with KMS-managed key
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.sse.type                                   S3          S3                             Key management type for S3 server-side encryption (S3 or KMS)
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.ssl.enabled                                true        true
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.staging-directory                          /tmp        /tmp                           Temporary directory for staging files before uploading to S3
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.use-instance-credentials                   true        true
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.user-agent-prefix                                                                     The user agent prefix to use for S3 calls
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.metastore.uri                                 null        [thrift://hostA:9083]  Hive metastore URIs (comma separated)
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.metastore                                     thrift      thrift
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-add-column                              false       false                          Allow Hive connector to add column
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-drop-column                             false       false                          Allow Hive connector to drop column
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-drop-table                              false       false                          Allow Hive connector to drop table
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-rename-column                           false       false                          Allow Hive connector to rename column
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-rename-table                            false       false                          Allow Hive connector to rename table
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.security                                      legacy      legacy
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap
2017-11-02T06:52:32.663Z        INFO    main    com.facebook.presto.metadata.StaticCatalogStore -- Added catalog hive using connector hive-hadoop2 --