Questions tagged [tez]

The Apache Tez™ project is aimed at building an application framework which allows for a complex directed-acyclic-graph of tasks for processing data. It is currently built atop Apache Hadoop YARN.

The two main design themes for Tez are:

Empowering end users by:
  • Expressive dataflow definition APIs
  • Flexible Input-Processor-Output runtime model
  • Data type agnostic
  • Simplifying deployment

Execution performance:
  • Performance gains over MapReduce
  • Optimal resource management
  • Plan reconfiguration at runtime
  • Dynamic physical data flow decisions

For more details, visit https://tez.apache.org/

94 questions
9
votes
1 answer

Why would someone run Spark / Flink on Tez?

In the Tez paper from Saha et al., the following modular architecture of Hadoop 2 with Tez is shown: Why would someone run Spark/Flink on Tez? What are the advantages? Better utilization of YARN?
j9dy
  • 2,029
  • 3
  • 25
  • 39
7
votes
1 answer

Hive - How to know which execution engine I am currently using

I want to automate my Hive ETL workflow so that Hive jobs are executed according to the execution engine in use (Tez or MR), because of memory constraints. Could you please help? I want to cross-check in between my whole…
Indrajeet Gour
  • 4,020
  • 5
  • 43
  • 70
7
votes
2 answers

Hive query too slow and failed

I executed a "group by" query on a Hive txt table: select day,count(*) from mts_order where source="MTS_REG_ORDER" group by day; but it shows: Error: Error while processing statement: FAILED: Execution Error, return code 2 from…
Alexis
  • 1,080
  • 3
  • 21
  • 44
7
votes
3 answers

Tez execution engine vs Mapreduce Execution Engine in Hive

What is the difference between the Tez engine and the MapReduce engine in Hive, and which engine is better for which kind of processing (e.g. joins, aggregation)?
Bitanshu Das
  • 627
  • 2
  • 8
  • 21
6
votes
1 answer

AWS Data Pipeline: Tez fails on simple HiveActivity

I'm trying to run a simple AWS Data Pipeline for my POC. My case is the following: get data from a CSV stored on S3, perform a simple Hive query on it and put the results back to S3. I've created a very basic pipeline definition and tried to run it…
6
votes
2 answers

can not see any dags in Tez ui

I can run Hive on Tez, but I cannot see any jobs in the Tez UI, and it is driving me crazy! The user and name are also null in the timeline server. The config is below: tez-site.xml tez.history.logging.service.class
leocook
  • 191
  • 1
  • 13
5
votes
4 answers

Hive execute "insert into ... values ..." very slow

I built a Hadoop & Hive cluster and tried to run some tests, but it's really slow. table table value_count +--------------------------------------------------------------+--+ | createtab_stmt …
Alexis
  • 1,080
  • 3
  • 21
  • 44
5
votes
1 answer

How to tune hive to query metadata?

When I run the Hive query below on a table with a partitioned column, I want to make sure Hive does not do a full table scan and instead figures out the result from the metadata alone. Is there any way to enable this? Select…
KBR
  • 464
  • 1
  • 7
  • 24
5
votes
4 answers

How do I increase Tez's container physical memory?

I've been running some Hive scripts on an AWS EMR 4.8 cluster with Hive 1.0 and Tez 0.8. My configurations look like this: SET hive.exec.compress.output=true; SET mapred.output.compression.type=BLOCK; SET hive.exec.dynamic.partition = true; SET…
jackStinger
  • 2,035
  • 5
  • 23
  • 36
3
votes
1 answer

Hive - Select count(*) not working with Tez but works with MR

I have a Hive external table with parquet data. When I run select count(*) from table1, it fails with Tez. But when execution engine is changed to MR it works. Any idea why it's failing with Tez? I'm getting the following error with Tez: Error:…
kunrazor
  • 341
  • 1
  • 4
  • 10
3
votes
1 answer

Is really Hive on Tez with ORC performance better than Spark SQL for ETL?

I have little experience with Hive and am currently learning Spark with Scala. I am curious to know whether Hive on Tez is really faster than Spark SQL. I searched many forums for test results, but they compared older versions of Spark and most of them…
user3150024
  • 139
  • 3
  • 14
3
votes
1 answer

Hive on Tez doesn't work in Spark 2

When working with HDP 2.5 and Spark 1.6.2, we used Hive with Tez as its execution engine and it worked. But when we moved to HDP 2.6 with Spark 2.1.0, Hive didn't work with Tez as its execution engine, and the following exception was thrown when the…
Elad Eldor
  • 803
  • 1
  • 12
  • 22
3
votes
0 answers

Issue with Apache Tez configuration

I want to configure Apache Tez with Apache Hadoop, but I am getting the issue below ... Can anyone suggest how I can resolve it? Caused by: org.apache.tez.dag.api.TezUncheckedException: Invalid configuration of tez jars, tez.lib.uris is not…
Brijesh Mishra
  • 169
  • 2
  • 7
2
votes
1 answer

hive on tez throws "No LLAP Daemons are running" ERROR

I have an LLAP service running on a YARN cluster on Amazon EMR. Here is the image showing that the LLAP service is on, and its name is llap_service: And I've set "hive.llap.daemon.service.hosts" to "@llap_service", but my query in Hive could not…
Harper
  • 81
  • 7
2
votes
2 answers

Hive query failed on Tez DAG did not succeed due to VERTEX_FAILURE

I have a basic setup of Ambari 2.5.3 and HDP 2.6.3 and tried to run some simple queries below. I don't understand why it failed. Can you help? [root@demo demo]# beeline Beeline version 1.2.1000.2.6.3.0-235 by Apache Hive beeline> !connect…
HP.
  • 19,226
  • 53
  • 154
  • 253