I have two tables. Table 1 is a large one, and Table 2 is a small one. I would like to extract data from Table 1 if values in Table1.column1
matches those in Table2.column1
. Both Table 1 and Table 2 have column, column1
. Here is my code.
select *
from Table1
where condition1
and condition2
and column1 in (select column1 from Table2)
Condition 1 and Condition 2 are meant to restrict the size of the table to be extracted. Not sure if this actually works. Then I got execution error, return code 1
. I am on Hue platform.
EDIT
As suggested by @yammanuruarun, I tried the following code.
SELECT *
FROM
(SELECT *
FROM Table1
WHERE condition1
AND condition2) t1
INNER JOIN Table2 ON t1.column1 = t2.column1
Then, I got the following error.
Error while processing statement: FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.tez.TezTask. Application
application_1580875150091_97539 failed 2 times due to AM Container for
appattempt_1580875150091_97539_000002 exited with exitCode: 255 Failing this
attempt.Diagnostics: [2020-02-07 14:35:53.944]Exception from container-launch.
Container id: container_e1237_1580875150091_97539_02_000001 Exit code: 255
Exception message: Launch container failed Shell output: main : command provided 1
main : run as user is hive main : requested yarn user is hive Getting exit code
file... Creating script paths... Writing pid file... Writing to tmp file /disk-
11/hadoop/yarn/local/nmPrivate/application_1580875150091_97539/container_e1237_1580875150091_97539_02_000001/container_e1237_1580875150091_97539_02_000001.pid.tmp
Writing to cgroup task files... Creating local dirs... Launching container...
Getting exit code file... Creating script paths... [2020-02-07 14:35:53.967]Container exited with a non-zero exit code 255. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr :
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in
thread "IPC Server idle connection scanner for port 26888" Halting due to Out Of
Memory Error... Halting due to Out Of Memory Error... Halting due to Out Of Memory
Error...
Halting due to Out Of Memory Error... Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error... Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error... [2020-02-07 14:35:53.967]Container exited
with a non-zero exit code 255. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr :
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in
thread "IPC Server idle connection scanner for port 26888" Halting due to Out Of Memory Error... Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error... Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error... Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error... Halting due to Out Of Memory Error...
For more detailed output, check the application tracking page: http://dcwipphm12002.edc.nam.gm.com:8088/cluster/app/application_1580875150091_97539 Then click on links to logs of each attempt. . Failing the application.
Looks like it is a memory error. Is there any way I could optimize my query?