13

I have a table pos.pos_inv in hdfs which is partitioned by yyyymm. Below is the query:

select DATE_ADD(to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),5), 
       to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),yyyymm 
   from pos.pos_inv inv 
      INNER JOIN pos.POSActvyBrdg Brdg ON Brdg.EIS_POSActvyBrdgId = Inv.EIS_POSActvyBrdgId 
      where to_date(from_unixtime(unix_timestamp(Inv.nrmlzdwkenddt, 'MM/dd/yyyy'))) 
       BETWEEN DATE_SUB(to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),6) 
        and DATE_ADD(to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),6) 
        and inv.yyyymm=201501

I have provided the partition value for the query as 201501, but still i get the error"

 Error while compiling statement: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "inv" Table "pos_inv"

(schema)The partition, yyyymm is int type and actvydt is date stored as string type.

Alvaro Silvino
  • 9,441
  • 12
  • 52
  • 80
jeff
  • 157
  • 1
  • 2
  • 8
  • 1
    jeff, which version of hive you are using , I think there are some open jira issues (https://issues.apache.org/jira/browse/HIVE-4905, https://issues.apache.org/jira/browse/HIVE-10454 ) related to this issue when partition predicate column is used in the combination of Join, if you are using older version of hive try setting mapred mode to nonstrict and run the query `set hive.mapred.mode=nonstrict` – Aditya Aug 22 '16 at 07:57
  • 2
    If i use set hive.mapred.mode=nonstrict , then the whole data set is scanned rather than the particular partition value. – jeff Aug 23 '16 at 02:21
  • jeff, I thought its a workaround just for now , may be this patches are updated in latest version , which version of hive is yours , it it possible to try in the higher version? – Aditya Aug 23 '16 at 04:27
  • 3
    Workaround: Wrap the offending table in a subquery with the partition predicate, (select * from table where partition > value) this avoids the error and avoids scanning the entire table. – justin cress Mar 16 '17 at 14:47
  • If you you use the `nonstrict` workaround for queries sent with beeline from the command-line (like `beeline -u "jdbc:..." -e "select ..."`) you'll need to add it on the command-line: `--hivevar hive.mapred.mode=unstrict` – DouglasDD Jul 02 '21 at 10:24

4 Answers4

19

This happens because hive is set to strict mode. this allow the partition table to access the respective partition /folder in hdfs .

  set hive.mapred.mode=unstrict;  it will work 
GreenGiant
  • 4,930
  • 1
  • 46
  • 76
Alvaro Silvino
  • 9,441
  • 12
  • 52
  • 80
  • 6
    -1 this is not the correct use. You should filter your select statement with appropriate predicate. For, example. if table has one predicate named 'dt' then Select * from MyTable where dt='2017/10/10' – om471987 May 12 '18 at 05:56
1

In your query error it is said: No partition predicate found for Alias "inv" Table "pos_inv".

So you must put the where clause for the fields of the partitioned table (for pos_inv), and not for the other one (inv), as you've done.

Matias Eiletz
  • 419
  • 3
  • 14
1

set hive.mapred.mode=unstrict allows you access the whole data rather than the particular partitons. In some case read whole dataset is necessary, such as: rank() over

PPW
  • 21
  • 1
0

This happens when the 2 tables have the same column name (possibly same partition column). Try to deal with separate tables with where condition like below

WITH tableA as
(
-- All your where clause here
),

tableB AS
(
-- All your where clause here
)
select tableA.*, tableB.*
Yas
  • 1
  • 2