0

I'm using spark sql to create a query. The actual query is bit large in size but the issue I'm getting is the following error

Error in SQL statement: AnalysisException: IN/EXISTS predicate sub-queries can only be used in a Filter: Aggregate

And the following part of the query is causing the trouble. I am not sure

select col1, col2,
sum(case
    WHEN snoozed_until is NULL
        AND hired is NULL
        AND lower(profile_archive_status) = 'true'
        AND profile_id NOT IN 
    (SELECT profile_id
    FROM candidates_feedback f
    WHERE lower(f.a1) LIKE '%no%') THEN
        1
        ELSE 0 END) Archived, 
        sum(case
        WHEN snoozed_until is NULL
            AND hired is NULL
            AND lower(profile_archive_status) = 'true'
            AND profile_id IN (SELECT profile_id FROM candidates_feedback f WHERE lower(f.a1) LIKE '%no%') THEN 1 ELSE 0 END) Rejected,
from table;

petereg157
  • 39
  • 8
  • Provide sample data, desired results, and an explanation of what you want to accomplish. – Gordon Linoff May 20 '20 at 11:39
  • @GordonLinoff I've update the query, I hope its making sense now. Similar kind of thing is here. https://stackoverflow.com/questions/34861516/spark-replacement-for-exists-and-in/34866817 – petereg157 May 20 '20 at 12:10
  • the IN/EXISTS predicates are not allowed in the select projection. the query analyser says it only accpted in the filter. Please try to rework the query – Som May 21 '20 at 02:42

1 Answers1

0

Use left join with sub-query:

select col1, col2,
sum(CASE
       WHEN snoozed_until is NULL
        AND hired is NULL
        AND lower(profile_archive_status) = 'true'
        AND p.profile_id IS NULL THEN 1 ELSE 0 
     END) Archived, 
sum(CASE
        WHEN snoozed_until is NULL
            AND hired is NULL
            AND lower(profile_archive_status) = 'true'
            AND p.profile_id IS NOT NULL THEN 1 ELSE 0 
    END) Rejected
from table t
     left join 
     (SELECT DISTINCT profile_id
        FROM candidates_feedback f
       WHERE lower(f.a1) LIKE '%no%') p on t.profile_id=p.profile_id
leftjoin
  • 36,950
  • 8
  • 57
  • 116