1

I have the following DataFrame

----------
|"ARR"   |
----------
|[       |
|  "A",  |
|  "B",  |
|  "C"   |
|]       |
----------

Not I want to check whether a specific element is in ARR:

This gives a Error (failed to execute query...):

df\
    .withColumn("contains_A",array_contains("A",col("ARR")))\
    .show()

same here

df\
    .withColumn("contains_A",array_contains(lit("A"),col("ARR")))\
    .show()

and this gives FALSE

df\
    .withColumn("contains_A",array_contains(array_construct(lit("A")),col("ARR")))\
    .show()

How to do it?

Raphael Roth
  • 26,751
  • 15
  • 88
  • 145

1 Answers1

1

The value you search for in ARRAY_CONTAINS must be a variant

with t0(arr) as (
    select parse_json('["a","b","c"]')
)
select array_contains('a'::variant, arr) CONTAINS_A
from t0;

CONTAINS_A
True

In Snowpark, I believe this should give you what you want: df.withColumn("contains_A",sql_expr("array_contains('A'::variant,ARR)")).show()

Dave Welden
  • 1,122
  • 6
  • 6
  • but how to do it in Snowpark Python API? I could not find out how to mark something as a "variant" – Raphael Roth Aug 23 '22 at 11:20
  • Updated my answer with Snowpark sql_expr code. – Dave Welden Aug 23 '22 at 13:42
  • yes with sql_expr it works, but I was looking for an non-SQL solution, but maybe its not possible... thanks anyway – Raphael Roth Aug 23 '22 at 18:48
  • 1
    If you use `lit("B").cast("variant")`, you can do it fully in Snowpark. In the proposed solution, replace the call of `sql_expr` by: `array_contains(lit("A").cast("variant"), col("ARR"))` – drmaettu Aug 24 '22 at 06:30
  • @drmaettu How do you use array_contains without sql_expr? This code failed: `>>> df.withColumn("contains_a",array_contains(lit("A").cast("variant"), col("ARR"))) Traceback (most recent call last): File "", line 1, in NameError: name 'array_contains' is not defined` – Dave Welden Aug 24 '22 at 13:47