3

I have json column with keys from d1 to d32 and those keys have certain values. I want to now query rows where a value doesnt exist in any of the keys.

Like - any dimension not equal to abc. So i am checking d1!=abc or d2!= abc..... d32!=abc. Is there a better way to do it?

Each row is a json field with any combination of d1-d32 dimensions and their corresponding values

Ab148
  • 31
  • 2
  • 1
    I'm not sure I understand your question correctly, an example of the data structure would be helpful. – Dennis Jan 10 '21 at 12:56
  • 1
    Table has a column dimensions which is json type. The json is like below - {d1: abc, d2:aaa, d3:bb} and many rows of json like this. What i want to do is search rows where the json value is abc, but i do not want to hardcode the json keys when checking them like - dimensions.d1 = abc or dimensions.d1 = abc. Is there a way to search this withiut hardcoding d1 d2 d3 etc – Ab148 Jan 10 '21 at 14:41
  • 1
    You can write a JavaScript UDF to search for the values and return a JSON variant. Pass in the variant JSON to search and the value to find. From there it's just looping through and building the output JSON. – Greg Pavlik Jan 10 '21 at 22:15

2 Answers2

2

With flatten() you can get the name and values of every key inside a json:

with variants as (
select 1 id, parse_json('{"d1":1, "d2":2, "d3":3}') data
union all select 2, parse_json('{"d1":4, "d2":5, "d3":7}') 
union all select 3, parse_json('{"d1":2, "d2":0, "d3":0}') 
)

select * 
from variants, table(flatten(data));
;

enter image description here

Given that information, you can look into all the keys named like d% and look for a value that doesn't exist in any - let's say 2:

with variants as (
select 1 id, parse_json('{"d1":1, "d2":2, "d3":3}') data
union all select 2, parse_json('{"d1":4, "d2":5, "d3":7}') 
union all select 3, parse_json('{"d1":2, "d2":0, "d3":0}') 
)

select id, not boolor_agg((iff(key like 'd%', value=2, true))) doesnt_have_a_2
from variants, table(flatten(data))
group by id
;

enter image description here

That shows you that the row with id=2 is the only row where no key has the value 2.

As an alternative, you could also filter the key names in the where clause instead of iff:

with variants as (
select 1 id, parse_json('{"d1":1, "d2":2, "d3":3}') data
union all select 2, parse_json('{"d1":4, "d2":5, "d3":7}') 
union all select 3, parse_json('{"d1":2, "d2":0, "d3":0}') 
)

select id, boolor_agg(value=2) 
from variants, table(flatten(data))
where key like 'd%'
group by id
;
Felipe Hoffa
  • 54,922
  • 16
  • 151
  • 325
  • 1
    Thanks, but i think you got the question wrong. I qm trying to build a "any" filter - which primarily means search of a record which has any dimension with a particular value. Assume that the dimensions are in a json column with d1 - d32 as different dimensions (json keys). Each record can have any number of dimension keys with its values. What i am doing right now - array_construct(d1,d2,d3,d4...).toString -> DOES NOT contain(SNAKE). How do i not hardcode the d1 - d32 keys – Ab148 Jan 12 '21 at 21:58
  • 1
    Isn't that exactly what I did? Notice that I have the dimensions d1,d2,etc... and they are not hardcoded in the SELECT query. Nevertheless I manage to find all the records where no key has a value of "2". – Felipe Hoffa Jan 12 '21 at 22:40
  • 1
    Sorry.. my mistake. I am trying to implement 'any of the' clause without hardcoding dimensions. Eg- d1 : Snake, d2 : Platform next row d1 : Tetris, d2 : Platform I want to return both rows, since both have any one dimension not equal to snake. Even if i flatten the json, how to check if any one dimension doesnt match a value? – Ab148 Jan 13 '21 at 02:33
  • 1
    Problem is that i can have at max 48 dimensions and i dont want to go checking d1!= snake or d2!=snake...... d48!=snake as a row can have any combination of dimensions. – Ab148 Jan 13 '21 at 02:36
  • 1
    I didn't hardcode anything. If there was a d48, that would also be covered under the `boolor_agg(value=2) ` . For each dimension it looks if it's equal to 2, and then does an 'or' over all of them. If any of the d1,d2,d3,..,d48 is equal to 2, then the result of the `boolor_agg()` will be true, otherwise false. Please try it with your data. – Felipe Hoffa Jan 13 '21 at 02:58
1

In my opinion you can try to code a Stored Procedure or a JavaScript UDF. Your Stored Procedure has to loop over all your rows and for each row it has to check whether there is a corresponding dimension (1-32). If the dimension exists, it also has to check for the value.

Stored Procedures: https://docs.snowflake.com/en/sql-reference/stored-procedures-usage.html

UDFs: https://docs.snowflake.com/en/sql-reference/udf-js.html

Marcel
  • 2,454
  • 1
  • 5
  • 13