I have this schema

[('guid', 'string'),
 ('shop_time', 'array<struct<seconds:int,shop:string>>')]

and this data:

|006435a231e26c67df63340d1b3c2a5d4e|[[10, Texmex], [5, BurgerB]]|
|006435asdsdsdsdderere340d1b3c2a5d4e|[[20, Check], [10, Test]]|

I would like this schema

[('guid', 'string'),
 ('shop_time', 'array<struct<seconds:int,shop:string,time_diff_sec:int>>')]

and this data as the result

|006435a231e26c67df63340d1b3c2a5d4e|[[10, Texmex, 0], [5, BurgerB, 5]]|
|006435asdsdsdsdderere340d1b3c2a5d4e|[[20, Check, 0], [10, Test, 10]]|

So a new field named time_diff_sec is added to each struct, holding the difference in seconds from the previous (upper) element of the array. For the first element it is 0/null because there is no previous element. I can do this with a window function on a normal DataFrame, but how do I do it inside an array of structs? Is there a window function for arrays? I know how to add a new field to the struct, as given here, but how do I refer to the previous element in an array?
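
To pin down the intended calculation, here is a minimal plain-Python sketch of the per-row logic (no Spark involved). The function name `add_time_diff` is hypothetical, and the sketch assumes the difference is the previous element's seconds minus the current element's seconds, with 0 for the first element, which matches the expected output above.

```python
from typing import List, Tuple


def add_time_diff(shop_time: List[Tuple[int, str]]) -> List[Tuple[int, str, int]]:
    """Append time_diff_sec to each (seconds, shop) pair.

    Assumption: time_diff_sec = previous seconds - current seconds,
    and 0 for the first element because it has no predecessor.
    """
    result = []
    prev_seconds = None
    for seconds, shop in shop_time:
        diff = 0 if prev_seconds is None else prev_seconds - seconds
        result.append((seconds, shop, diff))
        prev_seconds = seconds
    return result


# Sample rows from the question, keyed by guid.
rows = {
    "006435a231e26c67df63340d1b3c2a5d4e": [(10, "Texmex"), (5, "BurgerB")],
    "006435asdsdsdsdderere340d1b3c2a5d4e": [(20, "Check"), (10, "Test")],
}

for guid, shop_time in rows.items():
    print(guid, add_time_diff(shop_time))
```

This only describes the desired result per array. In Spark itself, one possible direction (untested here) is the SQL higher-order function `transform`, whose lambda can also take the element index, so the previous element could be addressed by index within the same array.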

Blue Clouds
