I am trying to get the last n elements of each array in a column named Foo and put them in a separate column called last_n_items_of_Foo. The arrays in the Foo column have variable length.
I have looked at this article here, but the method it describes cannot be used to access the last elements.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, size, udf
from pyspark.sql.types import StringType

# Reuse the active session if there is one, otherwise start a new one
spark = SparkSession.builder.getOrCreate()
df = pd.DataFrame([[[1,1,2,3],1,0],[[1,1,2,7,8,9],0,0],[[1,1,2,3,4,5,8],1,1]], columns=['Foo','Bar','Baz'])
spark_df = spark.createDataFrame(df)
Here is how the output should look for n=2:
                     Foo  Bar  Baz  last_2_items_of_Foo
0           [1, 1, 2, 3]    1    0               [2, 3]
1     [1, 1, 2, 7, 8, 9]    0    0               [8, 9]
2  [1, 1, 2, 3, 4, 5, 8]    1    1               [5, 8]
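
To make the expected behaviour concrete, here is a minimal sketch of one possible approach: a plain Python helper that slices the tail of a list, wrapped in a UDF. The names `last_n_items` and `add_last_n_column` are hypothetical, and the sketch assumes the array elements are 64-bit integers (`LongType`), which is what the pandas data above should produce. It may well not be the most efficient or idiomatic way to do this in Spark.

```python
def last_n_items(values, n):
    """Return the last n elements of a list (the whole list if it is shorter)."""
    if values is None:
        return None
    if n <= 0:
        # values[-0:] would return the whole list, so handle this case explicitly
        return []
    return values[-n:]


def add_last_n_column(sdf, column, n):
    """Add a column last_<n>_items_of_<column> to a Spark DataFrame.

    pyspark is imported lazily so the helper above can be used without Spark.
    Assumption: `column` is an array of longs.
    """
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import ArrayType, LongType

    last_n_udf = udf(lambda values: last_n_items(values, n), ArrayType(LongType()))
    return sdf.withColumn(f"last_{n}_items_of_{column}", last_n_udf(col(column)))
```

With the DataFrame above, this would be called as `add_last_n_column(spark_df, "Foo", 2)`.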