I want to remove the last element of the array in each row of this DataFrame. This link demonstrates the same thing, but with UDFs,
which I want to avoid. Is there a simple way to do this, something like list[:2]?
data = [(['cat','dog','sheep'],),(['bus','truck','car'],),(['ice','pizza','pasta'],)]
df = sqlContext.createDataFrame(data,['data'])
df.show()
+-------------------+
|               data|
+-------------------+
|  [cat, dog, sheep]|
|  [bus, truck, car]|
|[ice, pizza, pasta]|
+-------------------+
Expected DataFrame:
+--------------+
|          data|
+--------------+
|    [cat, dog]|
|  [bus, truck]|
|  [ice, pizza]|
+--------------+
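To make the intended semantics precise, here is a plain-Python sketch of the per-row result I am after. This is not Spark code, only an illustration. The names `rows` and `trimmed` are made up for this example, and I use `[:-1]` rather than `[:2]` on the assumption that the arrays may not always have exactly three elements.

```python
# Plain-Python illustration of the desired per-row result (not Spark code).
# `rows` mirrors the array column of the DataFrame above; the name is hypothetical.
rows = [['cat', 'dog', 'sheep'], ['bus', 'truck', 'car'], ['ice', 'pizza', 'pasta']]

# Drop the last element of each list; [:-1] works for a list of any length.
trimmed = [row[:-1] for row in rows]

print(trimmed)
```

I am looking for the DataFrame equivalent of this that does not go through a UDF.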