This is another approach I worked out.
It takes several statements, but they can all be chained into a single one to produce the desired output.
After creating the initial DataFrame named 'df':
df.show(5,False)
+---+----------------------------+
|id |data                        |
+---+----------------------------+
|001|[{"index": 1}, {"index": 2}]|
|002|[{"index": 3}, {"index": 4}]|
+---+----------------------------+
from pyspark.sql.functions import col, explode, regexp_extract, split
df2 = df.select(col('id'),split(df.data,',').alias('list'))
This creates a DataFrame named 'df2' whose second column is the 'data' string split on commas into an array.
df2.show(5,False)
+---+-------------------------------+
|id |list                           |
+---+-------------------------------+
|001|[[{"index": 1},  {"index": 2}]]|
|002|[[{"index": 3},  {"index": 4}]]|
+---+-------------------------------+
Then, running the explode function produces one row per array element:
df3 = df2.select(col('id'),explode(df2.list))
df3.show(5,False)
+---+--------------+
|id |col           |
+---+--------------+
|001|[{"index": 1} |
|001| {"index": 2}]|
|002|[{"index": 3} |
|002| {"index": 4}]|
+---+--------------+
Finally,
df4 = df3.select(col('id'),regexp_extract('col',r'(\d+)',1).alias('no_only'))
This transformation extracts the first run of digits from each exploded fragment.
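To see why the fragments keep their stray brackets and how the pattern still finds the number, here is a rough plain-Python imitation of the three steps on a single value. This is only an illustration with the standard `re` module, not Spark code; the variable names are made up for the example.

```python
import re

# One value of the 'data' column, taken from the first row above.
data = '[{"index": 1}, {"index": 2}]'

# Like split(df.data, ','): the string is cut at each comma,
# so the outer brackets stay attached to the first and last fragment.
fragments = data.split(",")

# Like explode + regexp_extract(..., r'(\d+)', 1): for each fragment,
# take capture group 1 of the first match, i.e. the first run of digits.
numbers = [re.search(r"(\d+)", fragment).group(1) for fragment in fragments]

print(fragments)
print(numbers)
```

Note that the extracted values are still strings, just like the 'no_only' column.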
df4.show(5,False)
+---+-------+
|id |no_only|
+---+-------+
|001|1      |
|001|2      |
|002|3      |
|002|4      |
+---+-------+