
I have a list of PySpark `Row` objects:

from pyspark.sql import Row

data_list_array = [Row(url='[a,b,c]'), Row(url='[d,b,c]')]
my_list = [i.url for i in data_list_array]
print(my_list)

which prints:

['[a,b,c]', '[d,b,c]']

But I want my final data to be:

[['a','b','c'], ['d','b','c']]

Is there any way to convert this list of strings into a list of lists?

sulav_lfc
  • Your final data is not valid Python. You need quotes around the strings. – pault Dec 14 '18 at 21:38
  • You *could* do `[x.strip("[]").split(",") for x in my_list]`, but it seems like you should fix the problem upstream where `data_list_array` is created. While this works for the specific example posted here, it does not generalize well (suppose the data contained valid commas or square brackets). – pault Dec 14 '18 at 21:43
  • List of list of what? I mean, what do you want to store in the inner lists? Strings? Variables, perhaps? – JoshuaCS Dec 14 '18 at 21:45
  • Related: [Convert string representation of list to list](https://stackoverflow.com/questions/1894269/convert-string-representation-of-list-to-list) – pault Dec 14 '18 at 21:48
  • @JosuéCortina I'm trying to create a `gensim` Dictionary as shown here: https://radimrehurek.com/gensim/corpora/dictionary.html The only issue I'm facing is that PySpark returns the list of words as a string, as mentioned above. – sulav_lfc Dec 14 '18 at 21:55
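
The caveat raised in the comments (elements that themselves contain commas or brackets) can be illustrated with a rough sketch. It assumes some of the strings might be valid Python list literals with quoted elements, in which case the standard-library `ast.literal_eval` can parse them. Unquoted input like the example above is not a valid literal, so the sketch falls back to strip-and-split. The helper name `parse_list_string` is hypothetical and not part of PySpark or gensim.

```python
import ast


def parse_list_string(s):
    """Turn a string such as "['a', 'b']" or "[a,b,c]" into a list of strings."""
    try:
        # Works only when the string is a valid Python literal (quoted elements).
        parsed = ast.literal_eval(s)
        if isinstance(parsed, list):
            return [str(item) for item in parsed]
    except (ValueError, SyntaxError):
        # Unquoted names like [a,b,c] are not literals; fall through to splitting.
        pass
    # Fallback: naive strip-and-split, which breaks on embedded commas or brackets.
    return [part.strip() for part in s.strip("[]").split(",")]


my_list = ['[a,b,c]', '[d,b,c]']
result = [parse_list_string(s) for s in my_list]
print(result)
```

With quoted input such as `"['a, x', 'b']"`, the literal branch keeps the embedded comma intact, which the plain split cannot do.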

1 Answer

desired_output = [s[1:-1].split(',') for s in my_list]
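
The one-liner works because `s[1:-1]` drops the first and last characters (the square brackets) and `split(',')` cuts the remainder at each comma. A slightly more defensive sketch follows, assuming the input may also contain spaces after commas or empty brackets. The function name `to_token_lists` is hypothetical.

```python
def to_token_lists(strings):
    """Convert strings like '[a,b,c]' into lists of tokens."""
    result = []
    for s in strings:
        inner = s[1:-1]  # drop the leading "[" and trailing "]"
        # Empty brackets yield an empty list instead of [''].
        tokens = [t.strip() for t in inner.split(",")] if inner.strip() else []
        result.append(tokens)
    return result


my_list = ['[a,b,c]', '[d, b, c]', '[]']
desired_output = to_token_lists(my_list)
print(desired_output)
```

Like the one-liner, this still assumes that no element contains a comma or a square bracket.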
JoshuaCS