
I have a Series in a DataFrame with object dtype. I want to loop over each row and collect all the values into a single flat list, not a list of lists.

df['Fruits']
    
 index  Fruits
  0    ['banana']
  1    ['apple','grapes(imported,us)','mango']
  2    ['apple']
  3    ['mango','grapes(imported,US)','pears(imported,NZ)']
  4    ['mango']
  dtype: object

fruits_list = []

for i in df['Fruits']:
    fruits_list.append(i)

 Expected Output:
    fruits_list = ['banana', 'apple','grapes(imported,us)','mango', 'apple', 'mango','grapes(imported,US)','pears(imported,NZ)', 'mango'] 
ddejohn
swarna
  • The quotes aren't actually in the strings themselves, they're how strings are delimited. Try printing `'[apple]'` and it'll display `[apple]`. You simply can't remove them. Please explain what you want to do with this data. – ddejohn Oct 19 '21 at 04:26
  • @ddejohn Actually the series is in string format, but I'm trying to convert it to a list, then loop over it and add each value to a new list so that all the fruit names end up in a single list. That is what I'm trying to explain. – swarna Oct 19 '21 at 04:31
  • You seem to have made a copy-paste error because what you have aren't valid Python strings. – ddejohn Oct 19 '21 at 04:32
  • @ddejohn I rephrased the question; could you please check it again? Thank you. – swarna Oct 19 '21 at 04:43

2 Answers


Why was your data in this format to begin with? What you had were strings, not lists of strings:

In [2]: for item in df["Fruits"]:
   ...:     print(type(item), item)
   ...:
<class 'str'> ['banana']
<class 'str'> ['apple','grapes(imported,us)','mango']
<class 'str'> ['apple']
<class 'str'> ['mango','grapes(imported,US)','pears(imported,NZ)']
<class 'str'> ['mango']

So you can use ast.literal_eval() to convert these strings to lists of strings, and then use list.extend() to obtain a flattened list of all the items:

In [3]: import ast

In [4]: fruits = []
   ...: for item in df["Fruits"]:
   ...:     fruits.extend(ast.literal_eval(item))
   ...:

In [5]: fruits
Out[5]:
['banana',
 'apple',
 'grapes(imported,us)',
 'mango',
 'apple',
 'mango',
 'grapes(imported,US)',
 'pears(imported,NZ)',
 'mango']
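If it is unclear whether each row holds a real list or a string that only looks like one, the same idea can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the name `flatten_column` is hypothetical, and the plain list `rows` stands in for `df["Fruits"]`, on the assumption that iterating a Series yields its values in the same way.

```python
import ast


def flatten_column(values):
    """Flatten an iterable whose items are lists or string representations of lists."""
    flat = []
    for item in values:
        # Parse only when the element is a string such as "['apple','mango']";
        # real lists are used as they are.
        parsed = ast.literal_eval(item) if isinstance(item, str) else item
        flat.extend(parsed)
    return flat


# Hypothetical sample data standing in for df["Fruits"]: two string rows, one real list.
rows = ["['banana']", "['apple','grapes(imported,us)','mango']", ['apple']]
fruits = flatten_column(rows)
print(fruits)
```

With a DataFrame at hand, the call would presumably be `flatten_column(df["Fruits"])`. Note that `ast.literal_eval` raises an exception on strings that are not valid Python literals, so malformed rows would need separate handling.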
ddejohn

Using and summarizing the following answer: How to make a flat list of lists.

Setup:

import pandas as pd

df = pd.DataFrame({"Fruits": [['banana'], ['apple','grapes(imported,us)','mango'], ['apple'], ['mango','grapes(imported,US)','pears(imported,NZ)'], ['mango']]})

I suggest two options:

Option 1, for small data sets (around 10 elements):

sum(df['Fruits'], [])

Option 2, for larger data sets (more than about 10 elements):

from itertools import chain
list(chain.from_iterable(df['Fruits']))

Output:

 ['banana',
 'apple',
 'grapes(imported,us)',
 'mango',
 'apple',
 'mango',
 'grapes(imported,US)',
 'pears(imported,NZ)',
 'mango']
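The two options can be compared side by side. The sketch below is an illustration only: `nested` is a hypothetical plain list of lists standing in for `df['Fruits']`. `sum(..., [])` builds a new list at every step, so its cost grows roughly quadratically with the number of rows, while `chain.from_iterable` returns a lazy iterator that has to be wrapped in `list()` to get the flat list.

```python
from itertools import chain

# Hypothetical sample data standing in for df['Fruits'] (real lists, not strings).
nested = [['banana'], ['apple', 'grapes(imported,us)', 'mango'], ['apple']]

# Option 1: repeated list concatenation; fine for a handful of rows.
via_sum = sum(nested, [])

# Option 2: chain.from_iterable is lazy, so materialize it with list().
lazy = chain.from_iterable(nested)
via_chain = list(lazy)

print(via_sum == via_chain)
print(via_chain)
```

Both approaches assume the rows already contain lists; rows stored as strings would have to be parsed first.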
Eran Yogev
  • This would work if OP had lists of strings in their dataframe, but from the way it sounded, they actually had literal strings that looked like lists of strings. – ddejohn Oct 19 '21 at 05:03