I want to fetch the data of the column "Examples" with respect to the column "Category".
Output:
Fruits [Apple,Mango,Orange,Mosambi]
Veggies [Tomato,Onion,Brinjal]
Readymade [Maggi,Ab,Mixes,Foh]
If I understood correctly, you want to flatten each cell of the Examples column, where a cell is a list that contains a few lists.
One way is to use numpy's concatenate together with ravel. Assuming your dataframe is df:
import numpy as np
df["Examples"] = df["Examples"].apply(lambda x: np.concatenate(x).ravel())
Category Examples
0 Fruits [Apple, Mango, Orange, Mosambi]
1 Veggies [Tomato, Onion, Brinjal]
2 Readymade [Maggi, Ab, Mixes, Foh]
Edit:
As per the comment, some elements in the Examples column are not lists of lists (my assumption above was wrong). That can be handled by adding a check like the one below:
df["Examples"].apply(lambda x: np.concatenate(x).ravel() if all(isinstance(i, list) for i in x) and len(x)!=0 else x)
Note: Many combinations are possible here. The check above flattens a cell only when it is non-empty and every entry in it is a list; otherwise the cell is returned unchanged, which covers plain lists of strings and empty lists. If a cell mixes lists with values of some other data type, it is also returned unchanged, and the result may not be what you want.
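To make that caveat concrete, here is a minimal sketch of a stricter helper that does not use numpy. The name flatten_one_level is hypothetical (it is not part of pandas or numpy), and the sketch assumes each cell is a list whose items are either lists/tuples or scalar values. It flattens one level only and keeps non-list items as they are.

```python
import pandas as pd


def flatten_one_level(cell):
    """Flatten one level of nesting; items that are not lists/tuples are kept as-is."""
    flat = []
    for item in cell:
        if isinstance(item, (list, tuple)):
            flat.extend(item)   # unpack the inner list
        else:
            flat.append(item)   # scalar (string, number, ...): keep it
    return flat


# Illustrative data, including a mixed cell and an empty cell
df = pd.DataFrame(
    [
        ["Fruits", [["Apple", "Mango"], ["Orange", "Mosambi"]]],
        ["Mixed", [["a"], "b", 3]],
        ["Empty", []],
    ],
    columns=["Category", "Examples"],
)
df["Examples"] = df["Examples"].apply(flatten_one_level)
```

Unlike the one-line check, this handles cells that mix lists and plain values, at the cost of a Python-level loop per cell.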
Demo:
(As per the next comment, if a comma-separated string is preferred instead of a list, applying df["Examples"].apply(lambda x: ','.join(x)) would do that.)
import numpy as np
import pandas as pd

df = pd.DataFrame([
["Fruits", [["Apple","Mango"],["Orange","Mosambi"]]],
["Veggies", [["Tomato","Onion"],["Brinjal"]]],
["Readymade", [["Maggi","Ab"],["Mixes","Foh"]]],
["Test", ["a", "b"]],
["Testt", []]
],
columns=["Category", "Examples"]
)
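# Assign the result back; otherwise the join below runs on the nested lists and fails
df["Examples"] = df["Examples"].apply(lambda x: np.concatenate(x).ravel() if all(isinstance(i, list) for i in x) and len(x)!=0 else x)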
df["Examples"] = df["Examples"].apply(lambda x: ','.join(x))
df
Category Examples
0 Fruits Apple,Mango,Orange,Mosambi
1 Veggies Tomato,Onion,Brinjal
2 Readymade Maggi,Ab,Mixes,Foh
3 Test a,b
4 Testt
You can use .flatten():
df['examples_flat'] = df['examples'].apply(lambda x: np.asarray(x).flatten())
[UPDATE]
To get the values as a single comma-separated string instead of a list, use ", ".join():
df['examples_flat'] = df['examples'].apply(lambda x: ", ".join(np.asarray(x).flatten()))
category examples examples_flat
0 fruits []
1 veggies [[e, f], [g, h]] e, f, g, h
2 readymade [[i, j], [k, l]] i, j, k, l
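One caveat with np.asarray(x).flatten(): it relies on the sublists forming a rectangular array. When the sublists have different lengths (such as [["Tomato","Onion"],["Brinjal"]] from the question), recent NumPy versions raise an error instead of building the array. A minimal sketch that avoids this by using itertools.chain in place of NumPy follows; the column names match the output above, and the data is illustrative only.

```python
from itertools import chain

import pandas as pd

# Illustrative data: an empty cell, a ragged cell, and a rectangular cell
df = pd.DataFrame(
    {
        "category": ["fruits", "veggies", "readymade"],
        "examples": [[], [["e", "f"], ["g"]], [["i", "j"], ["k", "l"]]],
    }
)

# chain.from_iterable walks the sublists one after another, whatever their lengths
df["examples_flat"] = df["examples"].apply(
    lambda x: ", ".join(chain.from_iterable(x))
)
```

This assumes every cell is a list of lists of strings; a cell holding plain strings would be split into characters, so it would need the same kind of check as in the first answer.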