I want to plot a dataframe where the y values are stored as ndarrays within a column, i.e.:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame(index=np.arange(0,4), columns=('sample','class','values'))
for iloc in [0,2]:
    df.loc[iloc] = {'sample':iloc, 
                    'class':'raw', 
                    'values':np.random.random(5)}
    df.loc[iloc+1] = {'sample':iloc,
                      'class':'predict',
                      'values':np.random.random(5)}

grid = sns.FacetGrid(df, col="class", row="sample")
grid.map(plt.plot, np.arange(0,5), "value")

TypeError: unhashable type: 'numpy.ndarray'

Do I need to break out the ndarrays into separate rows? Is there a simple way to do this?

Thanks

Tom

1 Answer

This is quite an unusual way of storing data in a dataframe. Two options (I'd recommend option B):

A. Custom mapping in seaborn

Indeed, seaborn does not support this format natively. You can, however, write your own function that plots to the grid.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame(index=np.arange(0,4), columns=('sample','class','values'))
for iloc in [0,2]:
    df.loc[iloc] = {'sample':iloc, 
                    'class':'raw', 
                    'values':np.random.random(5)}
    df.loc[iloc+1] = {'sample':iloc,
                      'class':'predict',
                      'values':np.random.random(5)}

grid = sns.FacetGrid(df, col="class", row="sample")

def plot(*args, **kwargs):
    # args[0] is the "values" column for one facet: a Series holding a single ndarray
    plt.plot(args[0].iloc[0], **kwargs)

grid.map(plot, "values")
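
If a facet could contain more than one array, or you want explicit x values, the custom function can be written a little more defensively. This is only a sketch: the name `plot_arrays` is hypothetical, and it assumes seaborn hands the function the facet's "values" column as a Series, plus `color` and `label` keyword arguments.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_arrays(values, **kwargs):
    """Plot every array in `values` (an iterable of arrays) on the current Axes."""
    for y in values:
        y = np.asarray(y, dtype=float)
        x = np.arange(len(y))  # explicit x: position within the array
        plt.plot(x, y, **kwargs)

# Usage with the grid defined above would be:
# grid.map(plot_arrays, "values")
```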

B. Unnesting

However, I would advise "unnesting" the dataframe first to get rid of the numpy arrays inside the cells.

The question "pandas: When cell contents are lists, create a row for each element in the list" shows a way to do that.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame(index=np.arange(0,4), columns=('sample','class','values'))
for iloc in [0,2]:
    df.loc[iloc] = {'sample':iloc, 
                    'class':'raw', 
                    'values':np.random.random(5)}
    df.loc[iloc+1] = {'sample':iloc,
                      'class':'predict',
                      'values':np.random.random(5)}

res = df.set_index(["sample", "class"])["values"].apply(pd.Series).stack().reset_index()
res.columns = ["sample", "class", "original_index", "values"]

Then use the FacetGrid in the usual way.

grid = sns.FacetGrid(res, col="class", row="sample")
grid.map(plt.plot, "original_index", "values")
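
On pandas 0.25 or later, `DataFrame.explode` may be a shorter route to the same long format. A minimal sketch, assuming the same column names as above; the seeded random generator and the `cumcount` step that rebuilds `original_index` are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Same layout as in the question: one ndarray per cell in "values".
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sample": [0, 0, 2, 2],
    "class": ["raw", "predict", "raw", "predict"],
    "values": [rng.random(5) for _ in range(4)],
})

# One row per array element; the other columns are repeated.
res = df.explode("values").reset_index(drop=True)

# Position of each element within its original array.
res["original_index"] = res.groupby(["sample", "class"]).cumcount()

# explode leaves the column with object dtype; convert it for plotting.
res["values"] = res["values"].astype(float)
```

The resulting `res` can be passed to `sns.FacetGrid` in the same way as above.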
ImportanceOfBeingErnest
  • Thank you, great answer. I was working on the custom mapping function but realised the un-nesting approach was probably better. The apply/stack solution is neat. – Tom Feb 01 '19 at 18:14