I have been testing how pandas DataFrames handle stored objects, to learn the limitations. One thing I am currently testing is storing the result of a plot in a DataFrame column, but I cannot figure out how to reference it properly afterwards.
Object Definition:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
class point:
    def __init__(self, coordinate):
        self.x = coordinate[0]
        self.y = coordinate[1]

class series:
    def __init__(self, series):
        self.series = np.asarray(series)

    def plot(self):
        fig, ax = plt.subplots(figsize=(5,5))
        z = ax.scatter(self.series[:,0], self.series[:,1])
        return z
Test Dataframe:
col_index = [i for i in range(1)]
col_coordinates = [[[i, i**2] for i in range(100)]]
df = pd.DataFrame({"i": col_index, "coordinates": col_coordinates})
Application:
df['series'] = df.apply(lambda x: series(x['coordinates']), axis=1)
df['plot'] = df['series'].apply(lambda x: x.plot())
df.head()
Attempting to Reference the Plot Later:
fig,ax = plt.subplots(figsize=(5,5))
ax = df['plot'].iloc[0]
The plot renders properly when the DataFrame column is filled in, but it is not retained when I try to reference it later. I am probably saving the wrong thing, but I am not sure what the right thing is.
A secondary question: can the plot itself be saved into the column without first being rendered?
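For reference, here is a minimal sketch of the direction I suspect is closer, under two assumptions of mine: that the thing worth storing is the Figure itself rather than the return value of `ax.scatter`, and that building the Figure directly from `matplotlib.figure.Figure` (instead of through `plt.subplots`) keeps it out of pyplot so it is not shown on creation. `make_figure` is a hypothetical helper name, not something from the code above.

```python
import io

import numpy as np
import pandas as pd
from matplotlib.figure import Figure


def make_figure(coordinates):
    """Hypothetical helper: build a scatter plot and return the Figure."""
    points = np.asarray(coordinates)
    # Assumption: a Figure created directly is not registered with pyplot,
    # so nothing is displayed at creation time.
    fig = Figure(figsize=(5, 5))
    ax = fig.subplots()
    ax.scatter(points[:, 0], points[:, 1])
    return fig  # return the Figure, not the scatter artist


df = pd.DataFrame({"i": [0], "coordinates": [[[i, i**2] for i in range(100)]]})
df["plot"] = df["coordinates"].apply(make_figure)

# Retrieve the stored Figure later and render it on demand,
# here into an in-memory PNG buffer.
fig = df["plot"].iloc[0]
buf = io.BytesIO()
fig.savefig(buf, format="png")
```

With this, `df['plot'].iloc[0]` hands back the stored Figure object, whereas `ax = df['plot'].iloc[0]` in my attempt above only rebinds the name `ax` and leaves the freshly created axes empty.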