
Problem: I am using the ydata_profiling library to generate a profile report on my dataset and save it as an HTML file. Creating the report appears to succeed, but when I try to save it as an HTML file I get the following error:

File "/Users/jt/.local/lib/python3.10/site-packages/ydata_profiling/profile_report.py", line 253, in description_set
    self._description_set = describe_df(
File "/Users/jt/.local/lib/python3.10/site-packages/ydata_profiling/model/describe.py", line 57, in describe
    raise ValueError("Can not describe a lazy ProfileReport without a DataFrame.")

Relevant Code:

import pandas as pd
from ydata_profiling import ProfileReport  # https://pypi.org/project/ydata-profiling/
#1. get data and mask (indicating which features to analyse)
self._data_df = pd.read_csv(self.inputfname_csv, header=0)
self._mask_df = pd.read_csv(self.inputmask_csv, header=0)
#2. drop unwanted cols
self._selected_columns = self._mask_df.columns[self._mask_df.isin([MASK_ON]).any()]
self._selected_columns = self._selected_columns.values.tolist()
self._data_df = self._data_df.drop(columns=[col for col in self._data_df if col not in self._selected_columns], inplace=True)
#3. generate report
self._data_report_df = ProfileReport(self._data_df, title="Pandas Profiling Report")
#4. save report in html format
self._data_report_df.to_file(self.outputfname_html) # exception raised here!!!

Environment:

Apple MacBook Pro M1 chip
macOS==Ventura 13.4.1 (c)
Visual Studio Code Version 1.72.2
python==3.10.12
conda                         23.7.2
joblib                        1.3.0
matplotlib                    3.7.1
matplotlib-inline             0.1.6
modin                         0.23.0
numba                         0.57.0
numpy                         1.23.5
pandas                        2.0.3
pandas-profiling              3.6.6
scikit-learn                  1.3.0
scipy                         1.11.1
seaborn                       0.12.2
ydata-profiling               4.5.0
zstandard                     0.19.0

Thanks for your time; any help is much appreciated.

I have attempted possible solutions detailed in the following websites:

JonT
  • Does behavior change when you delete `, inplace=True`? (Typically you want to avoid that kwarg -- better to just create a new DF which shares references to some elements with the old one.) Also, I assume the report works fine with the raw (unmasked) DF you loaded from `inputfname_csv`? – J_H Aug 11 '23 at 14:40
  • Yes on both counts. Removing `inplace=True` generated the report AND enabled me to store it to an html file!! Yay! Many thanks J_H for your solution! :-) – JonT Aug 12 '23 at 10:18

1 Answer


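Remove that `, inplace=True` keyword. With `inplace=True`, `drop()` modifies the DataFrame and returns `None`, so assigning the result back leaves `self._data_df` set to `None`. `ProfileReport` then has no DataFrame to describe, which is what the error message is telling you. The kwarg is not doing you any favors, and it leaves you with a more tangled nest of references in the result object.

Typically you want to avoid that kwarg -- better to just create a new DF which shares references to some elements with the old one.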

I have only encountered the dreaded SettingWithCopyWarning a few times, but that was a few too many and I changed my habits. Please refer to Inplace Considered Harmful.
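To make the difference concrete, here is a minimal sketch using a small made-up DataFrame. The column names and the `selected_columns` list are stand-ins for illustration, not the asker's real data. It only shows what `drop()` returns in each case and does not call `ProfileReport` at all.

```python
import pandas as pd

# Hypothetical stand-in data; column names are illustrative only.
data_df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
selected_columns = ["a", "c"]
unwanted = [col for col in data_df.columns if col not in selected_columns]

# With inplace=True, drop() mutates the frame it is called on and returns None,
# so assigning the result binds the name to None instead of a DataFrame.
broken = data_df.copy().drop(columns=unwanted, inplace=True)

# Without inplace, drop() returns a new DataFrame that is safe to assign.
fixed = data_df.drop(columns=unwanted)

# An alternative that avoids drop() entirely: select the wanted columns.
also_fixed = data_df[selected_columns]
```

In the question's code, the equivalent change would be assigning the result of `drop(columns=...)` without `inplace=True`, so that `self._data_df` is still a DataFrame when it is passed to `ProfileReport`.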

J_H