
I am using Databricks on MS Azure and get this warning every time I convert a pandas DataFrame to a PySpark DataFrame:

/databricks/spark/python/pyspark/sql/pandas/conversion.py:539: FutureWarning: iteritems is deprecated and will be removed in a future version. Use .items instead.
  arrow_data = [[(c, t) for (_, c), t in zip(pdf_slice.iteritems(), arrow_types)]

The code I am using is:

df_spk = spark.createDataFrame(df_pd)

Since I cannot find an alternative to the line above, I cannot use a newer version of Python. Does anyone have any ideas?

Thank you in advance, T

Pratik Lad
Tanjil
  • Is there a newer version of pyspark and/or pandas you can use, that maybe fixes this issue? – John Gordon Apr 09 '23 at 22:38
  • it's just a warning, that will be fixed soon: https://stackoverflow.com/questions/75926636/databricks-issue-while-creating-spark-data-frame-from-pandas/75926954#75926954 – Alex Ott Apr 11 '23 at 12:52

1 Answer


The message you are seeing is a FutureWarning emitted by pandas,
indicating that the iteritems() method is deprecated and will be removed in a future version.
It is triggered inside PySpark's pandas-to-Spark conversion code (pyspark/sql/pandas/conversion.py), not by your own code.

To suppress this kind of FutureWarning,
put the following snippet at the beginning of your script:

import warnings
warnings.filterwarnings("ignore", category=FutureWarning)


The snippet above ignores every FutureWarning raised in your code.
If you want to ignore only the FutureWarning about iteritems(),
filter on the message text instead:

import warnings
warnings.filterwarnings("ignore", message="iteritems is deprecated")
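
If you would rather not change the warning filters for the whole notebook, the same filter can be limited to the conversion call. The following is only a sketch: `suppress_iteritems_warning` is a hypothetical helper name, not part of PySpark or pandas, and it relies on the standard-library `warnings.catch_warnings()` context manager, which restores the previous filters when the block exits.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def suppress_iteritems_warning():
    """Hypothetical helper: silence only the iteritems FutureWarning in a with-block."""
    with warnings.catch_warnings():
        # `message` is a regular expression matched against the start of the
        # warning text, so this leaves every other FutureWarning visible.
        warnings.filterwarnings(
            "ignore",
            message="iteritems is deprecated",
            category=FutureWarning,
        )
        yield


# Usage (assumes `spark` and `df_pd` exist, as in the question):
#
#     with suppress_iteritems_warning():
#         df_spk = spark.createDataFrame(df_pd)
```

Outside the `with` block the original filters apply again, so the warning would still show up if some other code path triggers it.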


Also consider upgrading to a newer version of PySpark that calls .items() instead of iteritems(), which removes the warning at its source. Upgrading pandas alone will not help, because the iteritems() call is in PySpark's code.
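
To judge whether an upgrade is needed, you can compare the installed versions against the releases where the behaviour changed. This is a rough sketch under stated assumptions: as far as I know, pandas deprecated iteritems() in 1.5.0 and removed it in 2.0.0, and PySpark switched to .items() in 3.4.0. Please confirm these thresholds against the release notes, since a Databricks runtime may carry its own patches. The names `parse_version` and `iteritems_status` are made up for this example.

```python
def parse_version(version: str) -> tuple:
    """Turn '3.3.2' or '2.0.0rc1' into a 3-tuple of ints such as (3, 3, 2)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '3.4' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def iteritems_status(pandas_version: str, pyspark_version: str) -> str:
    """Classify a version pair with respect to the iteritems() call only."""
    pandas_v = parse_version(pandas_version)
    pyspark_v = parse_version(pyspark_version)
    if pyspark_v >= (3, 4, 0):   # assumed: PySpark calls .items() from here on
        return "fixed"
    if pandas_v >= (2, 0, 0):    # assumed: iteritems() no longer exists
        return "error"
    if pandas_v >= (1, 5, 0):    # assumed: iteritems() emits the FutureWarning
        return "warning"
    return "unaffected"


# Usage in a notebook (both packages expose __version__):
#
#     import pandas, pyspark
#     print(iteritems_status(pandas.__version__, pyspark.__version__))
```

A result of "warning" matches the situation in the question, where suppressing the message is harmless; "error" would mean the conversion fails outright until PySpark is upgraded or pandas is pinned below 2.0.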

B. B. Naga Sai Vamsi