I'm running the following code to convert a PySpark DataFrame into a pandas DataFrame:
dt = '2022-03-22'
sample_df = spark.sql(f'''select * from orders where order_date = '{dt}' limit 10''')
sample_df.toPandas()
but it throws the following error:
File ~/conda/envs/custom_env/lib/python3.9/site-packages/pandas/_libs/tslibs/timezones.pyx:134, in pandas._libs.tslibs.timezones.maybe_get_tz()
File ~/conda/envs/custom_env/lib/python3.9/site-packages/pytz/__init__.py:188, in timezone(zone)
186 fp.close()
187 else:
--> 188 raise UnknownTimeZoneError(zone)
190 return _tzinfo_cache[zone]
UnknownTimeZoneError: 'IST'
Can anyone please explain what is going on here, or suggest a fix?
The query itself works: I can see the result when I don't convert the DataFrame to pandas, for example with sample_df.show().
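One guess, which I have not confirmed: the time zone name reaching pandas is the abbreviation 'IST', and the traceback shows that pytz does not recognize it. Below is a minimal sketch of normalizing such a name to a tz database name first. The helper to_iana_zone and its mapping are hypothetical, and reading 'IST' as India Standard Time ('Asia/Kolkata') is an assumption, since the abbreviation is ambiguous.

```python
# Hypothetical mapping from ambiguous abbreviations to tz database names.
# ASSUMPTION: "IST" means India Standard Time here; it is also used for
# Irish and Israel time, so adjust the entry for your own environment.
ABBREVIATION_TO_IANA = {
    "IST": "Asia/Kolkata",
}


def to_iana_zone(name: str) -> str:
    """Return the mapped tz database name for a known abbreviation.

    Names that are not in the mapping are returned unchanged (after
    stripping surrounding whitespace), so valid names pass through.
    """
    cleaned = name.strip()
    return ABBREVIATION_TO_IANA.get(cleaned, cleaned)


if __name__ == "__main__":
    print(to_iana_zone("IST"))
    print(to_iana_zone("UTC"))
```

If the guess is right, calling spark.conf.set('spark.sql.session.timeZone', to_iana_zone('IST')) before sample_df.toPandas() might avoid the error. spark.sql.session.timeZone is the standard Spark SQL session time zone setting, but whether it is where 'IST' comes from in this environment is unverified.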