I am trying to import a module that lives in another repo in Databricks, but the Spark UDF cannot find it. Importing the module normally works; it only fails inside the PySpark UDF.
I have referenced this Stack Overflow post, but our team works on shared clusters and I do not want to change the cluster environment. The other method we use is to build an egg file, but that process is not conducive to quick iteration and testing, especially on a shared cluster.
The error:
PythonException: An exception was thrown from a UDF: 'pyspark.serializers.SerializationError: Caused by Traceback (most recent call last):
File "/databricks/spark/python/pyspark/serializers.py", line 165, in _read_with_length
return self.loads(obj)
File "/databricks/spark/python/pyspark/serializers.py", line 466, in loads
return pickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'test'. Full traceback below:
Traceback (most recent call last):
File "/databricks/spark/python/pyspark/serializers.py", line 165, in _read_with_length
return self.loads(obj)
File "/databricks/spark/python/pyspark/serializers.py", line 466, in loads
return pickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'test'
During handling of the above exception, another exception occurred:
pyspark.serializers.SerializationError: Caused by Traceback (most recent call last):
File "/databricks/spark/python/pyspark/serializers.py", line 165, in _read_with_length
return self.loads(obj)
File "/databricks/spark/python/pyspark/serializers.py", line 466, in loads
return pickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'test'
I believe the issue originates in the withColumn call that applies the UDF.
The import itself works with no issues:
from test import pyspark_utils
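For what it's worth, my understanding of the traceback is that a function imported from a module is pickled by reference (module name plus function name), so whichever process unpickles it has to be able to import that module itself. Below is a minimal sketch that reproduces the same ModuleNotFoundError without Spark. It uses the standard pickle module, whereas PySpark uses cloudpickle, which as far as I know treats functions from importable modules the same way. The module name repo_utils_demo and the function add_one are made up for this sketch, and the "driver"/"worker" split is only simulated by changing sys.path.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Hypothetical stand-in for the repo module; the name is made up for this sketch.
MODULE_NAME = "repo_utils_demo"

with tempfile.TemporaryDirectory() as repo_dir:
    # Write a tiny module to disk, standing in for a file in the repo.
    with open(os.path.join(repo_dir, MODULE_NAME + ".py"), "w") as f:
        f.write("def add_one(x):\n    return x + 1\n")

    # Simulated driver: the repo directory is on sys.path, so the import works.
    sys.path.insert(0, repo_dir)
    importlib.invalidate_caches()
    module = importlib.import_module(MODULE_NAME)

    # The pickle holds only a reference (module name + function name), not the code.
    payload = pickle.dumps(module.add_one)

    # Simulated worker: same bytes, but the module is no longer importable.
    sys.path.remove(repo_dir)
    del sys.modules[MODULE_NAME]
    importlib.invalidate_caches()

    try:
        pickle.loads(payload)
        worker_error = None
    except ModuleNotFoundError as exc:
        worker_error = str(exc)

print(worker_error)
```

If that reading is right, the import succeeding in the notebook says nothing about whether the executors' Python processes can resolve the same module path.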
Basically, I would like to know whether it is possible to import custom modules from files in Databricks Repos so that PySpark UDFs can use them, without building a wheel or egg file and without modifying the cluster in ways that may cause conflicts. Thanks for any help or information!