I need to interrupt the program and raise the exception below if two conditions are both met; otherwise the program should continue. This works fine with only the first condition, but yields an error when I use both. The code below should raise the exception if DF is non-empty and the value of DF.col1 is not 'string'. Any tips to get this working?
if (DF.count() > 0) & (DF.col1 != 'string'):
    raise Exception("!!!COUNT IS NON-ZERO, SO ADJUSTMENT IS NEEDED!!!")
else:
    pass
This throws the error:
Py4JError: An error occurred while calling o678.and. Trace:
py4j.Py4JException: Method and([class java.lang.Integer]) does not exist
Some sample data:
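from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

# Reuse the active session if there is one (e.g. in a notebook), else start one
spark = SparkSession.builder.getOrCreate()

# One row whose col1 value is not 'string'
data2 = [("not_string", "test")]
schema = StructType([
    StructField("col1", StringType(), True),
    StructField("col2", StringType(), True),
])
DF = spark.createDataFrame(data=data2, schema=schema)
DF.printSchema()
DF.show(truncate=False)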
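
For reference, here is a minimal sketch of one way the check might be written. The error appears to come from applying `&` between a plain Python value (the result of `DF.count() > 0`) and a Spark Column expression: `DF.col1 != 'string'` is a Column, not a Python bool, so `if` cannot use it directly. The sketch assumes the intent is "raise if at least one row has a col1 value other than 'string'"; if the intent is different (for example, only the first row matters), the filter would need adjusting. The helper names `needs_adjustment` and `check_adjustment` are hypothetical and not part of PySpark.

```python
# Hypothetical helpers (not part of PySpark); they work on any PySpark DataFrame.

def needs_adjustment(df, column="col1", expected="string"):
    """Return True if df has at least one row whose `column` differs from `expected`."""
    # df[column] != expected builds a Spark Column expression, not a Python bool,
    # so Spark has to evaluate it (filter + an action) before `if` can use the result.
    mismatches = df.filter(df[column] != expected)
    # count() is the action that returns a plain Python int; limit(1) keeps Spark
    # from having to count every mismatching row.
    # Note: rows where the column is null do not satisfy `!=` and are not counted.
    return mismatches.limit(1).count() > 0


def check_adjustment(df):
    """Raise if df needs adjustment; otherwise return and let the program continue."""
    # Assumption: "any mismatching row" is the intended condition.
    if needs_adjustment(df):
        raise Exception("!!!COUNT IS NON-ZERO, SO ADJUSTMENT IS NEEDED!!!")
```

With the sample DF above, `check_adjustment(DF)` would raise, since the only row has col1 equal to 'not_string'. A separate `DF.count() > 0` test is not needed under this reading, because a DataFrame with at least one mismatching row is necessarily non-empty. If both checks are wanted explicitly, they can be combined with Python's `and` once each side is a plain bool.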