I am trying to use the mmlspark package in PySpark and am not able to import the module.
My Jupyter notebook is connected to the cluster, and I have included the package details in my SparkSession as shown below. In the Spark UI for the cluster I can see the jars listed under spark.yarn.dist.jars. But when I import mmlspark inside the notebook, I get a "package not found" message. Is there something I am missing? Thanks.
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession

conf = (SparkConf()
        .setAppName("dataPipeline")
        .set("spark.jars.packages", "Azure:mmlspark:0.13")
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.dynamicAllocation.enabled", "False")
        .set("spark.executor.memory", "8g")
        .set("spark.driver.memory", "4g"))

spark = (SparkSession.builder
         .master("yarn")
         .config(conf=conf)
         .enableHiveSupport()
         .getOrCreate())
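
For reference, this is roughly what I then run in a notebook cell after the session above is created (the error wording is paraphrased from memory; mmlspark is the import name the package docs use):

# run in a notebook cell attached to the SparkSession created above
import mmlspark  # fails here with a "package/module not found" message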