Environment: JDK 1.7 (jdk1.7.0_67-cloudera), CDH 5.8.0.
The code, run in the pyspark shell (where sqlContext is predefined), is:
from pyspark.ml.feature import PCA
from pyspark.mllib.linalg import Vectors
data = [(Vectors.sparse(5, [(1, 1.0), (3, 7.0)]),),
        (Vectors.dense([2.0, 0.0, 3.0, 4.0, 5.0]),),
        (Vectors.dense([4.0, 0.0, 0.0, 6.0, 7.0]),)]
df = sqlContext.createDataFrame(data, ["features"])
pca = PCA(k=2, inputCol="features", outputCol="pca_features")
model = pca.fit(df)
The error output is:
[Stage 2:> (0 + 1) / 2]
/usr/java/jdk1.7.0_67-cloudera/bin/java: symbol lookup error: /tmp/jniloader73074 80764352992550netlib-native_system-linux-x86_64.so: undefined symbol: cblas_daxpy
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 47504)
Traceback (most recent call last):
File "/usr/lib64/python2.7/SocketServer.py", line 295, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib64/python2.7/SocketServer.py", line 321, in process_request
self.finish_request(request, client_address)
File "/usr/lib64/python2.7/SocketServer.py", line 334, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib64/python2.7/SocketServer.py", line 649, in __init__
self.handle()
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/pyspark/accumulators.py", line 235, in handle
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
self.socket.connect((self.address, self.port))
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/pyspark/ml/pipeline.py", line 69, in fit
num_updates = read_int(self.rfile)
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/pyspark/serializers.py", line 545, in read_int
return self._fit(dataset)
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/pyspark/ml/wrapper.py", line 133, in _fit
java_model = self._fit_java(dataset)
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/pyspark/ml/wrapper.py", line 130, in _fit_java
return self._java_obj.fit(dataset._jdf)
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 811, in __call__
raise EOFError
EOFError
----------------------------------------
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 631, in send_command
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 624, in send_command
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 579, in _get_connection
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 585, in _create_connection
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 697, in start
py4j.protocol.Py4JNetworkError: An error occurred while trying to connect to the Java server
>>> ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
self.socket.connect((self.address, self.port))
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/pyspark/context.py", line 224, in signal_handler
self.cancelAllJobs()
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/pyspark/context.py", line 909, in cancelAllJobs
self._jsc.sc().cancelAllJobs()
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 811, in __call__
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 624, in send_command
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 579, in _get_connection
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 585, in _create_connection
File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 697, in start
py4j.protocol.Py4JNetworkError: An error occurred while trying to connect to the Java server
My understanding of the issue: the Python SparkContext can no longer connect to the Py4J gateway (the Java side of the SparkContext) because the Java server behind it went down, and that was caused by
symbol lookup error: /tmp/jniloader73074 80764352992550netlib-native_system-linux-x86_64.so: undefined symbol: cblas_daxpy
Once the Java server is gone, every later call from Python fails with [Errno 111] Connection refused, which also explains the failed request from ('127.0.0.1', 47504).
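One way to narrow this down is to check, on the host where the driver runs, whether any system BLAS library actually exports cblas_daxpy. The following is only a minimal sketch under stated assumptions: has_symbol is a hypothetical helper written for this check, the library names it probes (blas, cblas, openblas, atlas) are guesses about what might be installed, and ctypes.util.find_library does not necessarily search the same paths that the JVM's native loader uses.

```python
import ctypes
import ctypes.util


def has_symbol(library_name, symbol):
    """Return True if the named shared library loads and exports `symbol`.

    `library_name` is the short name without the "lib" prefix or suffix,
    e.g. "blas" for libblas.so.
    """
    path = ctypes.util.find_library(library_name)
    if path is None:
        return False  # the library was not found on this host
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return False  # found, but could not be loaded
    # Attribute access on a CDLL looks the symbol up in the library and
    # raises AttributeError if it is undefined; hasattr turns that into False.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Candidate names are assumptions; adjust them to the BLAS packages in use.
    for name in ("blas", "cblas", "openblas", "atlas"):
        print(name, has_symbol(name, "cblas_daxpy"))
```

If every candidate prints False, that would be consistent with the symbol lookup error above: a BLAS shared library is being picked up, but it does not provide the CBLAS interface that the native netlib wrapper expects.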
Further evidence is in the executor log, which shows:
CoarseGrainedExecutorBackend: An unknown (executor_IP:executor_port) driver disconnected
CoarseGrainedExecutorBackend: Driver (executor_IP:executor_port) disassociated! Shutting down
This suggests that the executors lost their connection to the driver as well.
Output of yarn logs -applicationId application_xxxxxxxxx_xxxxxx:
Container: container_e37_1484199111776_8460_01_000001 on node_xxxxx
LogType:stderr
Log Upload Time:Mon Feb 20 11:18:07 +1300 2017
LogLength:94
Log Contents:
17/02/20 11:18:05 WARN yarn.YarnAllocator: Expected to find pending requests, but found none.
LogType:stdout
Log Upload Time:Mon Feb 20 11:18:07 +1300 2017
LogLength:0
Log Contents:
Container: container_e37_1484199111776_8460_01_000002 on node_xxxxx_2
LogType:stderr
Log Upload Time:Mon Feb 20 11:18:07 +1300 2017
LogLength:250
Log Contents:
17/02/20 11:18:06 WARN executor.CoarseGrainedExecutorBackend: An unknown (driver IP:PORT) driver disconnected
LogType:stdout
Log Upload Time:Mon Feb 20 11:18:07 +1300 2017
LogLength:0
Log Contents:
Any idea why cblas_daxpy cannot be resolved, and how to fix it?