
I have installed Spark 2.1 from Cloudera. When I launch spark-shell from /usr/bin/spark2-shell it runs fine (with Scala). But when I launch PySpark with

sudo -u hdfs ./pyspark2

I get:

java.sql.SQLException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool. Original Exception: ------
java.sql.SQLException: Failed to create database 'metastore_db', see the next exception for details.
......
Caused by: ERROR XBM0H: Directory /usr/bin/metastore_db cannot be created.
Caused by: java.sql.SQLException: Failed to create database 'metastore_db', see the next exception for details
.....
Caused by: ERROR XJ041: Failed to create database 'metastore_db', see the next exception for details.
        at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
        at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
        ... 105 more
Caused by: ERROR XBM0H: Directory /usr/bin/metastore_db cannot be created.

Traceback (most recent call last):
  File "/opt/cloudera/parcels/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658/lib/spark2/python/pyspark/shell.py", line 43, in <module>
    spark = SparkSession.builder\
  File "/opt/cloudera/parcels/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658/lib/spark2/python/pyspark/sql/session.py", line 179, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File "/opt/cloudera/parcels/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/opt/cloudera/parcels/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658/lib/spark2/python/pyspark/sql/utils.py", line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"
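
For context, my understanding of what happens at startup (a sketch based on the traceback, not the exact shell.py source) is roughly this:

    # Roughly what pyspark's shell.py seems to do at launch, judging from
    # the traceback (a sketch, not the actual source). enableHiveSupport()
    # is what pulls in HiveSessionState, and the embedded Derby metastore
    # it needs gets created in the current working directory (here /usr/bin,
    # which the hdfs user cannot write to).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder \
        .enableHiveSupport() \
        .getOrCreate()  # fails: Directory /usr/bin/metastore_db cannot be created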

I think the problem happens while creating a HiveContext from PySpark. Also, how can I run PySpark without creating a HiveContext (see my guesses below)? Any help would be appreciated.
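
Two workarounds I am considering, but both are guesses: launching from a directory the hdfs user can write to (since Derby appears to use the current working directory), or switching the catalog to in-memory. The spark.sql.catalogImplementation key is an internal Spark 2.x setting I have seen mentioned, and I have not verified it against the Cloudera parcel:

    # guess 1: launch from a writable directory so Derby can create metastore_db
    cd /tmp && sudo -u hdfs /usr/bin/pyspark2

    # guess 2: skip the Hive catalog entirely (unverified config)
    sudo -u hdfs ./pyspark2 --conf spark.sql.catalogImplementation=in-memory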

Michail N

0 Answers