
I created a Docker container from this BigDL image. When I try to collect the predictions using collect(), the following error occurs: Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe. PS: the Java version is 8. This is the code:

def retrain(self, batch_size):
    # random and numpy (as np) are imported at module level
    minibatch = random.sample(self.experience_replay, batch_size)
    for state, action, reward, next_state in minibatch:
        state = np.asmatrix(state)
        next_state = np.asmatrix(next_state)
        print('state', state)
        print('next state', next_state)
        # predict() returns a distributed result in BigDL, so collect() it
        target = self.q_network.predict(state)
        p = target.collect()
        tt = self.target_network.predict(next_state)
        t = tt.collect()
        p[0][action] = reward + self.gamma * np.amax(t)
        self.q_network.fit(state, p, verbose=0)
    self.dqn_update_time -= 1
    if self.dqn_update_time == 0:
        self.dqn_update_time = 100  # dqn_time
        self.alighn_target_model()
        print('model updated')

This is the error:

    /tmp/ipykernel_1032/2958540146.py in retrain(self, batch_size)
         71             print('next state type',next_state)
         72             target = self.q_network.predict(state)
    ---> 73             p= target.collect()
         74 
         75             tt = self.target_network.predict(next_state)
    
    /opt/work/spark-3.1.2/python/lib/pyspark.zip/pyspark/rdd.py in collect(self)
        947         """
        948         with SCCallSiteSync(self.context) as css:
    --> 949             sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
        950         return list(_load_from_socket(sock_info, self._jrdd_deserializer))
        951 
    
    /usr/local/envs/bigdl/lib/python3.7/site-packages/py4j/java_gateway.py in __call__(self, *args)
       1303         answer = self.gateway_client.send_command(command)
       1304         return_value = get_return_value(
    -> 1305             answer, self.gateway_client, self.target_id, self.name)
       1306 
       1307         for temp_arg in temp_args:
    
    /opt/work/spark-3.1.2/python/lib/pyspark.zip/pyspark/sql/utils.py in deco(*a, **kw)
        109     def deco(*a, **kw):
        110         try:
    --> 111             return f(*a, **kw)
        112         except py4j.protocol.Py4JJavaError as e:
        113             converted = convert_exception(e.java_exception)
    
    /usr/local/envs/bigdl/lib/python3.7/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
        326                 raise Py4JJavaError(
        327                     "An error occurred while calling {0}{1}{2}.\n".
    --> 328                     format(target_id, ".", name), value)
        329             else:
        330                 raise Py4JError(
    
    
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 0.0 failed 1 times, most recent failure: Lost task 7.0 in stage 0.0 (TID 7) (faten-VivoBook-ASUSLaptop-X509JB-X509JB.router executor driver): com.intel.analytics.bigdl.dllib.utils.InvalidOperationException: Linear: 
 The input to the layer needs to be a vector(or a mini-batch of vectors);
 please use the Reshape module to convert multi-dimensional input into vectors
 if appropriate"
    input dim 3
    at com.intel.analytics.bigdl.dllib.utils.Log4Error$.invalidOperationError(Log4Error.scala:38)
    at com.intel.analytics.bigdl.dllib.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:291)
    at com.intel.analytics.bigdl.dllib.keras.Predictor$.$anonfun$predict$3(Predictor.scala:189)
    at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
    at org.apache.spark.api.python.SerDeUtil$AutoBatchedPickler.hasNext(SerDeUtil.scala:86)
    at scala.collection.Iterator.foreach(Iterator.scala:941)
    at scala.collection.Iterator.foreach$(Iterator.scala:941)
    at org.apache.spark.api.python.SerDeUtil$AutoBatchedPickler.foreach(SerDeUtil.scala:80)
    at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:307)
    at org.apache.spark.api.python.PythonRunner$$anon$2.writeIteratorToStream(PythonRunner.scala:621)
    at org.apache.spark.api.python.BasePythonRunner$WriterThread.$anonfun$run$1(PythonRunner.scala:397)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
    at org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:232)
Caused by: com.intel.analytics.bigdl.dllib.utils.InvalidOperationException: Linear: 
 The input to the layer needs to be a vector(or a mini-batch of vectors);
 please use the Reshape module to convert multi-dimensional input into vectors
 if appropriate"
    input dim 3
    at com.intel.analytics.bigdl.dllib.utils.Log4Error$.invalidOperationError(Log4Error.scala:38)
    at com.intel.analytics.bigdl.dllib.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:288)
    at com.intel.analytics.bigdl.dllib.nn.Sequential.updateOutput(Sequential.scala:39)
    at com.intel.analytics.bigdl.dllib.nn.internal.KerasLayer.updateOutput(KerasLayer.scala:275)
    at com.intel.analytics.bigdl.dllib.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:285)
    ... 13 more
Caused by: java.lang.IllegalArgumentException: Linear: 
 The input to the layer needs to be a vector(or a mini-batch of vectors);
 please use the Reshape module to convert multi-dimensional input into vectors
 if appropriate"
    input dim 3
    at com.intel.analytics.bigdl.dllib.utils.Log4Error$.invalidInputError(Log4Error.scala:28)
    at com.intel.analytics.bigdl.dllib.nn.Linear.updateOutput(Linear.scala:85)
    at com.intel.analytics.bigdl.dllib.nn.Linear.updateOutput(Linear.scala:44)
    at com.intel.analytics.bigdl.dllib.nn.internal.KerasLayer.updateOutput(KerasLayer.scala:275)
    at com.intel.analytics.bigdl.dllib.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:285)
    ... 16 more

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2258)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2207)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2206)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2206)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1079)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1079)
    at scala.Option.foreach(Option.scala:407)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1079)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2445)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2387)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2376)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:868)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2196)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2217)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2236)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2261)
    at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1030)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:1029)
    at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:180)
    at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.intel.analytics.bigdl.dllib.utils.InvalidOperationException: Linear: 
 The input to the layer needs to be a vector(or a mini-batch of vectors);
 please use the Reshape module to convert multi-dimensional input into vectors
 if appropriate"
    input dim 3
    at com.intel.analytics.bigdl.dllib.utils.Log4Error$.invalidOperationError(Log4Error.scala:38)
    at com.intel.analytics.bigdl.dllib.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:291)
    at com.intel.analytics.bigdl.dllib.keras.Predictor$.$anonfun$predict$3(Predictor.scala:189)
    at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
    at org.apache.spark.api.python.SerDeUtil$AutoBatchedPickler.hasNext(SerDeUtil.scala:86)
    at scala.collection.Iterator.foreach(Iterator.scala:941)
    at scala.collection.Iterator.foreach$(Iterator.scala:941)
    at org.apache.spark.api.python.SerDeUtil$AutoBatchedPickler.foreach(SerDeUtil.scala:80)
    at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:307)
    at org.apache.spark.api.python.PythonRunner$$anon$2.writeIteratorToStream(PythonRunner.scala:621)
    at org.apache.spark.api.python.BasePythonRunner$WriterThread.$anonfun$run$1(PythonRunner.scala:397)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
    at org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:232)
Caused by: com.intel.analytics.bigdl.dllib.utils.InvalidOperationException: Linear: 
 The input to the layer needs to be a vector(or a mini-batch of vectors);
 please use the Reshape module to convert multi-dimensional input into vectors
 if appropriate"
    input dim 3
    at com.intel.analytics.bigdl.dllib.utils.Log4Error$.invalidOperationError(Log4Error.scala:38)
    at com.intel.analytics.bigdl.dllib.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:288)
    at com.intel.analytics.bigdl.dllib.nn.Sequential.updateOutput(Sequential.scala:39)
    at com.intel.analytics.bigdl.dllib.nn.internal.KerasLayer.updateOutput(KerasLayer.scala:275)
    at com.intel.analytics.bigdl.dllib.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:285)
    ... 13 more
Caused by: java.lang.IllegalArgumentException: Linear: 
 The input to the layer needs to be a vector(or a mini-batch of vectors);
 please use the Reshape module to convert multi-dimensional input into vectors
 if appropriate"
    input dim 3
    at com.intel.analytics.bigdl.dllib.utils.Log4Error$.invalidInputError(Log4Error.scala:28)
    at com.intel.analytics.bigdl.dllib.nn.Linear.updateOutput(Linear.scala:85)
    at com.intel.analytics.bigdl.dllib.nn.Linear.updateOutput(Linear.scala:44)
    at com.intel.analytics.bigdl.dllib.nn.internal.KerasLayer.updateOutput(KerasLayer.scala:275)
    at com.intel.analytics.bigdl.dllib.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:285)
    ... 16 more

Could anyone explain why this error occurred and how to fix it, please? Thank you in advance.

user
    Does this answer your question? [py4j.protocol.Py4JJavaError while running a Python script with Pyspark](https://stackoverflow.com/questions/52805571/py4j-protocol-py4jjavaerror-while-running-a-python-script-with-pyspark) – Koedlt Dec 05 '22 at 20:11
  • I saw it before posting my question. The Java version used in the Docker image is 1.8.0_192, so Java 8, not 11. – user Dec 05 '22 at 20:14
  • Oh ok! Do you happen to have a Java stack trace as well? Something like in [this](https://stackoverflow.com/q/70981458/15405732) question? If yes, could you add it to your question? – Koedlt Dec 05 '22 at 20:20

1 Answer


I don't know the BigDL library, but in the Java stack trace you can find the clue to your problem:

Caused by: java.lang.IllegalArgumentException: Linear: 
 The input to the layer needs to be a vector(or a mini-batch of vectors);
 please use the Reshape module to convert multi-dimensional input into vectors
 if appropriate"
    input dim 3

Since we don't have all the code, it is not possible to tell you exactly where things are going wrong, but one of the inputs to your BigDL functions has the wrong shape. My guess would be this line:

target = self.q_network.predict(state)

Look for documentation on that .predict() method and see what it expects as input. I would think that things are going wrong there.
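For illustration only (I'm assuming the observation state is multi-dimensional, and that np.asmatrix plus the predictor's own batching is what produces the 3-dimensional input the error complains about), reshaping the input to a flat vector per sample would look something like this:

import numpy as np

# Hypothetical observation; the real shape depends on your environment.
state = np.zeros((4, 3))

# Flatten each observation into a 1-D feature vector so the Linear layer
# sees a vector (or a mini-batch of vectors), as the error message asks for.
flat_state = np.asarray(state, dtype=np.float32).reshape(-1)       # shape (n_features,)

# Or pass an explicit mini-batch of one sample instead:
state_batch = np.asarray(state, dtype=np.float32).reshape(1, -1)   # shape (1, n_features)

Alternatively, you could keep the multi-dimensional input and add the Reshape (or Flatten) module that the error message mentions in front of the dense layer inside the model; that achieves the same thing.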

Hope this helps!

Koedlt
  • Thank you for your answer. The same code ran without any problem before; the problem began when I used BigDL. I believe the problem is that BigDL uses RDDs, i.e. the results of predict() are RDDs. I tried to solve the problem using collect(), but the problem remains the same. – user Dec 05 '22 at 20:55