
I have a newly installed Hadoop 2.8 for Spark 2.2.1. When I start pyspark, it throws `java.lang.NumberFormatException: For input string: "100M"`.
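For context, the "100M" string most likely comes from one of the S3A defaults in the Hadoop 2.8 `core-default.xml` (e.g. `fs.s3a.multipart.size`), which an older `hadoop-aws` jar tries to parse as a plain long. A minimal sketch of a stop-gap I considered, pinning the value to a numeric byte count so the older parser accepts it (the real fix, per the comments below, is matching the jar versions):

```python
from pyspark.sql import SparkSession

# Hypothetical stop-gap: replace the human-readable "100M" default with a
# plain byte count so an older hadoop-aws S3AFileSystem can parse it.
spark = (
    SparkSession.builder
    .appName("s3a-multipart-workaround")
    .config("spark.hadoop.fs.s3a.multipart.size", "104857600")  # 100 MB in bytes
    .getOrCreate()
)
```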

I am following this question for my solution.

Additional Info: I am trying to create Spark sessions with AWS ARN roles so that Spark can access different data sources using the assume-role capability in AWS.
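This is roughly the pattern I'm aiming for: assume the role with STS via boto3, then hand the temporary credentials to S3A's `TemporaryAWSCredentialsProvider`. The role ARN, session name, and bucket path below are placeholders, and this assumes the S3A session-token support that shipped with Hadoop 2.8:

```python
import boto3
from pyspark.sql import SparkSession

# Placeholder ARN: assume the role with STS to get temporary credentials.
creds = boto3.client("sts").assume_role(
    RoleArn="arn:aws:iam::123456789012:role/my-data-access-role",
    RoleSessionName="pyspark-session",
)["Credentials"]

# Feed the temporary credentials to S3A via Spark's hadoop config passthrough.
spark = (
    SparkSession.builder
    .appName("assume-role-example")
    .config("spark.hadoop.fs.s3a.aws.credentials.provider",
            "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
    .config("spark.hadoop.fs.s3a.access.key", creds["AccessKeyId"])
    .config("spark.hadoop.fs.s3a.secret.key", creds["SecretAccessKey"])
    .config("spark.hadoop.fs.s3a.session.token", creds["SessionToken"])
    .getOrCreate()
)

# Placeholder bucket/prefix to verify the assumed-role credentials work.
df = spark.read.text("s3a://my-placeholder-bucket/some/prefix/")
```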

Edit: Installed Hadoop 2.8 for Spark 2.2.1; previously had Hadoop 2.7 as the default, but it doesn't support AWS roles for Spark sessions.

  • What Hadoop version do you have? Did you configure the same version with AWS? – pvy4917 Oct 15 '18 at 16:40
  • I just installed hadoop 2.8.3. Here are the versions: `hadoop-aws-2.7.3.jar` and `hadoop-common-2.8.3.jar` – knows not much Oct 15 '18 at 17:08
  • That is the problem. Match the versions. – pvy4917 Oct 15 '18 at 17:11
  • Thanks, I don't see that error message anymore, but it gives a pile of new ones about matching multiple jar versions. The latest one is `An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. : java.lang.NoClassDefFoundError: com/amazonaws/services/s3/model/AmazonS3Exception` – knows not much Oct 15 '18 at 19:24
  • What version are you pointing AWS to? – pvy4917 Oct 15 '18 at 19:26
  • Here is the AWS CLI version, `aws-cli/1.15.80`, and the jar, `aws-java-sdk-1.10.6.jar` – knows not much Oct 15 '18 at 19:35
  • The problem was in the Hadoop version I had installed. I was working with it before, but after changing to S3 I thought it wasn't used anymore; it turns out Hive needs it. I took your advice and started playing with the source code. Thanks. :) – pvy4917 Oct 15 '18 at 19:38
  • Read the above comment. It is referring to the SO page. – pvy4917 Oct 15 '18 at 19:39
  • Resolved the issue; we were missing the `aws-java-sdk-bundle-1.11.366.jar` file (see the version-matching sketch below). Thanks for the quick responses, Prazy!! – knows not much Oct 16 '18 at 20:24
  • Glad it worked! You are welcome. – pvy4917 Oct 16 '18 at 20:26

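Summarizing the resolution from the comments: the `hadoop-aws` jar on the classpath has to match the installed `hadoop-common` version, and it has to be paired with the AWS SDK build it was compiled against. One way to avoid hand-copying mismatched jars is to let Spark resolve the matching `hadoop-aws` artifact (which pulls a compatible AWS SDK transitively) via `spark.jars.packages`; the coordinates below assume the Hadoop 2.8.3 install mentioned above:

```python
from pyspark.sql import SparkSession

# Let Spark pull hadoop-aws matching the installed hadoop-common (2.8.3 here),
# along with the AWS SDK version it was built against, instead of mixing
# hand-copied jars from different Hadoop releases.
spark = (
    SparkSession.builder
    .appName("s3a-version-match")
    .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:2.8.3")
    .getOrCreate()
)
```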
0 Answers