
I need to use SparkContext instead of JavaSparkContext for the accumulableCollection (if you don't agree, check out the linked question and answer it, please!)

Clarified Question: SparkContext is available in Java but wants a Scala sequence. How do I make it happy -- in Java?

I have this code doing a simple jsc.parallelize that I was using with JavaSparkContext, but SparkContext wants a Scala collection. I thought I was building a Scala Range here and converting it to a Java list; I'm not sure how to get that Range into a Scala Seq, which is what SparkContext's parallelize asks for.

    // The JavaSparkContext way, was trying to get around MAXINT limit, not the issue here
    // setup bogus Lists of size M and N for parallelize
    //List<Integer> rangeM = rangeClosed(startM, endM).boxed().collect(Collectors.toList());
    //List<Integer> rangeN = rangeClosed(startN, endN).boxed().collect(Collectors.toList());

The money line is next: how can I create a Scala Seq in Java to give to parallelize?

    // these lists above need to be scala objects now that we switched to SparkContext
    scala.collection.Seq<Integer> rangeMscala = scala.collection.immutable.List(startM to endM);

    // setup sparkConf and create SparkContext
    ... SparkConf setup
    SparkContext jsc = new SparkContext(sparkConf);

    RDD<Integer> dataSetMscala = jsc.parallelize(rangeMscala);
  • I am looking at the [JavaConversions](http://docs.scala-lang.org/overviews/collections/conversions-between-java-and-scala-collections) object; it looks like it works in both directions, in Java or in Scala? – JimLohse Mar 14 '16 at 22:30
  • I doubt you can create a `Seq` in Java, for it is a `trait` and has no equivalent in Java. I think using `JavaConversions` in Scala is the right way. – David S. Mar 15 '16 at 12:51
  • I think maybe I am duplicating [this question](http://stackoverflow.com/questions/35988315/convert-java-list-to-scala-seq?rq=1); I am gonna try the solution and will post an answer if it works. It seems like JavaConversions can be used in Java, if I read it correctly. – JimLohse Mar 15 '16 at 14:39
  • Thanks @davidshen84, maybe I am not understanding -- I am trying to do this in Java. I probably will just switch to using Scala; it's just that I am trying to finish off an important stage in a project and don't want to learn a new language quite yet :) When you say a "trait", is that like a generic in Java, s.t. we use `List aList = new ArrayList<>()`? I am looking now to see what instantiates a Scala Seq, and it looks like List. So SparkContext asks for a Seq, and I assume if I can create a Scala List it would accept it. – JimLohse Mar 15 '16 at 18:58
  • `trait` in Scala is a hybrid of *interface* and *abstract class* in Java. A Scala List implements the `Seq` trait. If you can create a Scala List in Java, I guess it could work for you. I never tried, so...good luck :) – David S. Mar 16 '16 at 02:18
  • I am assuming the existence of a SparkContext class (written in Scala, like all of Spark) in the Java docs indicates that it's possible. Thanks again, I will go read about traits; it's about time I start using Scala, it's so much better for Spark. – JimLohse Mar 16 '16 at 04:44
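The JavaConversions/JavaConverters route discussed in the comments can also be done from Java. A minimal sketch, assuming only scala-library (2.10/2.11 era, as in the question) is on the classpath; the class and variable names here are illustrative, not from the original post:

```java
import java.util.Arrays;
import java.util.List;

import scala.collection.JavaConverters;
import scala.collection.Seq;

public class ToScalaSeq {
    public static void main(String[] args) {
        // An ordinary Java list, e.g. what IntStream.rangeClosed(...).boxed() would produce
        List<Integer> javaList = Arrays.asList(1, 2, 3, 4, 5);

        // JavaConverters wraps the Java list as a Scala Buffer; toSeq() gives a Seq
        Seq<Integer> scalaSeq =
            JavaConverters.asScalaBufferConverter(javaList).asScala().toSeq();

        System.out.println(scalaSeq.size()); // same number of elements as javaList
    }
}
```

Note that `toSeq()` here is backed by the original list as a `Buffer` wrapper, so this avoids copying; on Scala 2.13+ the `JavaConverters` object is deprecated in favor of `scala.jdk.CollectionConverters`.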

1 Answer


You should use it this way:

    // Build a Scala Range (which is a Seq[Int]) directly from Java
    // via the Range companion object's static MODULE$ instance
    scala.collection.immutable.Range rangeMscala =
        scala.collection.immutable.Range$.MODULE$.apply(1, 10);

    SparkContext sc = new SparkContext();

    // When calling parallelize from Java, the implicit ClassTag
    // must be passed explicitly
    RDD<Object> dataSetMscala =
        sc.parallelize(rangeMscala, 3, scala.reflect.ClassTag$.MODULE$.Object());
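For reference, `Range.apply` is end-exclusive, while the question's original code used `rangeClosed`; the companion object's `inclusive` factory is the closer match. A minimal sketch, assuming only scala-library on the classpath (`RangeDemo`, `startM`, and `endM` are illustrative names, not from the original post):

```java
public class RangeDemo {
    public static void main(String[] args) {
        int startM = 1, endM = 10;

        // Range.apply(startM, endM) excludes endM
        scala.collection.immutable.Range exclusive =
            scala.collection.immutable.Range$.MODULE$.apply(startM, endM);

        // Range.inclusive(startM, endM) mirrors IntStream.rangeClosed(startM, endM)
        scala.collection.immutable.Range inclusive =
            scala.collection.immutable.Range$.MODULE$.inclusive(startM, endM);

        System.out.println(exclusive.size()); // 9
        System.out.println(inclusive.size()); // 10
    }
}
```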

Hope it helps! Regards

opuertas
  • I appreciate the help, I clarified my question by adding this line: Clarified Question: SparkContext is available in Java but wants a Scala sequence. How do I make it happy -- in Java? – JimLohse Mar 15 '16 at 14:36
  • Ok, you are right, I misunderstood the question; I modified my answer to fit it to your needs, please, check it. – opuertas Mar 17 '16 at 08:55
  • I appreciate that, I was not the dv but I am the uv that evened it out :) Will try this a little later and let you know. Looks good so far. – JimLohse Mar 17 '16 at 14:08
  • Very cool it works! Thank you. I am gonna have to learn about using SparkContext, it looks like I can still call JavaPairRDD on it? I'll post another question if I don't figure it out :) – JimLohse Mar 17 '16 at 21:52
  • Hey sorry I thought I accepted this last week my apologies! Thanks again. – JimLohse Mar 21 '16 at 16:53