
I use this simple code to calculate recommendations from a command-line Java app:

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.ml.recommendation.ALS;
    import org.apache.spark.ml.recommendation.ALSModel;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    // Rating is my own bean class with a static parseRating(String) method
    SparkSession spark = SparkSession
            .builder()
            .appName("SomeAppName")
            .config("spark.master", "local")
            .config("spark.executor.instances", 1) // not sure whether this has any effect here
            .config("spark.executor.cores", 3)     // not sure whether this has any effect here
            .getOrCreate();
    JavaRDD<Rating> ratingsRDD = spark
            .read().textFile(args[0]).javaRDD()
            .map(Rating::parseRating);
    Dataset<Row> ratings = spark.createDataFrame(ratingsRDD, Rating.class);
    ALS als = new ALS()
            .setMaxIter(1)
            .setRegParam(0.01)
            .setUserCol("userId")
            .setItemCol("movieId")
            .setRatingCol("rating");
    ALSModel model = als.fit(ratings);
    model.setColdStartStrategy("drop");
    Dataset<Row> rowDataset = model.recommendForAllUsers(50);

But this code uses only 100% of CPU (I have seen 800% CPU usage with other apps on the same machine). How can I correctly increase the number of threads?
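For example, is changing the master URL the right knob, or do the executor settings above matter here? A minimal sketch of what I mean, assuming `local[N]` controls the number of worker threads in local mode:

    SparkSession spark = SparkSession
            .builder()
            .appName("SomeAppName")
            // "local[8]" instead of "local" -- would this make Spark use 8 threads?
            .config("spark.master", "local[8]")
            .getOrCreate();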

