The SparkR API contains functions with the same names as functions in base R, for example `abs` and the cosine function.

What is the difference between the `abs` function in base R and the one in SparkR? When does the `abs` function get executed in Spark?

Documentation for the SparkR `abs` function: http://spark.apache.org/docs/latest/api/R/abs.html

DesirePRG
  • I believe that [this link](http://stackoverflow.com/questions/5564564/r-2-functions-with-the-same-name-in-2-different-packages) may be of interest to you, since it discusses what happens when two packages have functions with the same name (in this case `base` and `SparkR`). – Tim Biegeleisen Sep 21 '15 at 05:24

2 Answers

The difference is where the function lives.

In base R, abs is a primitive:

function(x) .Primitive("abs")

In SparkR, abs is an S4 method for Column objects that wraps a call to the Spark engine:

setMethod("abs",
          signature(x = "Column"),
          function(x) {
            jc <- callJStatic("org.apache.spark.sql.functions", "abs", x@jc)
            column(jc)
          })

You can see the R source code for the SparkR package here.
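To make the dispatch concrete, here is a minimal sketch in plain R that needs no Spark installation. `MockColumn` is a hypothetical stand-in for SparkR's `Column` class, invented for illustration only: it stores an expression string instead of a JVM reference. The point it illustrates is that registering an S4 method on `abs` only affects objects of that class, while ordinary numeric vectors still go to the base primitive and are evaluated immediately in the local R process.

```r
library(methods)

# Hypothetical stand-in for SparkR's Column: it holds an expression, not data.
setClass("MockColumn", representation(expr = "character"))

# As SparkR does for Column, register an S4 method for abs on this class only.
setMethod("abs", signature(x = "MockColumn"), function(x) {
  new("MockColumn", expr = paste0("abs(", x@expr, ")"))
})

# Plain numeric vector: dispatches to the base primitive and is computed now.
local_result <- abs(c(-1.5, 2))

# S4 object: dispatches to the method above, which only builds an expression.
lazy_result <- abs(new("MockColumn", expr = "C3"))

print(local_result)      # 1.5 2.0
print(lazy_result@expr)  # "abs(C3)"
```

In other words, the class of the argument decides which `abs` runs; checking `class(x)` before the call tells you whether you are working with local data or a column expression.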

shadowtalker
  • Is there a way to find out where a function was executed, i.e. in the local R process or in Spark? – DesirePRG Sep 21 '15 at 06:17
  • @DesirePRG I imagine that only functions involving calls like `callJStatic` run in Spark itself, but I'm not a Spark user myself so don't quote me on it. – shadowtalker Sep 21 '15 at 07:19

In base R, abs can be applied to any numeric vector, but the SparkR method is defined for Column objects. Suppose you have a DataFrame df whose column C3 is of type double. You can use the following code to add a column C4 to the DataFrame that holds the absolute value of C3.

df$C4 <- abs(df$C3)

or

withColumn(df, "absvalue", abs(df$C3))

I think the main difference between base R and SparkR is that in SparkR the smallest unit you can operate on is a column, not a vector or matrix. I am new to SparkR and still learning.
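As a rough end-to-end sketch of the snippets above, the following assumes a local Spark installation that SparkR can find and the Spark 2.x or later session API (`sparkR.session`). The sample data is made up for illustration. It shows that `abs` on a Column only builds an expression, that the work happens in Spark when the result is requested, and that the collected result is an ordinary R data.frame on which `abs` is the base function again.

```r
library(SparkR)

# Assumption: Spark is installed locally and discoverable (e.g. via SPARK_HOME).
sparkR.session(master = "local[1]")

# Hypothetical data: a single double column named C3, as in the text above.
local_df <- data.frame(C3 = c(-1.5, 2, -3))
df <- createDataFrame(local_df)

# abs() on a Column builds a column expression; nothing is computed yet.
df <- withColumn(df, "C4", abs(df$C3))
print(class(df$C4))

# collect() triggers the Spark job and returns a local R data.frame.
result <- collect(df)

# On the collected data.frame, abs() is the base R primitive.
print(abs(result$C3))

sparkR.session.stop()
```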

Yuan Tian