Questions tagged [sparkapi]

sparkapi is a package providing core RPC protocol extracted from SparkR

sparkapi provides a core RPC protocol extracted from SparkR. It is designed for low-level interactions with Spark.


10 questions
5
votes
4 answers

Can sparklyr be used with Spark deployed on a YARN-managed Hadoop cluster?

Is the sparklyr R package able to connect to YARN-managed Hadoop clusters? This doesn't seem to be covered in the cluster deployment documentation. Using the SparkR package that ships with Spark it is possible by doing: # set R environment…
Matt Pollock
  • 1,063
  • 10
  • 26
3
votes
1 answer

Connect R with Spark in RStudio - Failed to launch Spark shell. Ports file does not exist

I am trying to connect R with a local instance of Spark using RStudio. However, I get the error message shown. What am I missing? I am using Windows 10 and following the RStudio tutorial. library(sparklyr) spark_install(version = "1.6.1") …
Fisseha Berhane
  • 2,533
  • 4
  • 30
  • 48
2
votes
0 answers

Can Numba accelerate a UDF (pyspark)?

We use Spark through the Spark API with pyspark. I know Numba can make Python code very fast, so it seems a good thing to use on our UDFs (user-defined functions), but I'm not sure the Numba decorator still works on the executors, for example with map or…
idan ahal
  • 707
  • 8
  • 21
1
vote
3 answers

Issue while trying to read a text file in Databricks using the Local File API rather than the Spark API

I'm trying to read a small txt file which is added as a table to the default db on Databricks. While trying to read the file via the Local File API, I get a FileNotFoundError, but I'm able to read the same file as a Spark RDD using SparkContext. Please…
Riyaz Ali
  • 43
  • 4
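A frequent cause of this symptom (the excerpt doesn't confirm it) is the path form: Databricks mounts DBFS at /dbfs on the driver, so local file APIs need a /dbfs/... path while Spark reads dbfs:/... URIs. A minimal Python sketch of that translation (the helper name and sample path are ours):

```python
def to_local_path(dbfs_uri: str) -> str:
    """Convert a dbfs:/ URI (used by Spark readers) to the /dbfs
    mount-point path expected by local file APIs on the driver.
    Hypothetical helper for illustration."""
    prefix = "dbfs:/"
    if not dbfs_uri.startswith(prefix):
        raise ValueError(f"expected a dbfs:/ URI, got {dbfs_uri!r}")
    return "/dbfs/" + dbfs_uri[len(prefix):]

print(to_local_path("dbfs:/FileStore/tables/sample.txt"))
# → /dbfs/FileStore/tables/sample.txt
```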
0
votes
1 answer

Spark: Is there an equivalent to Spark SQL's LATERAL VIEW in the Spark API?

Title says it all: is there an equivalent to the Spark SQL LATERAL VIEW command in the Spark API, so that I can generate a column from a UDF that contains a struct of multiple columns' worth of data, and then laterally spread the columns in the struct…
Rimer
  • 2,054
  • 6
  • 28
  • 43
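The usual Spark API counterpart to LATERAL VIEW explode is the explode function (e.g. pyspark.sql.functions.explode). Since there is no runnable Spark session here, this plain-Python sketch shows only the row-multiplying semantics (function name and sample rows are ours):

```python
def lateral_view_explode(rows, list_col):
    """One output row per element of the list-valued column, which is
    what LATERAL VIEW explode / functions.explode produces in Spark.
    Plain-Python illustration only."""
    out = []
    for row in rows:
        for item in row[list_col]:
            new_row = dict(row)       # copy the row
            new_row[list_col] = item  # replace list with one element
            out.append(new_row)
    return out

rows = [{"id": 1, "tags": ["a", "b"]}, {"id": 2, "tags": ["c"]}]
print(lateral_view_explode(rows, "tags"))
# → [{'id': 1, 'tags': 'a'}, {'id': 1, 'tags': 'b'}, {'id': 2, 'tags': 'c'}]
```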
0
votes
1 answer

Spark API: Use column value in LIKE statement

In the Spark API: column.like("only takes a static string with optional wildcards like %") column.contains(accepts_a_column_but_wont_parse_wildcards) So what's the equivalent method to call to compare values using wildcards that might show up in a…
Rimer
  • 2,054
  • 6
  • 28
  • 43
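One commonly suggested workaround is to route through the SQL parser, e.g. expr("value like pattern_col"), since Spark SQL's LIKE can take a per-row pattern expression; whether that works may depend on the Spark version in use. The wildcard semantics being asked about, sketched in plain Python (helper name is ours):

```python
import re

def sql_like(value: str, pattern: str) -> bool:
    """Evaluate a SQL LIKE pattern: % matches any run of characters,
    _ matches exactly one. Sketch of the per-row comparison Spark
    would apply; illustration only."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None

print(sql_like("spark-2.2.1", "spark-%"))  # → True
print(sql_like("spark", "spar_"))          # → True
print(sql_like("spark", "hadoop%"))        # → False
```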
0
votes
0 answers

Rename Hadoop server tables in pyspark/Spark API in Python

for elem in list: final = sqlCtx.read.table('XXX.YYY') interim = final.join(elem,'user_id', "fullouter") final = interim.select(['user_id'] + [ spark_combine_first(final[c], elem[c]).alias(c) for c in dup_collect(interim.columns)[0]…
jayesh
  • 37
  • 4
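The spark_combine_first helper in the excerpt presumably applies coalesce-style "first non-null" logic per row (the name is the asker's; the behavior below is our assumption, sketched in plain Python):

```python
def combine_first(a, b):
    """Element-wise 'first non-null' over two equal-length columns,
    the coalesce behavior a spark_combine_first-style helper would
    apply per row in Spark. Plain-Python illustration."""
    return [x if x is not None else y for x, y in zip(a, b)]

print(combine_first([1, None, 3], [9, 2, None]))  # → [1, 2, 3]
```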
0
votes
0 answers

Can I use Spark's REST API to get the version of Spark on the workers?

I know I can get the version of Spark (v2.2.1) that's running on the Spark master with this command: http://:4040/api/v1/version which will return something like { "spark" : "2.2.1" } However, I also want to check the version of Spark…
liltitus27
  • 1,670
  • 5
  • 29
  • 46
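The /api/v1/version endpoint quoted in the question returns a small JSON document. Fetching it would need a live master at the host elided in the question, so only the parsing step is sketched here (helper name is ours):

```python
import json

def parse_spark_version(body: str) -> str:
    """Extract the version string from a /api/v1/version response
    body, whose shape is taken from the question's own example."""
    return json.loads(body)["spark"]

# Response body exactly as quoted in the question:
print(parse_spark_version('{ "spark" : "2.2.1" }'))  # → 2.2.1
```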
0
votes
1 answer

Getting Null data in RDD while pulling data from HBase

I have a requirement to pull data from HBase using the Spark API and query on top of the data just like SparkSQL. What I did is as follows: created a Spark conf object, created an HBase object, wrote a JavaPairRDD to fetch records. My main class code is…
Vijay_Shinde
  • 1,332
  • 2
  • 17
  • 38
0
votes
1 answer

Java Spark: word matches between two strings

I would like to know if there are any words in common between two different long strings with Spark (Java API). String string1 = "car bike bus ..." (about 100 words); String string2 = "boat plane car ..." (about 100 words); How could I do…
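The underlying operation is a set intersection of the two token lists; in Spark this could be expressed as an intersection of two RDDs of words (e.g. sc.parallelize(string1.split(" ")).intersection(...)), though for ~100-word strings a plain intersection already shows the logic. A plain-Python sketch (the question is Java, but Python is used here for a compact illustration; the helper name is ours):

```python
def word_matches(s1: str, s2: str) -> set:
    """Words common to both strings, using whitespace tokenization.
    This is the set intersection that a Spark RDD .intersection()
    would compute in parallel. Illustration only."""
    return set(s1.split()) & set(s2.split())

print(sorted(word_matches("car bike bus", "boat plane car")))  # → ['car']
```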