
Can anyone suggest a good tutorial for setting up Spark on my machine, which remotely accesses another machine where Hadoop is installed?

Chitral Verma

2 Answers


What you need is a client setup. The Hadoop distribution you're planning to connect to may document a client setup of its own; MapR, for example, provides mapr-client.

Once that is in place, follow any of these to set up Spark:

How to set up Spark on Windows?

Running apache Spark on windows

http://www.ics.uci.edu/~shantas/Install_Spark_on_Windows10.pdf
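As a quick smoke test once the client configuration is on the classpath, a minimal sketch like the following should be able to reach the remote HDFS. The hostname, port, and file path are placeholders for your cluster; this assumes the Spark 1.x API to stay consistent with the other answer:

import org.apache.spark.{SparkConf, SparkContext}

object RemoteHdfsSmokeTest {
  def main(args: Array[String]): Unit = {
    // Run Spark locally on Windows; the Hadoop client configs on the
    // classpath (core-site.xml, hdfs-site.xml) tell it where the
    // remote cluster lives.
    val conf = new SparkConf()
      .setAppName("remote-hdfs-smoke-test")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    // namenode-host:8020 and the path are placeholders; substitute
    // your cluster's NameNode address and an existing file.
    val lines = sc.textFile("hdfs://namenode-host:8020/tmp/sample.txt")
    println(s"line count: ${lines.count()}")

    sc.stop()
  }
}

If the count prints without connection errors, the client setup is working.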

Let me know if this helps. Cheers.

Chitral Verma

I would suggest developing with Spark using IntelliJ IDEA on your Windows machine. Create an SBT project and copy the following code into the build file; it will download all the dependencies for you.

version := "1.0"

scalaVersion := "2.10.6"

// grading libraries
libraryDependencies += "junit" % "junit" % "4.10" % "test"

// Spark core, SQL and Hive support
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.3",
  "org.apache.spark" %% "spark-sql" % "1.6.3",
  "org.apache.spark" %% "spark-hive" % "1.6.3"
)

// helper libraries for CSV parsing and dates
libraryDependencies ++= Seq(
  "org.apache.commons" % "commons-csv" % "1.4",
  "joda-time" % "joda-time" % "2.9.9",
  "com.univocity" % "univocity-parsers" % "1.5.1"
)

// CSV data source for Spark 1.x
libraryDependencies += "com.databricks" %% "spark-csv" % "1.5.0"

After that, create a Scala object and begin developing. This is mainly for local development in Spark. Pay attention to the paths when reading or writing files; a starter object is sketched below.
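As a starting point, a minimal Scala object for local development might look like this. The file path and object name are placeholders, and the CSV read uses the spark-csv package declared in the build file above:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object LocalSparkApp {
  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside the IDE using all available cores
    val conf = new SparkConf()
      .setAppName("local-spark-app")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // With no scheme, paths resolve against the local file system,
    // e.g. C:\data\input.csv on Windows. Adjust to your machine.
    val df = sqlContext.read
      .format("com.databricks.spark.csv") // from the spark-csv dependency
      .option("header", "true")
      .load("data/input.csv")

    df.show()
    sc.stop()
  }
}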

OUMOUSS_ELMEHDI