
I am new to Spark and trying to figure out how to use the Spark shell.

I looked through the documentation on Spark's site, but it doesn't show how to create directories or how to list my files from within the Spark shell. I would appreciate any help.

mrsrinivas
Nick

2 Answers


In this context you can assume that the Spark shell is just a normal Scala REPL, so the same rules apply. You can get a list of the available commands using :help.

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_151)
Type in expressions to have them evaluated.
Type :help for more information.

scala> :help
All commands can be abbreviated, e.g., :he instead of :help.
:edit <id>|<line>        edit history
:help [command]          print this summary or command-specific help
:history [num]           show the history (optional num is commands to show)
:h? <string>             search the history
:imports [name name ...] show import history, identifying sources of names
:implicits [-v]          show the implicits in scope
:javap <path|class>      disassemble a file or class name
:line <id>|<line>        place line(s) at the end of history
:load <path>             interpret lines in a file
:paste [-raw] [path]     enter paste mode or paste a file
:power                   enable power user mode
:quit                    exit the interpreter
:replay [options]        reset the repl and replay all previous commands
:require <path>          add a jar to the classpath
:reset [options]         reset the repl to its initial state, forgetting all session entries
:save <path>             save replayable session to a file
:sh <command line>       run a shell command (result is implicitly => List[String])
:settings <options>      update compiler options, if possible; see reset
:silent                  disable/enable automatic printing of results
:type [-v] <expr>        display the type of an expression without evaluating it
:kind [-v] <expr>        display the kind of expression's type
:warnings                show the suppressed warnings from the most recent line which had any

As you can see above, you can invoke shell commands using :sh. For example:

scala> :sh mkdir foobar
res0: scala.tools.nsc.interpreter.ProcessResult = `mkdir foobar` (0 lines, exit 0)

scala> :sh touch foobar/foo
res1: scala.tools.nsc.interpreter.ProcessResult = `touch foobar/foo` (0 lines, exit 0)

scala> :sh touch foobar/bar
res2: scala.tools.nsc.interpreter.ProcessResult = `touch foobar/bar` (0 lines, exit 0)

scala> :sh ls foobar
res3: scala.tools.nsc.interpreter.ProcessResult = `ls foobar` (2 lines, exit 0)

scala> res3.line foreach println
line   lines

scala> res3.lines foreach println
bar
foo
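
Since the shell is a Scala REPL, the same directory and files can also be created without spawning external processes. The following is only a sketch using the standard java.nio.file API; the `foobar`, `foo` and `bar` names simply mirror the example above, and the JavaConverters import assumes a Scala 2.11/2.12 shell (it still works on 2.13, with a deprecation warning).

```scala
import java.nio.file.{Files, Paths}
import scala.collection.JavaConverters._

// Directory name taken from the :sh example above; relative to the
// directory spark-shell was started from.
val dir = Paths.get("foobar")

// Does nothing if the directory already exists.
Files.createDirectories(dir)

// Create the two empty files unless they are already present.
Seq("foo", "bar")
  .map(name => dir.resolve(name))
  .filterNot(p => Files.exists(p))
  .foreach(p => Files.createFile(p))

// Files.list returns a Java stream that should be closed after use.
val stream = Files.list(dir)
val names = try {
  stream.iterator().asScala.map(_.getFileName.toString).toList.sorted
} finally {
  stream.close()
}

names.foreach(println)
```

This only touches the local filesystem of the machine running the shell; files on HDFS or another distributed store need a different API.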
Community
zero323
  • 1
    I get this weird error, any idea why? Running `scala> :sh ls` prints ``res5: scala.tools.nsc.interpreter.ProcessResult = `ls` (2 lines, exit 0)``, but `scala> res5 foreach println` then fails with `<console>:12: error: value foreach is not a member of scala.tools.nsc.interpreter.ProcessResult`. – WoodChopper Sep 28 '15 at 04:36
  • 5
    @WoodChopper just do `res5.lines foreach println` – Ryan Hartman Nov 10 '15 at 04:30
  • `res3 foreach println` does not work and should be `res3.lines foreach println` instead. – Holger Brandl Sep 21 '17 at 09:11
  • 1
    @HolgerBrandl Thanks. This is a pretty old answer (1.x, which used Scala 2.10) and it was possible to use foreach directly back then. Updated. – zero323 Sep 21 '17 at 13:05
  • That doesn't work though. `:sh cd /home/dean/bin/spark-2.3.0-bin-hadoop2.7` fails with `java.io.IOException: Cannot run program "cd": error=2, No such file or directory` – Dean Schulze Jan 10 '19 at 13:55
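
On the `cd` error above: the message suggests that :sh tries to launch `cd` as a program, and `cd` is a shell builtin rather than an executable. Even inside a real shell, `cd` would only change the directory of that child process, not of the Spark shell itself. A possible workaround, sketched here with scala.sys.process from the standard library, is to pass the working directory to the command explicitly. The `workDir` value is a placeholder, not a path from the thread, and the sketch assumes a Unix-like system with `ls` and `sh` on the PATH.

```scala
import java.io.File
import scala.sys.process._

// Placeholder directory; substitute the one you want to work in.
val workDir = new File(System.getProperty("java.io.tmpdir"))

// Run `ls` with workDir as the child's working directory.
// The working directory of the spark-shell JVM is left unchanged.
val listing: String = Process(Seq("ls"), workDir).!!

// Builtins and shell syntax (cd, &&, pipes) need a real shell to interpret them.
// Assumes the path contains no single quotes.
val viaShell: String = Seq("sh", "-c", s"cd '${workDir.getPath}' && pwd").!!

println(viaShell.trim)
```

Note that `!!` throws an exception when the command exits with a non-zero status, so a failure is not silently ignored.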


The :q or :quit command exits the Scala REPL.

atline
Shekh Firoz Alam