
I would like to initialize the Spark context in Python from Scala.

I have added the 'pyspark' package to do this. This is the code I have tried, and it works fine.

Code snippet:

import sys.process._

// Handle to the Python process's stdin; assigned by BasicIO.standard below.
var os: java.io.OutputStream = _
// Start an interactive Python session and capture a handle to its stdin.
val python = Process(Seq("python", "-i")).run(BasicIO.standard(os = _))

// Send one line of Python source to the interpreter.
def pushLine(s: String): Unit = {
  os.write(s"$s\n".getBytes("UTF-8"))
  os.flush()
}

// Initialize SparkContext and SQLContext inside the Python process.
pushLine("from pyspark import SparkContext, SparkConf;from pyspark.sql import SQLContext;conf = SparkConf().setAppName('test').setMaster('local');sc = SparkContext(conf=conf);sqlContext = SQLContext(sc);")

Now, my requirement is to suppress the output from this Python process that gets echoed in the Scala console. Is there any option to avoid this?
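For context on where the output comes from: BasicIO.standard(os = _) only captures the child's stdin, while its stdout and stderr are forwarded to the current process, which is why everything printed by Python/Spark shows up in the Scala console. Below is a minimal, untested sketch of what I have been considering: an explicit ProcessIO that keeps the stdin handle but silently drains the other two streams (the silentIO name and the drain helper are mine, not part of the code above).

import sys.process._
import java.io.{InputStream, OutputStream}

var os: OutputStream = _

// Drain a stream so the child never blocks on a full pipe,
// but do not echo anything to the Scala console.
def drain(in: InputStream): Unit = {
  val buf = new Array[Byte](4096)
  while (in.read(buf) != -1) {}
  in.close()
}

// in:  keep the handle to Python's stdin so pushLine still works
// out/err: swallow the child's stdout and stderr instead of inheriting them
val silentIO = new ProcessIO(
  stdin  => { os = stdin },
  stdout => drain(stdout),
  stderr => drain(stderr)
)

val python = Process(Seq("python", "-i")).run(silentIO)

Note that this hides Python tracebacks as well; reducing Spark's log verbosity via log4j is the other option.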

(screenshot: Spark Context Initialization output in the Scala console)

Thanks in advance :)

Ramkumar

1 Answer


The method below worked for me.

  1. Create a file log4j.properties in some directory, say /home/vijay/py-test-log:

    log4j.rootCategory=WARN, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

    log4j.logger.org.eclipse.jetty=WARN
    log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
    log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
    log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO

  2. cd /home/vijay/py-test-log // the log4j.properties file should be here

  3. Then launch pyspark from this directory, in which you have log4j.properties:

    $pwd
    /home/vijay/py-test-log
    $/usr/lib/spark-1.2.0-bin-hadoop2.3/bin/pyspark

  4. That's all. pyspark will load the log4j.properties file from the directory where it is launched (see the sketch below for pointing the Scala-launched process at this directory).
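To tie this back to the question's Scala launcher: scala.sys.process.Process also accepts a working directory, so the Python child can be started from the folder that holds log4j.properties. A minimal sketch, assuming the /home/vijay/py-test-log path from step 1 and that the cwd-based lookup behaves the same when plain python (rather than bin/pyspark) creates the SparkContext:

import sys.process._
import java.io.File

// Start the interactive Python session with its working directory set to the
// folder containing log4j.properties, so Spark's logging config is picked up.
val logDir = new File("/home/vijay/py-test-log")
var os: java.io.OutputStream = _
val python = Process(Seq("python", "-i"), logDir).run(BasicIO.standard(os = _))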
vijay kumar