I am processing some raw files using PySpark and returning the results as pairs of values, something like this:

from pyspark.sql import SQLContext

def myfunction(lines):
    # Simplified for the example; much more processing happens here in practice
    splitlines = lines.split(",")
    firstvalue = splitlines[0]
    secondvalue = splitlines[1]
    return firstvalue, secondvalue


sqlContext = SQLContext(sc)
listoflines = sc.textFile("myfilesdirectory/*").map(myfunction)

I would like to insert firstvalue and secondvalue into an MS SQL table directly from Spark, as I retrieve them. Please note that I am doing many other things in myfunction, so I cannot reduce it to a simple split inside a lambda.

  • Well, I have been looking through the Spark SQL documentation and I could not find anything related to my question, which is why I asked here. Perhaps somebody has already tried it and got it working, or knows how to do it? – van Sep 30 '15 at 17:36
  • Have you read the docs? http://spark.apache.org/docs/latest/sql-programming-guide.html#jdbc-to-other-databases – Dawid Wysakowicz Sep 30 '15 at 18:11 (see the sketch below)
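
Following the JDBC section of those docs, a minimal sketch of one way this could look: convert the mapped RDD into a DataFrame and write it through Spark's generic JDBC data source. The server, database, table, and credentials below are placeholders (not from the question), and the Microsoft JDBC driver jar (e.g. sqljdbc4.jar) would need to be on the Spark classpath:

from pyspark.sql import SQLContext

sqlContext = SQLContext(sc)

# Build a DataFrame from the (firstvalue, secondvalue) tuples produced by myfunction
listoflines = sc.textFile("myfilesdirectory/*").map(myfunction)
df = sqlContext.createDataFrame(listoflines, ["firstvalue", "secondvalue"])

# Append the rows to the target MS SQL table over JDBC
df.write.jdbc(
    url="jdbc:sqlserver://myserver:1433;databaseName=mydb;user=myuser;password=mypass",
    table="mytable",
    mode="append",
)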

0 Answers