I know that in Scala you can read a Parquet file as follows:

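    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Create the Spark context and SQL context
    val sparkConf = new SparkConf().setAppName(appName).setMaster(sparkMaster)
    val sc = new SparkContext(sparkConf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Read the Parquet file into a DataFrame and register it as a temp table
    val pf = sqlContext.read.parquet(hdfsDataUri + "test.parquet")
    pf.registerTempTable("test")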

Is there a way to do this using Mobius (C# API for Spark)? I could only find a way to read in CSV files. Ref: https://github.com/Microsoft/Mobius

zero323
user2608613

2 Answers

Mobius provides a C# API for reading Parquet files in Apache Spark. The following is the C# equivalent of the Scala code in your question:

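    using Microsoft.Spark.CSharp.Core;
    using Microsoft.Spark.CSharp.Sql;

    var sparkConf = new SparkConf().SetAppName(appName).SetMaster(sparkMaster);
    var sc = new SparkContext(sparkConf);
    var sqlContext = new SqlContext(sc);
    // Read the Parquet file into a DataFrame and register it as a temp table
    var pf = sqlContext.Read().Parquet(hdfsDataUri + "test.parquet");
    pf.RegisterTempTable("test");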
skaarthik

You can also read and write Parquet files directly in .NET, without Spark, using Parquet.NET: https://github.com/elastacloud/parquet-dotnet

Ivan G.
  • This is the better answer. Since the Parquet.NET v3 release, it is arguably the best way to access Parquet. – Snympi Sep 16 '19 at 08:33