Questions tagged [.net-spark]

Questions about using Apache Spark (and related distributions) with Microsoft's .NET runtime and associated languages such as C# and F#. Feel free to add platform-specific and language-specific tags as well.

Tag Definition

This tag is used for questions about using Apache Spark (and related distributions) with Microsoft's .NET runtime and associated languages such as C# and F#.

Related platform/code offerings

.NET for Apache Spark is currently an open-source offering at the .NET Foundation. See https://github.com/dotnet/spark and https://dot.net/spark for details.

Refining Tag Usage

You can refine the tag's usage by adding tags that narrow down the relevant Apache Spark distribution or service and the specific language(s) relevant to the question.

24 questions
6
votes
1 answer

How to pass an array column as an argument in VectorUdf in .NET Spark?

I'm trying to implement a vector UDF in C# Spark. I created a .NET Spark environment by following Spark .NET. Vector UDFs (both Apache Arrow and Microsoft.Data.Analysis) worked for me for an IntegerType column. Now I'm trying to send the Integer array…
6
votes
1 answer

Apache Spark queries through C#

I was wondering if there is a way I can use C# to write queries to run on Apache Spark. I know Spark SQL queries can be written in Java/Scala/Python. Is there any interface for C#?
Groot
  • 311
  • 4
  • 15
3
votes
1 answer

Forward filling in .NET for Spark

I am looking at the window function for a Spark DataFrame in .NET (C#). I have a DataFrame df with columns Year, Month, Day, Hour, Minute, ID, Type and Value: | 2021 | 3 | 4 | 8 | 9 | 87 | Type1 | 380.5 | | 2021 | 3 | 4 | 8 | …
V. J.
  • 35
  • 5
3
votes
1 answer

How to run .NET Spark jobs on Databricks from Azure Data Factory?

In Azure Data Factory, you have a Databricks Activity. This activity supports running Python, JAR and notebooks. These notebooks may be written in Scala, Python, Java, and R but not C#/.NET. Is there inherent or direct support where I can write…
3
votes
0 answers

Dotnet Apache Spark - Object reference not set to an instance of an object

I've been trying to register and run a UDF with .NET for Apache Spark. I am using Microsoft.Spark.0.10.0 on macOS. This is what I've been trying to do: var options = new Dictionary { {"delimiter", "|" } …
2
votes
2 answers

Create dataframe from C# List - Spark for .NET

I am new to .NET for Spark and need to append a C# list to a Delta table. I assume I first need to create a Spark DataFrame to do this. In the sample code, how would I go about appending "names" to the DataFrame "df"? It seems now this has…
ow123
  • 21
  • 1
  • 2
1
vote
1 answer

Load fixed position file with multiple sections using .net-spark

I'm trying to load a fixed-position file with multiple sections in Spark using .net-spark. Here is an example of the file: 01Nikola Tesla tesla@gmail.com …
Bruno Moreira
  • 175
  • 3
  • 15
1
vote
0 answers

dotnet-spark exception in writing to parquet file

I'm just trying out dotnet spark. I modified the sample program to write the DataFrame contents into a Parquet file. However, I am getting an exception that does not seem to contain any helpful info. May I know what may be causing the exception? Or is…
remondo
  • 318
  • 2
  • 7
1
vote
1 answer

UnitTest for .NET Apache Spark

I want to write unit tests for my Spark application written in C#/.NET. I'm currently using xUnit for writing tests, but I haven't found any good documentation on writing unit tests for my Spark application components. I have written a spark…
1
vote
1 answer

CreateDataFrame with F#

I'm trying to create a simple Spark DataFrame with F#, as is done in a Spark.Net test: let schema = StructType ( [| StructField("Name", new StringType()) StructField("Age", new IntegerType()) …
dr11
  • 5,166
  • 11
  • 35
  • 77
0
votes
1 answer

Azure Synapse .NET C# Sparkpool: Fail to start interpreter

When I am working on a .NET Spark (C#) Notebook in Azure Synapse I always get the following error message: Fail to start interpreter. detail: org.apache.spark.api.dotnet.DotnetBackend. When changing the language from .NET Spark (C#) to Python or…
tomotom12
  • 46
  • 6
0
votes
1 answer

Loop across a dataframe in .NET Spark

I have a DataFrame (created by reading a CSV) in Spark. How do I loop across the rows in this DataFrame in C#? There are 10 rows and 3 columns in the DataFrame, and I would like to get the value of each column as I navigate through the rows…
Joseph
  • 530
  • 3
  • 15
  • 37
0
votes
1 answer

How to unit test a dotnet spark DataFrame without installing Spark

I have a simple dotnet spark app and I have tried to break it down into units for testing. A sample unit: public DataFrame filtermyname(DataFrame df, string name) { return df.Filter("name"==name); } Since a unit test should not have external…
Selva
  • 951
  • 7
  • 23
0
votes
1 answer

In C# (.NET for Spark), how to use the When() method as a condition to add a new column to a DataFrame?

I have some experience with PySpark. Our team is migrating a Spark project from Python to C# (.NET for Spark), and I'm encountering problems. Suppose we have a Spark DataFrame df with an existing column col1. In PySpark, I could do…
MiffyW
  • 21
  • 3
0
votes
1 answer

Recursive calculation on DataFrame using .NET for Spark

I want to calculate RSI using .NET for Spark. The formula for RSI is: RSI = 100 - 100 / (1 + RS), where RS = Average Gain / Average Loss. The first average gain and average loss are 14-period averages: First Average Gain = Sum of Gains over the past 14 periods…