Questions tagged [parquet.net]

15 questions
6
votes
1 answer

Writing Parquet files using Parquet.NET works with local file, but results in empty file in blob storage

We are using parquet.net to write parquet files. I've set up a simple schema containing 3 columns, and 2 rows: // Set up the file structure var UserKey = new Parquet.Data.DataColumn( new DataField("UserKey"), …
SchmitzIT
  • 9,227
  • 9
  • 65
  • 92
2
votes
0 answers

How can I convert parquet-dotnet's columns to individual models?

parquet-dotnet has an example I'm trying to work with that looks like this: using (Stream fileStream = System.IO.File.OpenRead("c:\\test.parquet")) { using (var parquetReader = new ParquetReader(fileStream)) { DataField[] dataFields =…
ernest
  • 1,633
  • 2
  • 30
  • 48
2
votes
1 answer

Read first 100 rows from Parquet file in C#

I have these huge parquet files, stored in a blob, with more than 600k rows and I'd like to retrieve the first 100 so I can send them to my client app. This is the code I use now for this functionality: private async Task<Table>…
anthino12
  • 770
  • 1
  • 6
  • 29
1
vote
0 answers

Reading parquet files using Parquet.NET takes more time than pyarrow (python)

Usually, when it comes to parquet file operations, the Parquet.Net package takes less or equal time compared to Python. But my initial set of experiments doesn't align with that. To read 5 million data points in parquet, Python takes around 1 second while the…
1
vote
0 answers

Unable to write to new parquet file, can't figure out how to convert int elements in array to strings (Convert from System.Array[] to String[])

I'm working with some parquet files where I'm reading a file, doing some stuff, and then adding new columns with the original columns to a new parquet file. using Stream fileStream = File.OpenRead(sourceFile); using var parquetReader = new…
Laende
  • 167
  • 2
  • 13
1
vote
2 answers

Transferring a huge amount of data from SQL Server into a parquet file

Recently I have been challenged with the task of creating a process that extracts data from a SQL Server DB and writes it to parquet files. I searched online and found various examples that load the data into a DataTable and then write the data…
Tyron78
  • 4,117
  • 2
  • 17
  • 32
1
vote
1 answer

Parquet ReadAsTable() method takes too long for big files

I have this code snippet: private Table getParquetAsTable(BlobClient blob) { var stream = blob.OpenRead(); var parquetReader = new ParquetReader(stream); return parquetReader.ReadAsTable(); } What this code does is it reads a…
anthino12
  • 770
  • 1
  • 6
  • 29
1
vote
0 answers

Parquet.NET is generating huge parquet files in comparison with pyarrow

My application takes data from Azure EventHubs, which has a maximum size of 1 MB, transforms it into a DataTable, and then saves it as a Parquet file somewhere. The Parquet file generated by Parquet.Net is huge; it is always over 50 MB even with the best…
Flavio Pegas
  • 388
  • 1
  • 9
  • 26
0
votes
0 answers

Reading parquet file error 'Destination is too short' with Parquet.Net

In this project, there is a C# API where I need to build a simple program that reads a parquet file and returns it in JSON form. Normally I use Python, where reading a parquet file is as simple as 1 line, but I'm stuck with C# (beginner).…
pyeR_biz
  • 986
  • 12
  • 36
0
votes
0 answers

Parquet file not writing correctly

I've recently upgraded Parquet.Net to version 4.6.0, which necessitated changing a lot of the method calls to their asynchronous versions. This code creates a file without any thrown errors: string file = @"c:\temp\test.parquet"; var…
mcmillab
  • 2,752
  • 2
  • 23
  • 37
0
votes
0 answers

How do I specify the type of data in the parquet file I created with the Parquet.Net package?

How to specify the data type of Parquet.Net's parquet writer? I'm trying to develop a converter from csv/xml to parquet file with WinForms (.NET 6). I have to specify my data as BYTE_ARRAY (also known as Binary in Parquet, I guess). I used this repo as…
0
votes
1 answer

Writing a Parquet.Net file using RLE_DICTIONARY encoding

The Parquet.Net specification says I can read and write in RLE_DICTIONARY encoding. I am trying to read the docs of Parquet.Net and the github repo code, but how do I write my DataTable to use this encoding? The demo I am basing this off of is found…
cdub
  • 24,555
  • 57
  • 174
  • 303
0
votes
0 answers

How to create a dynamic schema for parquet from a SQL Server dynamic query of tables?

This question is about Parquet.Net: https://github.com/aloneguid/parquet-dotnet I am trying to use Parquet.Net to build a parquet file from a couple of database tables that I have joined together. The table columns can be dynamic, so I can create a…
cdub
  • 24,555
  • 57
  • 174
  • 303
0
votes
1 answer

How to create schema for parquet.net using the .net data table

I have one data table which I want to convert to a parquet file and upload to blob storage. But I don't have a static schema, so how can I do that?
0
votes
2 answers

How do I use LINQ on array produced by reading with Parquet.net?

I am not experienced with C#. I need to read a parquet file and then use LINQ to query the data read from the file. I don't know if I need to deserialise. The following is the data in the parquet file. The data is being read into the 'records'…
Victor
  • 13
  • 1
  • 3