I'd like to reuse the same record type in an Avro schema multiple times. Consider this schema definition:
{
  "type": "record",
  "name": "OrderBook",
  "namespace": "my.types",
  "doc": "Test order update",
  "fields": [
    {
      …
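For reference, Avro resolves this by letting you define a named record once and then reference it by name anywhere later in the same schema. A minimal Scala sketch parsing such a schema with the Avro Java API (the PriceLevel record and its fields are assumptions, since the schema above is cut off):

import org.apache.avro.Schema

// define the nested record once under "bids", then reuse it by name under "asks"
val schemaJson =
  """{
    |  "type": "record",
    |  "name": "OrderBook",
    |  "namespace": "my.types",
    |  "fields": [
    |    {"name": "bids", "type": {"type": "array", "items": {
    |      "type": "record", "name": "PriceLevel",
    |      "fields": [{"name": "price", "type": "double"},
    |                 {"name": "size",  "type": "long"}]}}},
    |    {"name": "asks", "type": {"type": "array", "items": "my.types.PriceLevel"}}
    |  ]
    |}""".stripMargin

val schema = new Schema.Parser().parse(schemaJson) // throws if the name reference can't be resolved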
I'm trying to switch from reading CSV flat files to Avro files in Spark.
Following https://github.com/databricks/spark-avro, I use:
import com.databricks.spark.avro._
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val df =…
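For what it's worth, with that import in scope the Databricks package adds an avro convenience method to DataFrameReader. A sketch of the read side, continuing from the sqlContext above (the path is an assumption):

import com.databricks.spark.avro._

// the implicits from this import add .avro(...) to DataFrameReader
val df = sqlContext.read.avro("/data/input/episodes.avro") // hypothetical path
df.printSchema()
df.show(5)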
I'm trying to use the spark-avro package as described in the Apache Avro Data Source Guide.
When I submit the following command:
val df = spark.read.format("avro").load("~/foo.avro")
I get an error:
java.util.ServiceConfigurationError:…
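In my experience a ServiceConfigurationError here usually means the spark-avro artifact on the classpath was built for a different Scala or Spark version than the running cluster. A hedged sketch, assuming a Spark 2.4.3 / Scala 2.11 build; note also that Hadoop paths do not expand "~":

// launch with an artifact matching your build (coordinates are an assumption):
//   spark-shell --packages org.apache.spark:spark-avro_2.11:2.4.3
val df = spark.read.format("avro").load("/home/me/foo.avro") // absolute path instead of "~" (path is an assumption)
df.show()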
I have a use case where I want to convert a struct field to an Avro record. The struct field originally maps to an Avro type. The input data is Avro files, and the struct field corresponds to a field in the input Avro records.
Below is what I want to…
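The snippet above is cut off, but for what it's worth the to_avro column function does exactly this conversion: it encodes a struct column to Avro binary. A minimal sketch, assuming Spark 3.x and a struct column named event (both assumptions):

import org.apache.spark.sql.avro.functions.to_avro // Spark 2.4: org.apache.spark.sql.avro.to_avro
import org.apache.spark.sql.functions.col

// each row of "value" is one Avro record whose schema mirrors the struct's type
val encoded = df.select(to_avro(col("event")).as("value"))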
I am unable to send Avro-format messages to a Kafka topic from a Spark Streaming application. Very little information is available online about Avro Spark Streaming example code. Since the "to_avro" method doesn't require an Avro schema, how will it encode to Avro…
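As I understand it, to_avro needs no schema argument because it derives the Avro schema from the column's Spark SQL (Catalyst) type. A hedged write-side sketch, assuming Spark 3.x (in 2.4 the import is org.apache.spark.sql.avro.to_avro) and hypothetical broker/topic names:

import org.apache.spark.sql.avro.functions.to_avro
import org.apache.spark.sql.functions.{col, struct}

// pack the whole row into one struct, encode it, and ship it as the Kafka value
val out = df.select(to_avro(struct(df.columns.map(col): _*)).as("value"))
out.writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // assumption
  .option("topic", "avro-events")                      // assumption
  .option("checkpointLocation", "/tmp/avro-ckpt")      // required by streaming sinks
  .start()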
This works with Parquet:
val sqlDF = spark.sql("SELECT DISTINCT field FROM parquet.`file-path`")
I tried the same approach with Avro, but it keeps giving me an error even if I use com.databricks.spark.avro.
When I execute the following query:
val sqlDF…
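For comparison, the same path-qualified query does work for Avro once an avro data source is actually on the classpath. A sketch assuming the Spark 2.4+ built-in module and a hypothetical path:

// requires e.g. --packages org.apache.spark:spark-avro_2.11:2.4.3 at launch
val sqlDF = spark.sql("SELECT DISTINCT field FROM avro.`/data/input/file.avro`")
sqlDF.show()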
I'm pushing a stream of data to Azure EventHub with the following code, leveraging Microsoft.Hadoop.Avro. This code runs every 5 seconds and simply writes the same two Avro-serialised items:
var strSchema = File.ReadAllText("schema.json");
var…
I have a set of Avro-based Hive tables, and I need to read data from them. Since Spark SQL uses the Hive serdes to read the data from HDFS, it is much slower than reading HDFS directly. So I have used the Databricks spark-avro jar to read the Avro files from…
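A sketch of that direct read, pointing spark-avro at the table's warehouse location and registering a temp view for SQL (Spark 2.x assumed; paths and names are assumptions):

val df = spark.read
  .format("com.databricks.spark.avro")
  .load("hdfs:///user/hive/warehouse/mydb.db/mytable") // the table's storage location
df.createOrReplaceTempView("mytable_direct") // query this instead of the Hive serde path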
I am fetching data from Kafka and then deserialising the Array[Byte] with the default decoder. After that, my RDD elements look like (null,[B@406fa9b2), (null,[B@21a9fe0), but I want my original data, which has a schema. How can I achieve this?
I…
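One way I've seen this handled is Twitter's bijection-avro: build an Injection from the writer schema and invert the byte arrays back into GenericRecords. A sketch, assuming schemaJson holds the writer schema as a JSON string and rawRdd is the (null, Array[Byte]) RDD from the question:

import com.twitter.bijection.Injection
import com.twitter.bijection.avro.GenericAvroCodecs
import org.apache.avro.Schema
import org.apache.avro.generic.GenericRecord

val records = rawRdd.mapPartitions { iter =>
  // build the (non-serializable) injection once per partition, not on the driver
  val schema = new Schema.Parser().parse(schemaJson)
  val injection: Injection[GenericRecord, Array[Byte]] =
    GenericAvroCodecs.toBinary[GenericRecord](schema)
  iter.map { case (_, bytes) => injection.invert(bytes).get }
}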
I am using Spark 1.6 and I aim to create an external Hive table, as I would in a Hive script. To do this, I first read in the partitioned Avro file and get its schema. Now I am stuck: I have no idea how to apply this schema to my…
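A Spark 1.6 sketch of one way to finish this: let spark-avro infer the schema, then register an external table over the same path (assumes sqlContext is a HiveContext; table name and path are assumptions):

val df = sqlContext.read.format("com.databricks.spark.avro").load("/data/events/")
df.printSchema() // the schema inferred from the Avro files, if you want to inspect it

// register an external table in the metastore, backed by the same Avro files
sqlContext.createExternalTable("events_ext", "com.databricks.spark.avro",
  Map("path" -> "/data/events/"))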
I have an Avro file created using the Java API. While the writer was writing data to the file, the program shut down ungracefully due to a machine reboot.
Now when I try to read this file using Spark/Hive, it reads some data and then throws…
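If the tail of the file is simply truncated, one hedged workaround (Spark 2.1+) is to let Spark keep whatever rows are readable instead of failing the whole job:

// rows already read before the corrupt block are returned; the unreadable tail is skipped
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")
val df = spark.read.format("avro").load("/data/partial.avro") // path is an assumption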
I converted a DataFrame's fields to an Avro struct using to_avro, and back using from_avro, as below. Ultimately I want to stream the Avro payload to Kafka for writing/reading.
When I try to print the final reconverted DataFrame with df.show(), it…
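For reference, a round-trip sketch of that conversion under Spark 3.x (column names and the schema string are assumptions; the schema must match what to_avro produced):

import org.apache.spark.sql.avro.functions.{from_avro, to_avro}
import org.apache.spark.sql.functions.{col, struct}

val jsonFormatSchema: String = ??? // Avro schema JSON matching the struct (assumption)

val avroDf = df.select(to_avro(struct(col("id"), col("name"))).as("value"))
val back = avroDf
  .select(from_avro(col("value"), jsonFormatSchema).as("data"))
  .select("data.*") // flatten the struct back into top-level columns
back.show()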
I know the problem of reading a large number of small files in HDFS has always been an issue and has been widely discussed, but bear with me. Most Stack Overflow questions dealing with this type of issue concern reading a large number of txt…
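That said, the usual remedy is the same for Avro as for text: compact the small files by reading them once and rewriting a handful of larger ones. A sketch with assumed paths and partition count:

val df = spark.read.format("avro").load("/data/small-files/") // directory of many small Avro files
df.repartition(16).write.format("avro").save("/data/compacted/") // 16 larger files instead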
I'm trying to run a Spark stream from a Kafka queue containing Avro messages.
As per https://spark.apache.org/docs/latest/sql-data-sources-avro.html, I should be able to use from_avro to convert the column value to a Dataset.
However, I'm unable to…
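For comparison, the pattern from that page looks roughly like this (Spark 3.x import shown; broker, topic, and schema file are assumptions). One gotcha: from_avro expects plain Avro binary, so Confluent-framed messages with the 5-byte header will not decode:

import java.nio.file.{Files, Paths}
import org.apache.spark.sql.avro.functions.from_avro
import org.apache.spark.sql.functions.col

val jsonFormatSchema = new String(Files.readAllBytes(Paths.get("user.avsc")))

val users = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "users")
  .load()
  .select(from_avro(col("value"), jsonFormatSchema).as("user"))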
I have code to convert my Avro record to a Row using the function avroToRowConverter():
directKafkaStream.foreachRDD(rdd -> {
JavaRDD newRDD = rdd.map(x -> {
Injection recordInjection =…
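In case it's useful, here is a hedged Scala sketch of what such a converter can look like: copy the Avro field values into a Row in schema order, normalising Utf8 strings (avroToRowConverter itself isn't shown in the question, so this is an assumption about its shape):

import org.apache.avro.generic.GenericRecord
import org.apache.spark.sql.Row
import scala.collection.JavaConverters._

def avroToRowConverter(record: GenericRecord): Row =
  Row.fromSeq(record.getSchema.getFields.asScala.map { f =>
    record.get(f.name) match {
      case s: org.apache.avro.util.Utf8 => s.toString // Avro strings arrive as Utf8
      case v => v // sketch only: nested records/arrays would need their own handling
    }
  })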