Questions tagged [hadoop-lzo]

Hadoop-LZO is a project to bring splittable LZO compression to Hadoop. LZO is an ideal compression format for Hadoop due to its combination of speed and compression ratio. However, LZO files are not natively splittable, which means the parallelism that is at the core of Hadoop is lost. This project re-enables that parallelism for LZO-compressed files, and it also comes with standard utilities (input/output streams, etc.) for working with LZO files.
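
The splittable read works in two steps: a .lzo.index side file is first generated with the bundled indexer (typically something like hadoop jar /path/to/hadoop-lzo.jar com.hadoop.compression.lzo.DistributedLzoIndexer <path>), and the data is then read through the input formats the project ships. Below is a minimal Scala driver sketch; the driver name is made up, and it assumes the hadoop-lzo jar and its native library are available on the cluster.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.Job
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
    import com.hadoop.mapreduce.LzoTextInputFormat // shipped by hadoop-lzo

    object LzoJobDriver { // hypothetical driver name
      def main(args: Array[String]): Unit = {
        val conf = new Configuration()
        // Register the LZO codecs so Hadoop can decompress .lzo input.
        conf.set("io.compression.codecs",
          "org.apache.hadoop.io.compress.DefaultCodec," +
            "com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec")
        conf.set("io.compression.codec.lzo.class", "com.hadoop.compression.lzo.LzoCodec")

        val job = Job.getInstance(conf, "lzo-example")
        job.setJarByClass(LzoJobDriver.getClass)
        // With a .lzo.index side file present, LzoTextInputFormat produces one
        // split per indexed block instead of one split for the whole file.
        // (Mapper/reducer setup is omitted; the defaults are identity classes.)
        job.setInputFormatClass(classOf[LzoTextInputFormat])
        FileInputFormat.addInputPath(job, new Path(args(0)))
        FileOutputFormat.setOutputPath(job, new Path(args(1)))
        System.exit(if (job.waitForCompletion(true)) 0 else 1)
      }
    }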

21 questions
14 votes, 4 answers

Class com.hadoop.compression.lzo.LzoCodec not found for Spark on CDH 5?

I have been working on this problem for two days and still have not found a way. Problem: our Spark, installed via the newest CDH 5, always complains about the missing LzoCodec class, even after I installed HADOOP_LZO through Parcels in Cloudera… (a configuration sketch follows this entry)
caesar0301 • 1,913
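
In situations like the question above, the usual culprit is that the hadoop-lzo jar (and its native library) are not on Spark's driver and executor classpaths (e.g. via spark.driver.extraClassPath / spark.executor.extraClassPath or --jars). Here is a small, hedged Scala sketch for registering the codec and failing fast if the class still cannot be loaded; the application name is made up.

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("lzo-codec-check") // hypothetical app name
      // "spark.hadoop.*" properties are copied into the job's Hadoop Configuration.
      .set("spark.hadoop.io.compression.codecs",
        "org.apache.hadoop.io.compress.DefaultCodec," +
          "com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec")
    val sc = new SparkContext(conf)

    // Throws ClassNotFoundException immediately if hadoop-lzo is still missing
    // from the classpath -- registering the codec name alone is not enough.
    Class.forName("com.hadoop.compression.lzo.LzoCodec")
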
4 votes, 1 answer

Read uncompressed Thrift files in Spark

I'm trying to get Spark to read uncompressed Thrift files from S3. So far it has not been working. Data is loaded into S3 as uncompressed Thrift files; the source is AWS Kinesis Firehose. I have a tool that deserializes the files with no problem, so I…
Martin Klosi • 3,098
3 votes, 1 answer

Trying to use LZO Compression with MapReduce

I want to use LZO compression in MapReduce, but I am getting an error when I run my MapReduce job. I am using Ubuntu with a Java program, and I am only trying to run this on my local machine. My initial error is ERROR lzo.GPLNativeCodeLoader: Could not…
Matt Cremeens • 4,951
2 votes, 1 answer

How does file compression format affect my Spark processing?

I am confused about splittable and non-splittable file formats in the big data world. I was using the ZIP file format, and I understood that ZIP files are non-splittable in the sense that when I processed such a file I had to use ZipFileInputFormat… (a small read sketch follows this entry)
user9175539
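
To make the splittability difference concrete, here is a small Spark/Scala sketch. The path and application name are made up, and it assumes the .lzo file has already been indexed with hadoop-lzo so a .lzo.index side file sits next to it.

    import com.hadoop.mapreduce.LzoTextInputFormat
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("lzo-splits").getOrCreate()

    // A .gz or .zip input normally arrives as a single partition; an indexed
    // .lzo file read through LzoTextInputFormat gets one partition per block.
    val lines = spark.sparkContext
      .newAPIHadoopFile[LongWritable, Text, LzoTextInputFormat]("hdfs:///data/events.lzo")
      .map { case (_, text) => text.toString }

    println(s"number of input partitions: ${lines.getNumPartitions}")
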
2 votes, 1 answer

Why does my LZO indexing take so long on Amazon's EMR when reading from S3?

I have a 30 GB LZO file on S3, and I'm using hadoop-lzo to index it with Amazon EMR (AMI v2.4.2) in region us-east1. elastic-mapreduce --create --enable-debugging \ --ami-version "latest" \ --log-uri s3n://mybucket/mylogs \ --name…
Dolan Antenucci • 15,432
1 vote, 0 answers

Preparing LZO or LZ4 files for Spark

I'm trying to choose the right format for file exchange with my Spark application. I use Spark 2.4.7 + Hadoop 2.10 on Kubernetes. My app downloads a CSV file from S3 and processes it. The file is provided by a third-party company. I was thinking about…
Matzz • 670
1 vote, 1 answer

native-lzo not available error | Windows 10 | Java

Exception in thread "main" java.lang.reflect.InvocationTargetException at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at…
1 vote, 1 answer

Compression codec com.hadoop.compression.lzo.LzoCodec was not found

Trying to run a MapReduce job with compression: hadoop jar \ /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \ randomtextwriter \ -Ddfs.replication=1 -Dmapreduce.output.fileoutputformat.compress=true…
1 vote, 1 answer

Java Hadoop-lzo Found interface but class was expected LzoTextInputFormat

I'm trying to use the Hadoop-LZO package (built using the steps here). It seems like everything worked successfully, as I was able to convert my LZO files to indexed files via the following (this returns big_file.lzo.index as expected): hadoop jar…
Sal • 1,653
1 vote, 0 answers

Reading Avro container files in Spark

I am working on a scenario where I need to read Avro container files from HDFS and do analysis using Spark. Input files directory: hdfs:///user/learner/20151223/.lzo* Note: the input Avro files are LZO-compressed. val df =…
Govind • 419
1 vote, 1 answer

LZO codec difference between Python and Java

I am running into a strange problem: I fail to inflate/decompress LZO-compressed data in Java that was deflated/compressed with the Python lzo module, even though both appear to use the same native LZO codec implementation. To give more details, I am…
user352951 • 271
0 votes, 1 answer

How to decompress an LZO file using Java (using the lzo-core library)

I am running into an issue while trying to decompress an LZO file using Java. Below I have pasted the code and the error; can someone please help me with this? import org.anarres.lzo.*; import java.io.*; public class… (a hedged decompression sketch follows this entry)
Pritam007 • 31
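
Since that question already imports org.anarres.lzo, here is a hedged Scala sketch in the same direction (Scala for consistency with the other examples). The file names are hypothetical, and it assumes the input carries an lzop header (i.e. it was written by the lzop tool or by hadoop-lzo's LzopCodec), which is what LzopInputStream expects; for raw LZO1X block data, the library's LzoInputStream/LzoDecompressor classes are the relevant entry points instead.

    import java.io.{BufferedInputStream, FileInputStream, FileOutputStream}
    import org.anarres.lzo.LzopInputStream

    // Hypothetical paths.
    val in = new LzopInputStream(
      new BufferedInputStream(new FileInputStream("big_file.lzo")))
    val out = new FileOutputStream("big_file.txt")

    // Stream the decompressed bytes straight to the output file.
    val buf = new Array[Byte](64 * 1024)
    Iterator.continually(in.read(buf)).takeWhile(_ != -1).foreach(n => out.write(buf, 0, n))

    in.close()
    out.close()
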
0 votes, 0 answers

Hive cannot find LZO codec

An error occurred when executing select * from xxx: Failed with exception java.io.IOException: java.io.IOException: No LZO codec found, cannot run. Troubleshooting done: checked that hadoop-lzo.jar is located in $HADOOP_HOME/share/hadoop/common for all Hadoop…
Steven • 21
0 votes, 1 answer

Reading LZO file of json lines in Spark DataFrames

I have a large indexed LZO file in HDFS that I would like to read into Spark DataFrames. The file contains lines of JSON documents. posts_dir='/data/2016/01' posts_dir contains the following: /data/2016/01/posts.lzo /data/2016/01/posts.lzo.index The… (a read sketch follows this entry)
Majid Alfifi • 568
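
A hedged way to approach that question, assuming Spark 2.2+ (where spark.read.json accepts a Dataset[String]); the directory follows the question, and the application name is made up.

    import com.hadoop.mapreduce.LzoTextInputFormat
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("lzo-json").getOrCreate()
    import spark.implicits._

    val postsDir = "/data/2016/01" // as in the question

    // The .lzo.index side file lets LzoTextInputFormat split posts.lzo,
    // so the JSON lines are read in parallel rather than as one big split.
    val jsonLines = spark.sparkContext
      .newAPIHadoopFile[LongWritable, Text, LzoTextInputFormat](postsDir)
      .map { case (_, line) => line.toString }

    val posts = spark.read.json(jsonLines.toDS())
    posts.printSchema()
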
0 votes, 1 answer

Hadoop lzo single split after index

I have an LZO-compressed file /data/mydata.lzo and want to run it through some MapReduce code I have. I first create an index file using the hadoop-lzo package with the following command: >> hadoop jar hadoop-lzo-0.4.21.jar \ …
Sal • 1,653
Page 1 of 2