Questions tagged [flume]

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault-tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple, extensible data model that allows for online analytic applications.

Web site: http://flume.apache.org/
Git Repo: https://github.com/apache/flume
User Guide: http://flume.apache.org/FlumeUserGuide.html

1136 questions

7 answers

What's the difference between Flume and Sqoop?

Both Flume and Sqoop are meant for data movement, so what is the difference between them? Under what conditions should I use Flume or Sqoop?

hadoop sqoop flume

asked Oct 22 '13 at 15:08 by Cacheing (3,431 reputation; 20 gold, 46 silver, 65 bronze badges)

1 answer

flume vs kafka vs others

Maybe this question has been asked before, but I think it is good to consider it again today given that these technologies have matured. We're looking to use one of Flume, Kafka, Scribe, or others to store streaming Facebook and Twitter profile…

scribe flume

asked Sep 24 '12 at 05:55 by pranavsharma (1,085 reputation; 2 gold, 10 silver, 18 bronze badges)

3 answers

What is the most mature library for building a Data Analytics Pipeline in Java/Scala for Hadoop?

I found many options recently and am interested in how they compare, primarily in maturity and stability. Crunch - https://github.com/cloudera/crunch Scrunch - https://github.com/cloudera/crunch/tree/master/scrunch Cascading -…

scala hadoop cascading flume

asked Feb 24 '12 at 08:59 by yura (14,489 reputation; 21 gold, 77 silver, 126 bronze badges)

5 answers

Rebalancing issue while reading messages in Kafka

I am trying to read messages from a Kafka topic, but I am unable to read them. The process gets killed after some time, without reading any messages. Here is the rebalancing error which I get: [2014-03-21 10:10:53,215] ERROR Error processing message,…

message-queue apache-zookeeper flume apache-kafka

asked Mar 21 '14 at 15:50 by divinedragon (5,105 reputation; 13 gold, 50 silver, 97 bronze badges)

2 answers

How to efficiently move data from Kafka to an Impala table?

Here are the steps to the current process: Flafka writes logs to a 'landing zone' on HDFS. A job, scheduled by Oozie, copies complete files from the landing zone to a staging area. The staging data is 'schema-ified' by a Hive table that uses the…

hadoop apache-kafka flume impala

asked Jan 25 '16 at 23:54 by Alex Woolford (4,433 reputation; 11 gold, 47 silver, 80 bronze badges)

6 answers

failing to load log4j2 while running fatjar

I am working on a project where I use log4j2 logging. While developing in IntelliJ, everything works fine and logging is done as expected. The log4j2.xml is linked through a Java property passed to the JVM on startup via IntelliJ settings. But once I…

java logging log4j2 flume flume-ng

asked Dec 08 '14 at 15:17 by atarno

7 answers

JMeter - Could not find the TestPlan class

I have a simple Flume setup with an HTTP source and a sink that writes the POST request payload to a file. (This complete setup is on a Linux machine.) After that, my task is to do a performance test on this setup. So I decided to use JMeter (this is…

linux jmeter flume

asked Sep 12 '13 at 08:02 by Himanshu (1,433 reputation; 4 gold, 24 silver, 35 bronze badges)

3 answers

How to set up an HTTP Source for testing a Flume setup?

I am a newbie to Flume and Hadoop. We are developing a BI module where we can store all the logs from different servers in HDFS. For this I am using Flume. I just started trying it out. I successfully created a node, but now I want to set up an HTTP…

java hadoop flume

asked Sep 06 '13 at 12:09 by Himanshu (1,433 reputation; 4 gold, 24 silver, 35 bronze badges)

1 answer

How to configure Flume to listen to a web API's HTTP requests

I have built a web API application, which is published on an IIS server. I am trying to configure Apache Flume to listen to that web API and save the responses to HTTP requests in HDFS. This is the POST method that I need to listen to: [HttpPost] …

hadoop asp.net-web-api hdfs flume flume-ng

asked Oct 03 '17 at 14:30 by MelgoV

2 answers

Apache Flume vs Apache Flink difference

I need to read a stream of data from some source (in my case it's a UDP stream, but it shouldn't matter), transform each record, and write it to HDFS. Is there any difference between using Flume or Flink for this purpose? I know I can use…

flume apache-flink flume-ng flink-streaming

asked Oct 04 '16 at 16:59 by Kateryna Khotkevych (1,248 reputation; 1 gold, 12 silver, 22 bronze badges)

0 answers

Scribe, Flume and Chukwa - what are alternatives?

I would like to learn about alternatives to those projects, especially ones designed to aggregate data from logs from multiple nodes (>500) and designed for low memory/CPU usage. I'm familiar with Scribe, Flume, and Chukwa, and I think that they use too…

logging flume scribe-server chukwa

asked Aug 30 '10 at 14:42 by wlk (5,695 reputation; 6 gold, 54 silver, 72 bronze badges)

3 answers

real time log processing using apache spark streaming

I want to create a system where I can read logs in real time and use Apache Spark to process them. I am unsure whether I should use something like Kafka or Flume to pass the logs to Spark Streaming, or pass the logs using sockets. I have gone…

apache-spark apache-kafka flume spark-streaming

asked Feb 22 '15 at 07:03 by Y0gesh Gupta (2,184 reputation; 5 gold, 40 silver, 56 bronze badges)

2 answers

Transferring files from remote node to HDFS with Flume

I have a bunch of binary files compressed into *gz format. These are generated on a remote node and must be transferred to HDFS, located on one of the datacenter's servers. I'm exploring the option of sending the files with Flume; I explore the option…

hadoop hdfs bigdata flume

asked Oct 02 '14 at 20:09 by Mister Arduino

5 answers

How to install and configure apache flume?

I am new to Apache Flume. I need to install Flume on top of an HDFS cluster environment. I Googled it, and everyone says to use the Cloudera distribution, but I need to install and configure it from source. So can anyone please suggest to me,…

flume

asked Jan 05 '13 at 09:03 by venkat

4 answers

Retrieving timestamp from hbase row

Using the HBase API (Get/Put) or the HBQL API, is it possible to retrieve the timestamp of a particular column?

java hbase flume

asked Nov 30 '11 at 05:58 by Abhijeet Pathak (1,948 reputation; 3 gold, 20 silver, 28 bronze badges)
