
I'm trying to write log files to HDFS on EMR using Flume, but I'm facing an error.

I have Flume (version 1.6.0) on machine X and another Flume agent running on machine Y, which is on AWS. I want to ship my log files from machine X through machine Y into HDFS on AWS (EMR), but the agent on machine Y encounters an error when it runs.

My machine X config:

agent.sources = localsource
agent.channels = memoryChannel
agent.sinks = avro_Sink
agent.sources.localsource.type = spooldir
agent.sources.localsource.spoolDir = /home/dwh/teja/Flumedata/
agent.sources.localsource.fileHeader = true
agent.sources.localsource.channels = memoryChannel
agent.sinks.avro_Sink.channel = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 10000
agent.channels.memoryChannel.transactionCapacity = 1000
agent.sinks.avro_Sink.type = avro
agent.sinks.avro_Sink.hostname= ec2-serverid.compute-1.amazonaws.com
agent.sinks.avro_Sink.port= 8021
agent.sinks.avro_Sink.avro.batchSize = 100
agent.sinks.avro_Sink.avro.rollCount = 0
agent.sinks.avro_Sink.avro.rollSize = 73060835
agent.sinks.avro_Sink.avro.rollInterval = 0
agent.sources.localsource.interceptors = search-replace regex-filter1
agent.sources.localsource.interceptors.search-replace.type = search_replace
agent.sources.localsource.interceptors.search-replace.searchPattern = ###|##
agent.sources.localsource.interceptors.search-replace.replaceString = |

My machine Y config:

tier1.sources  = source1
tier1.channels = channel1
tier1.sinks    = sink1
tier1.sources.source1.type = avro
tier1.sources.source1.bind=serverid
tier1.sources.source1.port = 8021
tier1.sources.source1.channels = channel1
tier1.channels.channel1.type= memory
tier1.sinks.sink1.type     = hdfs
tier1.sinks.sink1.channel    = channel1
tier1.sinks.sink1.hdfs.path = hdfs://serverid:8020/user/hadoop/flumelogs/
tier1.sinks.sink1.hdfs.filePrefix = Flumedata
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.writeFormat= Text
tier1.sinks.sink1.hdfs.batchSize = 10000
tier1.sinks.sink1.hdfs.rollCount = 0
tier1.sinks.sink1.hdfs.rollSize = 73060835
tier1.sinks.sink1.hdfs.rollInterval = 0
tier1.channels.channel1.capacity = 10000
tier1.channels.channel1.transactionCapacity = 1000

ERROR log:

2016-06-08 15:19:01,635 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:455)] HDFS IO error
org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:243)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:235)
    at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:679)
    at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
    at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:676)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

If anyone is familiar with this, please help.

Comments:
  • Hi Teja, welcome to SO. Are you using your own code and getting this error? If so, you must post the relevant parts. Otherwise, detail the steps that led you to this error. – J. Chomel Jun 09 '16 at 06:19
  • Hi Chomel, as you asked, I posted my config files. – Teja Kumarreddy Jun 09 '16 at 07:05
  • This is usually a Hadoop version mismatch. See http://stackoverflow.com/questions/31453336/exception-in-thread-main-org-apache-hadoop-ipc-remoteexception-server-ipc-ver or http://stackoverflow.com/questions/23634985/error-when-trying-to-write-to-hdfs-server-ipc-version-9-cannot-communicate-with. – Binary Nerd Jun 09 '16 at 07:49
  • Hi Binary Nerd, thanks a lot for your reply. I saw those links previously but didn't get clarity on where the POM file is located or which versions are mismatched. – Teja Kumarreddy Jun 09 '16 at 08:06
  • Hi Binary Nerd, as you said, I changed the POM file but am still facing the same issue. – Teja Kumarreddy Jun 09 '16 at 09:15
  • Teja, for Binary Nerd to get notified, you need to prefix his name with `@`: @Binary Nerd – J. Chomel Jun 09 '16 at 09:29
  • Does this one help: http://stackoverflow.com/questions/35173503/hdfs-io-error-org-apache-hadoop-ipc-remoteexception-server-ipc-version-9-cannot (seems to match your situation pretty well)? – Binary Nerd Jun 09 '16 at 10:32
  • Of course I tried that too; in that link, as he mentioned, I added all those jars but get the same error. @Binary Nerd – Teja Kumarreddy Jun 09 '16 at 10:54
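The version mismatch the comments point at can be sketched as follows: "Server IPC version 9" corresponds to a Hadoop 2.x NameNode (which EMR runs), while "client version 4" means the Flume agent on machine Y is loading Hadoop 1.x client jars. A rough way to check and fix this on machine Y might look like the sketch below; the paths are assumptions (common EMR/Flume defaults) and will differ per install:

```shell
# Sketch, assuming Flume is installed under /usr/lib/flume and the
# cluster's Hadoop 2.x client jars live under /usr/lib/hadoop and
# /usr/lib/hadoop-hdfs -- verify the actual paths on your node.

# 1. Confirm the cluster side really is Hadoop 2.x:
hadoop version

# 2. Look for stray Hadoop 1.x jars on Flume's classpath
#    (hadoop-core-*.jar is a 1.x-era artifact; Hadoop 2.x splits it
#    into hadoop-common, hadoop-hdfs, etc.):
ls /usr/lib/flume/lib | grep -i hadoop

# 3. Remove the old jars from Flume's lib directory and point Flume
#    at the cluster's own client jars instead, e.g. in conf/flume-env.sh:
export FLUME_CLASSPATH="/usr/lib/hadoop/*:/usr/lib/hadoop-hdfs/*"
```

Restarting the tier1 agent after the classpath change would make it pick up the Hadoop 2.x client, which can talk to the EMR NameNode's IPC version 9.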
