I am trying to use a file as the source for my Kafka producer. The source file grows continuously (say, 20 records/lines per second). Below is a post similar to my problem:
How to write a file to Kafka Producer
But in that case, the whole file is read and added to the Kafka topic every time a new line is inserted into the file. I want only the newly appended lines to be sent to the topic (i.e. if the file already holds 10 lines and 4 more are appended, only those 4 lines should be sent to the topic).
Is there a way to achieve this?
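To make the requirement concrete, this is roughly the behavior I am after (a minimal sketch assuming the kafka-python client; the broker address, topic name, and file path are placeholders for my actual setup):

    import time
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    with open("/path/file-name") as f:
        f.seek(0, 2)  # start at end of file: skip the lines already present
        while True:
            line = f.readline()
            if line:
                # send only the newly appended line to the topic
                producer.send("my-topic", line.rstrip("\n").encode("utf-8"))
            else:
                time.sleep(0.05)  # no new data yet; poll again shortly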
Other solutions tried:
Apache Flume with the source type set to 'spooldir'. This was of no use, since it only reads new files added to the directory, not data appended to an already-read file.
We also tried Flume with the source type 'exec' and the command 'tail -F /path/file-name' (a rough sketch of the config is below). This too doesn't seem to work.
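For reference, the exec-source agent was configured roughly like this (a sketch only; the agent, channel, and sink names are placeholders, and the Kafka sink property names shown are the Flume 1.7+ ones):

    # Sketch of the Flume agent config we tried (names are placeholders)
    agent.sources = tail-src
    agent.channels = mem-ch
    agent.sinks = kafka-sink

    agent.sources.tail-src.type = exec
    agent.sources.tail-src.command = tail -F /path/file-name
    agent.sources.tail-src.channels = mem-ch

    agent.channels.mem-ch.type = memory

    agent.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
    agent.sinks.kafka-sink.kafka.bootstrap.servers = localhost:9092
    agent.sinks.kafka-sink.kafka.topic = my-topic
    agent.sinks.kafka-sink.channel = mem-ch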
Suggestions for any other tool are also welcome, as my objective is to read the data from the file in real time (i.e. I need the data as soon as it is appended to the file).