
Here is the scenario I'm trying to implement:

1. Remote to an FTP server.
2. Copy a large file (3 GB+ in size) to a local folder.
3. Stream the local file into a Camel Processor, batching the file 100 lines at a time.
4. Write each batched set of lines out to a Kafka topic.

Now I've got the first part figured out. I'm able to read the file into a local directory. The problem is, how do I kick off the second route (streaming the local file to Kafka)? Is there a way to chain all of these tasks together in the same route, or should I have multiple routes:

1 for the FTP -> LOCAL FILE and then 1 for the LOCAL FILE -> KAFKA

If I need two routes, what's the best way to kick off the second route after the first route is done?

Thanks for any assistance. Additionally, here is the FTP portion that already works.

public void configure() throws Exception {
    from(fullyBuiltFtpPath)
            .routeId("FTP ENDPOINT CONSUMER" + UUID.randomUUID().toString())
            .process(new FtpToLocalFileProcessor())
            .to("file:c:\\temp")
            .log(LoggingLevel.INFO, "FILENAME: ${header.CamelFileName}").end();
}
Kenster
h0mer
  • You can use the [file component](http://camel.apache.org/file2.html) to monitor the directory where you save the 100 line batches in files. But why don't you send the batches straight to Kafka? Do you need to do something else with the files? – Ralf Apr 15 '16 at 07:31
  • That is another option, but I'm unaware of how to batch the files directly from the FTP route. Is there a component that would let me stream it in after the FTP route has copied the file to a localTempDirectory? – h0mer Apr 16 '16 at 16:51
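Following Ralf's suggestion, the batching can be done directly in the same route with Camel's Splitter EIP: `tokenize("\n", 100)` groups 100 lines per exchange, and `.streaming()` avoids loading the whole 3 GB file into memory. This is an untested sketch; the file path, topic name, and broker address are placeholders, and it assumes the camel-kafka component is on the classpath.

```java
// Watch the local directory the FTP route writes to, then stream
// each file to Kafka in 100-line batches (hypothetical endpoints).
from("file:c:/temp?delete=true")
        .routeId("LOCAL-FILE-TO-KAFKA")
        // split on newlines, grouping 100 lines per exchange;
        // streaming() reads the file lazily instead of into memory
        .split().tokenize("\n", 100).streaming()
            .to("kafka:myTopic?brokers=localhost:9092")
        .end();
```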

2 Answers


It's not incorrect to produce a file in a folder and consume it at the same time in a Linux environment, but this depends on the environment. However, Camel provides a useful mechanism, the "doneFileName" option, which can be specified on both the consumer and the producer. More details here: http://camel.apache.org/file2.html

You can find more details at the section "Consuming files from folders where others drop files directly".
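To illustrate the idea, the producer can write a marker file once the payload is complete, and the consumer will only pick up files whose marker exists. This is an untested sketch; the directory, topic, and broker URIs are placeholders.

```java
// Route 1: download from FTP and signal completion with a .done file.
// The file component writes <name>.done only after <name> is fully written.
from(fullyBuiltFtpPath)
        .routeId("FTP-TO-LOCAL")
        .to("file:c:/temp?doneFileName=${file:name}.done");

// Route 2: consume only files whose matching .done marker exists,
// so partially transferred files are never picked up.
from("file:c:/temp?doneFileName=${file:name}.done")
        .routeId("LOCAL-TO-KAFKA")
        .to("kafka:myTopic?brokers=localhost:9092");
```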

Jonathan Schoreels

I ended up splitting the routes into two distinct routes:

1.) Retrieve the file from the FTP server and store it in a local temp directory.
2.) Start up a file route that listens to the local temp directory and consumes the file.

This isn't ideal, but it works for now. Thanks for the help.
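Putting both routes in one RouteBuilder could look something like the following. This is a hedged sketch, not the asker's exact code: the temp directory, topic, broker address, and the 100-line batch size are illustrative, and `fullyBuiltFtpPath` is the same FTP URI used in the question.

```java
import org.apache.camel.builder.RouteBuilder;

public class FtpToKafkaRoutes extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        // Route 1: pull the file from FTP into a local temp directory
        from(fullyBuiltFtpPath)
                .routeId("FTP-TO-LOCAL")
                .to("file:c:/temp/inbox");

        // Route 2: watch the temp directory and stream each file to
        // Kafka in 100-line batches, deleting the file when done
        from("file:c:/temp/inbox?delete=true")
                .routeId("LOCAL-TO-KAFKA")
                .split().tokenize("\n", 100).streaming()
                    .to("kafka:myTopic?brokers=localhost:9092")
                .end();
    }
}
```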

h0mer
  • how do we deal with situations where route 2 takes a long time to process? I don't want route 1 to fill up the filesystem. – dustmachine Nov 10 '16 at 14:45
  • In my case I did a file lock on route 1 (retrieve from FTP), and then when the lock was released, route 2 would pick up the file and continue its processing. – h0mer Nov 16 '16 at 21:06