require 'net/ftp'
require 'nokogiri'

server = "xxxxxx"
user = "xxxxx"
password = "xxxxx"

ftp = Net::FTP.new(server, user, password)

files = ftp.nlst('File*.xml')

files.each do |file|
   ftp.getbinaryfile(file)
   doc = Nokogiri::XML(open(file))
   # some operations with doc
end

With the code above I can parse the XML file, but only because it downloads the file first.

How can I parse a remote XML file without downloading it?

The code above is part of a rake task that loads the Rails environment when run.


UPDATE:

I'm not going to create any local file. I will import the data into MongoDB using Mongoid.

Askar
  • Do you mean without creating a local file? I.e. would copying (via FTP) the file contents into local memory be acceptable? – Neil Slater Jul 03 '13 at 09:28
  • No, I won't create a local file. I will parse it and import it into MongoDB with Mongoid. I will update my post with this info. – Askar Jul 03 '13 at 09:29
  • Would copying the XML source into memory, using FTP download, but without writing anything to your local file system, be acceptable? This is an important distinction - parsing the content without fetching it some way via FTP will require you to run a process on your FTP server, which is possible, but much more complicated. – Neil Slater Jul 03 '13 at 09:31

1 Answer

If you simply want to avoid using a temporary local file, it is possible to fetch the file contents directly as a String and process them in memory, by passing nil as the local file name:

files.each do |file|
   xml_string = ftp.getbinaryfile( file, nil )
   doc = Nokogiri::XML( xml_string )
   # some operations with doc
end

This still does an FTP fetch of the contents, and XML parsing happens at the client.
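
If you want to keep the fetching separate from the parsing, the loop above could be wrapped in a small helper. This is only a sketch: `each_remote_file` is a hypothetical name, not part of Net::FTP, and it assumes `ftp` is an already connected Net::FTP instance (or any object that responds to `nlst` and `getbinaryfile` in the same way).

```ruby
# Hypothetical helper (not part of Net::FTP): yields the name and the
# in-memory contents of every remote file matching `pattern`.
# Nothing is written to the local file system.
def each_remote_file(ftp, pattern)
  # Return an Enumerator when no block is given, so callers can chain.
  return enum_for(:each_remote_file, ftp, pattern) unless block_given?

  ftp.nlst(pattern).each do |name|
    # Passing nil as the local file name makes getbinaryfile return the
    # retrieved data as a String instead of saving it to disk.
    contents = ftp.getbinaryfile(name, nil)
    yield name, contents
  end
end

# Possible usage inside the rake task (server, user and password are the
# placeholders from the question):
#
#   require 'net/ftp'
#   require 'nokogiri'
#
#   Net::FTP.open(server, user, password) do |ftp|
#     each_remote_file(ftp, 'File*.xml') do |name, xml_string|
#       doc = Nokogiri::XML(xml_string)
#       # some operations with doc
#     end
#   end
```

Using the block form of `Net::FTP.open` closes the connection when the block finishes, even if parsing one of the files raises an error.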

It is not really possible to avoid fetching the data in some form or other. If FTP is the only protocol you have available, that means copying the data over the network with an FTP get. It is possible, though far more complicated, to add capabilities to your FTP (or other network) server so that it returns the data in some other form. That could include running the Nokogiri parsing on the server, but you would still need to serialise the end result, fetch it, and deserialise it on the client.

Neil Slater