I have a 50+ GB XML file which I initially tried to (man)handle with Nokogiri :)
The process got "Killed: 9", obviously :)
Now I'm wading into muddy threaded Ruby waters with this stab at it:
    #!/usr/bin/env ruby

    def add_vehicle index, str
      IO.write "ess_#{index}.xml", str
      #file_name = "ess_#{index}.xml"
      #fd = File.new file_name, "w"
      #fd.write str
      #fd.close
      #puts file_name
    end

    begin
      record = []
      threads = []
      counter = 1
      file = File.new("../ess2.xml", "r")
      while (line = file.gets)
        case line
        when /<ns:Statistik/
          record = []
          record << line
        when /<\/ns:Statistik/
          record << line
          puts "file - %s" % counter
          threads << Thread.new { add_vehicle counter, record.join }
          counter += 1
        else
          record << line
        end
      end
      file.close
      threads.each { |thr| thr.join }
    rescue => err
      puts "Exception: #{err}"
      err
    end
Somehow this code 'skips' one or two of the result files: they are never written. Hmmm, why!?
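
A likely explanation (an assumption, not verified against the real file): the block given to Thread.new closes over the variables counter and record, not their values at that moment. By the time a thread actually runs, the main loop may already have incremented counter or reassigned record, so two threads can end up writing the same file name and one index is never used. A minimal sketch of one way around this is to hand the current values to Thread.new as arguments. The names split_records and dir are hypothetical, introduced only so the example is self-contained, and it assumes the opening and closing tags sit on separate lines as in the code above:

```ruby
# Hypothetical variant of add_vehicle that takes a target directory.
def add_vehicle(dir, index, str)
  File.write(File.join(dir, "ess_#{index}.xml"), str)
end

# Reads io line by line and writes one file per <ns:Statistik> element.
# Returns the number of records found.
def split_records(io, dir)
  record  = []
  threads = []
  counter = 0

  io.each_line do |line|
    case line
    when /<ns:Statistik/
      record = [line]
    when /<\/ns:Statistik/
      record << line
      counter += 1
      # Arguments to Thread.new are evaluated here, in the main thread,
      # and handed to the block as thread-local parameters, so later
      # changes to counter or record cannot affect this thread.
      threads << Thread.new(counter, record.join) do |index, xml|
        add_vehicle(dir, index, xml)
      end
    else
      record << line
    end
  end

  threads.each(&:join)
  counter
end
```

Called as split_records(File.open("../ess2.xml"), "."), this should behave like the original loop. Since each write is small and the reading is sequential anyway, calling add_vehicle directly without any threads would avoid the problem altogether and also avoid creating one thread per record.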