I'm doing simple parsing of large files: I'm trying to select blocks from a large text file and write those blocks into a new text file. My current method works very slowly because the files being parsed contain more than 3 million lines. For example, a file to parse:
1test
1111
2222
3333
4444
1test
5555
6666
2test
5555
4444
3test
0000
4test
9999
0000
5test
3333
3333
8test
2222
9test
6666
11test
1111
I want to get the following data in the new file:
1test
1111
2222
3333
4444
1test
5555
6666
2test
5555
4444
3test
0000
4test
9999
0000
5test
3333
3333
In short, I'm trying to select specific blocks from the source file.
My Code:
arr = []
data = File.read("/path/to/file")
blocks = ['1test', '2test', '3test', '4test', '5test']

blocks.each do |block|
  # Grab everything from one occurrence of the marker to the next (multiline match).
  want = data.match(/#{block}(.*)#{block}/m)[0]
  want.each_line do |line|
    arr << line
    # Rewrite the whole result file on every line.
    File.open("/path/to/result/file", 'w') { |file| file.write(arr.join) }
  end
end
I think my problem is that I write the "want" data to the file many times. Is there a way to write to the result file in a single pass over the "want" data?
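For context, here is a minimal sketch of the single-pass direction I have in mind (file names are placeholders, and it assumes every marker line ends in "test"): stream the source line by line, let each marker line decide whether the lines after it are copied, and append to the output as you go instead of rewriting it.

```ruby
# Build a small sample input shaped like the data in the question
# (sample_input.txt / sample_result.txt are placeholder names).
input = <<~DATA
  1test
  1111
  2222
  8test
  9999
  2test
  5555
DATA
File.write("sample_input.txt", input)

# Single pass: a marker line (ending in "test") decides whether the
# lines that follow it are copied, until the next marker appears.
blocks = ['1test', '2test', '3test', '4test', '5test']

File.open("sample_result.txt", "w") do |out|
  copying = false
  File.foreach("sample_input.txt") do |line|
    marker = line.chomp
    copying = blocks.include?(marker) if marker.end_with?("test")
    out.write(line) if copying
  end
end
```

With the sample above, the result file keeps the `1test` and `2test` blocks and skips the `8test` block, and the source is read exactly once via `File.foreach`, so memory stays flat even for millions of lines.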