I have a function which receives a Stream<String>. This stream represents the lines of a file (as returned by Files.lines(somePath)). The file itself is actually the concatenation of many files into a single file, something like this:
__HEADER__
# for file 1
data
more data
...
__HEADER__
# file 2 starts here
some more data...
...
I need to convert the stream into multiple physical files on the filesystem.
I've tried the simple approach, something along the lines of:
String allLinesJoined = lineStream.collect(Collectors.joining("\n"));
// This solution seems to get stuck on the line above ^
String[] files = allLinesJoined.split("__HEADER__");
for (String fileStr : files)
{
    // This function will write each fileStr to a separate file
    // (filename is determined by contents of fileStr)
    writeToPhysicalFile(fileStr);
}
But the input file is about 300 MB (and could get larger), and this solution seems to get stuck on the first line. Maybe it would complete if I had more memory...?
Is there a better way to do this if my starting point is a Stream<String>, or should I start making other changes so that this bit of code can just read through the file line by line, without using the streaming API?
(the order of the lines does matter, in the context of these files)
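For reference, here is a rough sketch of the kind of line-by-line handling I'm imagining, still driven by the stream. The part-N output names and the HeaderSplitter class are just stand-ins; my real code would derive the filename from the header contents, like writeToPhysicalFile does above:

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class HeaderSplitter {

    // Consumes the lines in encounter order and starts a new output file
    // every time a __HEADER__ line is seen.
    public static void splitIntoFiles(Stream<String> lines) throws IOException {
        BufferedWriter[] writer = new BufferedWriter[1]; // mutable holder for use inside the lambda
        int[] fileCount = {0};                           // stand-in for the real filename logic
        try (Stream<String> s = lines) {
            s.forEachOrdered(line -> {
                try {
                    if (line.startsWith("__HEADER__")) {
                        // Close the previous file and open the next one
                        if (writer[0] != null) {
                            writer[0].close();
                        }
                        Path out = Paths.get("part-" + (fileCount[0]++) + ".txt");
                        writer[0] = Files.newBufferedWriter(out);
                    }
                    if (writer[0] != null) {
                        writer[0].write(line);
                        writer[0].newLine();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } finally {
            if (writer[0] != null) {
                writer[0].close();
            }
        }
    }
}

I'm not sure whether holding mutable state inside forEachOrdered like this is considered acceptable use of the stream API, which is part of why I'm asking.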
tl;dr
I need to turn one big file, represented as a Stream<String>, into many little files. Each little file begins with __HEADER__ and contains all the lines after it, up until the next __HEADER__. The current library uses streams to provide the file, but is it even worth trying to do this with streams, or will my life be easier if I change the library to offer non-stream functionality?