
I am trying to create a file reader object (from readFileSync) and serve the lines from a generator function. My intention is to pass this generator object to multiple functions and sequentially parse a file. However, after using the generator in a single function, the state of the generator shifts from suspended to closed. I come from a Python background, where this is a perfectly valid operation, and I would like to know what I am doing wrong here. Following is the code I used:

Generator function definition (I am using readFileSync, which is not async; please disregard that for the time being, as I am just trying to get the generator working):

import * as fs from "fs";

function* getFileGen(path: string) {
  // Read the whole file synchronously and split it into lines.
  const fileContent = fs
    .readFileSync(path, {
      encoding: "utf-8",
      flag: "r",
    })
    .split("\n");

  while (true) {
    const thisLine = fileContent.shift();
    // Note: "" is falsy, so this also stops at the first empty line.
    if (!thisLine) {
      break;
    }
    yield thisLine;
  }
}

The two functions in which I would like to use the generator:

function getFirstFew(stream: Generator) {
  let i = 0;
  for (const v of stream) {
    console.log(v);
    if (i > 1) {
      break;
    }
    i++;
  }
}

function getNextFew(stream: Generator) {
  let i = 0;
  for (const v of stream) {
    console.log(v);
    if (i > 7) {
      break;
    }
    i++;
  }
}

And finally, I create a generator and pass it sequentially to the two functions, each of which prints a number of lines:

const myStream = getFileGen('path/to/file');

getFirstFew(myStream);
getNextFew(myStream);

The first function executes correctly and prints 3 lines; however, by the time the generator is passed to getNextFew, it has already closed.

c00der

3 Answers


From the docs:

In for...of loops, abrupt iteration termination can be caused by break, throw or return. In these cases, the iterator is closed.

And, specifically:

Do not reuse generators

Generators should not be re-used, even if the for...of loop is terminated early, for example via the break keyword. Upon exiting a loop, the generator is closed and trying to iterate over it again does not yield any further results.

Emphasis mine.
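A quick way to see this (a minimal illustration):

function* numbers() {
  yield 1;
  yield 2;
  yield 3;
}

const gen = numbers();
for (const n of gen) {
  break; // exiting early makes for...of call gen.return()
}

console.log(gen.next()); // { value: undefined, done: true } -- the generator is closed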

I'll admit though, I'm not very strong with JS, so I can't recommend a comparable workaround. You may need to use a list or another strict structure that you have more control over.

You may be able to implement a tee function, comparable to Python's itertools.tee, that creates copies of the iterator, and then iterate over one of the copies.
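A minimal sketch of such a tee (the name, the two-way split, and the buffering strategy here are all just one possible design, not a standard API) might look like:

// Hypothetical tee: splits one iterable into two independent iterators.
// Values consumed by only one branch are buffered for the other.
function tee<T>(iterable: Iterable<T>): [Generator<T>, Generator<T>] {
  const iterator = iterable[Symbol.iterator]();
  const buffers: T[][] = [[], []];

  function* branch(i: number): Generator<T> {
    while (true) {
      if (buffers[i].length === 0) {
        const r = iterator.next();
        if (r.done) return;
        buffers[0].push(r.value);
        buffers[1].push(r.value);
      }
      yield buffers[i].shift()!;
    }
  }

  return [branch(0), branch(1)];
}

Note that, like Python's itertools.tee, this buffers without bound if one copy runs far ahead of the other.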

Carcigenicate
  • Thanks for your answer. Actually, before I looked at your answer I suspected for...of was the problem, and used something like while(line = this.stream.next().value){ ... //line }, which worked without terminating the generator. – c00der Mar 23 '21 at 01:50
  • @c00der Using that idea, you could create a `take_n` function that pulls `n` objects from an iterator and returns them. If I were in your place, I'd just whip up some helpers. That actually sounds like a chill afternoon mini-project. – Carcigenicate Mar 23 '21 at 01:58
  • That would be a cool idea. Will give it a shot. Actually, in the version of it I am implementing, only a few rows of the file are handled by one function and the rest by another. – c00der Mar 23 '21 at 02:02
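A takeN helper along the lines sketched in these comments (hypothetical name; it calls .next() directly, so the generator is never closed between calls) might look like:

// Hypothetical takeN: pulls up to n values from an iterator.
// It never calls .return(), so the underlying generator stays suspended.
function takeN<T>(iterator: Iterator<T>, n: number): T[] {
  const result: T[] = [];
  for (let i = 0; i < n; i++) {
    const r = iterator.next();
    if (r.done) break;
    result.push(r.value);
  }
  return result;
}

const myStream = getFileGen("path/to/file");
console.log(takeN(myStream, 3)); // first three lines
console.log(takeN(myStream, 8)); // next eight lines; the generator stays open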

This is a great application for Node's streams API.

You can use a generator as a source for a Node.js Readable stream, then duplicate it using PassThrough.

A bit like this answer here: Node.js Piping the same readable stream into multiple (writable) targets

Basically you can do this:

const fs = require("fs");
const { PassThrough, Readable } = require("stream");

function* getFileGen(path: string) {
  const fileContent = fs
    .readFileSync(path, {
      encoding: "utf-8",
      flag: "r",
    })
    .split("\n");

  while (true) {
    const thisLine = fileContent.shift();
    if (!thisLine) {
      break;
    }
    yield thisLine;
  }
}

// Set up per: https://nodejs.org/api/stream.html#streamreadablefromiterable-options
const fileReadlineStream = Readable.from(getFileGen("path/to/file"), { objectMode: false });

const firstFew = new PassThrough();
const nextFew = new PassThrough();

fileReadlineStream.pipe(firstFew);
fileReadlineStream.pipe(nextFew);

firstFew.on("data", (d) => {
    console.log("First few received:", d);
});

nextFew.on("data", (d) => {
    console.log("Next few received:", d);
});

If you want to transform that file in some way, you will be best off using a Transform stream. In execution you might want to prioritise one stream's processing over another; I'm not an expert on the exact behaviour, but generally you can handle that with buffering.
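For instance, a minimal Transform sketch (the upper-casing here is just a placeholder for whatever parsing you actually need) could be spliced into the pipeline before the split:

const { Transform } = require("stream");

// Placeholder transform: upper-cases each chunk before passing it on.
const upperCaser = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  },
});

fileReadlineStream.pipe(upperCaser).pipe(firstFew);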

Will Morgan

A for...of loop always closes the iterator by calling .return() on it at the end (if the method exists). You can prevent that by removing (overwriting) the method:

const myStream = getFileGen('path/to/file');
myStream.return = undefined;
getFirstFew(myStream);
getNextFew(myStream);

but that's crude. I'd rather do something like

function keptOpen(iterable) {
    const iterator = iterable[Symbol.iterator]();
    return {
        [Symbol.iterator]() { return this; },
        next(v) { return iterator.next(v); },
    };
}

and use it like

const myStream = keptOpen(getFileGen('path/to/file'));
getFirstFew(myStream);
getNextFew(myStream);

However, notice that this will work only with generator functions that don't care about being kept open. If they expect their .return() method to be called so that they can dispose of allocated resources, keptOpen will make them leak.
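For example (a contrived sketch), a generator that cleans up in a finally block never runs that cleanup once it is wrapped:

function* countWithCleanup() {
  try {
    let i = 0;
    while (true) yield i++;
  } finally {
    // Would normally run when for...of calls .return() on early exit.
    console.log("cleaning up");
  }
}

for (const n of keptOpen(countWithCleanup())) {
  if (n >= 2) break; // keptOpen exposes no .return(), so "cleaning up" never logs
}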

Bergi