I'm trying to programmatically supply the contents of an in-memory file to pdftotext, which is freely available at http://www.xpdfreader.com/about.html. Others seem to have done this; see "Passing string stored in memory to pdftotext, antiword, catdoc, etc".
I can already handle the output programmatically when the input is a file on disk:
const { spawn } = require( 'child_process' );
const fs = require( 'fs' );
const pdftotextExe = 'bin/bin64/pdftotext.exe';
const inputFile = 'simple.pdf';
const command = spawn( pdftotextExe, [inputFile, '-'], { stdio: ['pipe', 'pipe', 'pipe'] } );
command.stdout.on( 'data', chunk => console.log( `starting chars: ${chunk.toString( 'utf8' ).slice( 0, 5 )}` ) );
command.stdout.on( 'end', () => console.log( 'done run' ) );
command.stderr.on( 'data', ( err => console.log( `got error: >> ${err} <<` ) ) );
The above works, but supplying the input through stdin does not. The similar code below throws "TypeError [ERR_STREAM_NULL_VALUES]: May not write null values to stream":
const command = spawn( pdftotextExe, ['-', '-'], { stdio: ['pipe', 'pipe', 'pipe'] } );
fs.readFile( inputFile, {}, ( contents ) => {
  command.stdin.write( contents ); // write file contents
  command.stdin.end(); // end input
  command.stdout.on( 'data', chunk => console.log( `starting chars: ${chunk.toString( 'utf8' ).slice( 0, 5 )}` ) );
  command.stdout.on( 'end', () => console.log( 'done run' ) );
  command.stderr.on( 'data', ( err => console.log( `got error: >> ${err} <<` ) ) );
} );
What's wrong with the above code?
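One likely culprit, offered as a sketch rather than a verified fix: Node's fs.readFile callback receives ( err, data ), so in the snippet above `contents` is actually the error argument, which is null when the read succeeds, and that null is what gets written to stdin. The helper below, pipeFileThrough, is a hypothetical name (not part of any library); it reads the file with the two-argument callback and only then writes the bytes to the child's stdin.

```javascript
const { spawn } = require( 'child_process' );
const fs = require( 'fs' );

// Hypothetical helper: reads inputFile into memory, pipes the bytes to the
// child's stdin, and hands the collected stdout to done( err, output ).
function pipeFileThrough( exe, args, inputFile, done ) {
  // Make sure done is called only once, even if 'error' and 'close' both fire.
  let finished = false;
  const finish = ( err, output ) => {
    if ( !finished ) {
      finished = true;
      done( err, output );
    }
  };

  // The first callback argument is the error; the data is the second one.
  fs.readFile( inputFile, ( err, contents ) => {
    if ( err ) return finish( err );

    const child = spawn( exe, args, { stdio: ['pipe', 'pipe', 'pipe'] } );
    const chunks = [];

    child.stdout.on( 'data', chunk => chunks.push( chunk ) );
    child.stderr.on( 'data', msg => console.log( `got error: >> ${msg} <<` ) );
    child.on( 'error', finish ); // e.g. the executable was not found
    child.on( 'close', () => finish( null, Buffer.concat( chunks ) ) );

    // Swallow write errors such as EPIPE if the child exits before reading.
    child.stdin.on( 'error', () => {} );
    child.stdin.end( contents ); // write the file contents, then end input
  } );
}
```

With the names from the question this would be called as pipeFileThrough( pdftotextExe, ['-', '-'], inputFile, ( err, output ) => { ... } ). That call assumes this pdftotext build accepts '-' as the input file name, which is not confirmed here; if it does not, the write will succeed but pdftotext itself will report an error on stderr.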