I have an image encoded as a base64 String and I'm trying to POST it as a parameter to another REST API (http://ocrapiservice.com/documentation/).

I don't want to have to store the file on disk - I want to keep the file in memory because eventually I want to create the image using getUserMedia on the client and I'll likely use a hosted service that doesn't allow direct file IO.

The problem I have is that most examples I can find post images from disk using fs.createReadStream(somePath); e.g. https://github.com/mikeal/request/blob/master/README.md#forms

I'd prefer to use a library such as the request library but maybe that won't be possible.

The code I have right now is:

var fs = require( 'fs' );
var path = require( 'path' );

var request = require( 'request' );
var WS_URL = 'http://api.ocrapiservice.com/1.0/rest/ocr';

function ocr( postData ) {
    var r = request.post(WS_URL, completed);
    var form = r.form();

    form.append('apikey', 'REMOVED');
    form.append('language', 'en' );
    form.append('image', postData );

    function completed(error, response, body) {
        console.log( body );
    }
}

// This works
ocr( fs.createReadStream(path.join(__dirname, 'example.png' ) ) );

// This does not work
// ignore that it's being read from a file (the next line)
var base64Str = fs.readFileSync( 'example.png' ).toString('base64'); 

var buffer = new Buffer(base64Str, 'base64');
ocr( buffer.toString( 'binary' ) );

form.append accepts additional parameters, so extra headers can be set on the part if needed.

Is there a Stream wrapper of some kind that I can use? I've tried using this StringReader example and it is possible to modify it to at least send a filename and the correct Content-Type.

How do I post an in-memory file as a parameter to a web service?

Update:

I've fixed/updated the code above.

The response I get from the REST API listed above is:

HTTP/1.1 400 Bad Request

No file provided

Here's the actual code that I'm running: https://gist.github.com/leggetter/4968764

leggetter

3 Answers

In the stream version you are sending the raw bytes of the file piece by piece, but in the base64Image version you are working with a base64 string. Since the stream version works, the OCR API evidently expects the form field to contain raw binary data, so you need to decode the base64 data into a Buffer before sending it.

// Reading straight from a Buffer.
var imageData = fs.readFileSync('example.png');
ocr( imageData );

// Reading from a new Buffer created from a base64 string.
// (Buffer.from replaces the deprecated `new Buffer` constructor.)
var base64Image = '...';
ocr(Buffer.from(base64Image, 'base64'));

Also note, in your example code:

// This line:
var base64Image = new Buffer(imageData, 'binary').toString('base64');

// does the same thing as this, because 'imageData' is already a Buffer
var base64Image = imageData.toString('base64');
loganfsmyth

I am sure you can use Buffer directly for request, without having to wrap it into a Stream.

It's not working because the encodings are inconsistent. readFileSync returns a Buffer when no encoding is given, and a Buffer converts to and from strings as UTF-8 by default. Your code converts the data to a 'binary' string in between, which no longer matches the bytes held in the Buffer.

var imageData = fs.readFileSync('example.png');         // raw image bytes as a Buffer
var base64Image = imageData.toString('base64');         // base64-encoded string
var decodedImage = Buffer.from(base64Image, 'base64');  // Buffer of bytes decoded from the base64 string
ocr(imageData);     // you can send the file's Buffer directly
ocr(decodedImage);  // or send the Buffer decoded from the base64 string
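To make the encoding point concrete, here is a minimal sketch using only Node's built-in Buffer. It assumes the 'binary' string eventually gets written out as UTF-8, which is Node's default when a string is written without an explicit encoding. The eight bytes are the PNG signature, standing in for a real image.

```javascript
// Sketch: compare a correct base64 round trip with a detour through a 'binary' string.
// These 8 bytes are the PNG file signature, used here as stand-in image data.
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

const base64Str = original.toString('base64');
const decoded = Buffer.from(base64Str, 'base64');

// Decoding base64 into a Buffer restores the exact bytes.
const roundTripOk = decoded.equals(original);

// A 'binary' (latin1) string holds one character per byte...
const binaryStr = decoded.toString('binary');
// ...but serializing that string with the default UTF-8 encoding
// turns every byte >= 0x80 into two bytes, so the data no longer matches.
const reencoded = Buffer.from(binaryStr); // default encoding is 'utf8'
const corrupted = !reencoded.equals(original);

console.log({
  roundTripOk,
  corrupted,
  originalLength: original.length,
  reencodedLength: reencoded.length,
});
```

The takeaway is to keep the image as a Buffer all the way to the request and avoid turning it back into a string.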

For more encoding/decoding details, see "NodeJS base64 image encoding/decoding not quite working".

user568109

You should be able to use the base64-image-upload NPM package I whipped up, as I was having the same issue.

What this does is exactly the approach outlined by user568109, creating a buffer from the base64 string and POSTing it. The important part is that it abstracts away dealing with encodings, MIME types, and request headers, and it would greatly simplify your code.
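For readers who want to see what "dealing with encodings, MIME types, and request headers" involves, here is a rough sketch that builds a multipart/form-data body in memory using only Node built-ins. It is not the API of the package mentioned above: `buildMultipartBody` and its parameter names are hypothetical, and the filename and content type are assumed values. The file part carries a filename and Content-Type, which servers commonly use to recognise a part as an uploaded file.

```javascript
// Hypothetical helper (not part of any library): builds a multipart/form-data
// body in memory so that an image Buffer is sent as a file part.
function buildMultipartBody(fields, file) {
  const boundary = '----sketchBoundary' + Date.now().toString(16);
  const chunks = [];

  // Plain text fields such as the API key and language.
  for (const [name, value] of Object.entries(fields)) {
    chunks.push(Buffer.from(
      `--${boundary}\r\n` +
      `Content-Disposition: form-data; name="${name}"\r\n\r\n` +
      `${value}\r\n`
    ));
  }

  // The file part: headers as text, then the raw bytes untouched.
  chunks.push(Buffer.from(
    `--${boundary}\r\n` +
    `Content-Disposition: form-data; name="${file.field}"; filename="${file.filename}"\r\n` +
    `Content-Type: ${file.contentType}\r\n\r\n`
  ));
  chunks.push(file.data);
  chunks.push(Buffer.from(`\r\n--${boundary}--\r\n`));

  const body = Buffer.concat(chunks);
  return {
    body,
    headers: {
      'Content-Type': `multipart/form-data; boundary=${boundary}`,
      'Content-Length': body.length,
    },
  };
}

// Example input: a base64 string holding the 8-byte PNG signature.
const base64Image = 'iVBORw0KGgo=';

const { body, headers } = buildMultipartBody(
  { apikey: 'REMOVED', language: 'en' },
  {
    field: 'image',
    filename: 'example.png',   // assumed name, the image never touches disk
    contentType: 'image/png',  // assumed MIME type
    data: Buffer.from(base64Image, 'base64'),
  }
);

console.log(headers['Content-Type'], body.length);
```

The resulting body and headers could then be handed to whichever HTTP client is in use. Because the image bytes stay in a Buffer from decoding through to the request body, no string encoding is ever applied to them.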