
There are a few answers about this on Stack Overflow, but none of them work for me.

I have mixed data coming from a Google mailbox import into my Node.js app (some strings are plain UTF-8, some are base64-encoded), and I need to check for every string whether it is base64 or not.

Does anyone have a solution that works?

Data example

0KHQvdCw0YfQsNC70LAg0YHQvtC30LTQsNC10YLRgdGPINGA0LXRiNC10YLQutCwINC/0YDQvtGB
0YLRgNCw0L3RgdGC0LLQsCDQstC+0L7QsdGA0LDQttC10L3QuNGPLiZuYnNwOzxkaXY+PGJyPjwv
ZGl2PjxkaXY+0JfQsNGC0LXQvCDQvdCwINC90LXQtSDQvdCw0L3QuNC30YvQstCw0LXRgtGB0Y8g
0LLRi9C80YvRgdC10LsuPC9kaXY+PGRpdj48

I fetch the messages with the code from github.com/mscdex/node-imap, but I keep only the message text, i.e.

msg.on('body', function(stream, info) {
  stream.on('data', function(chunk) {
    count += chunk.length;
    buffer += chunk.toString('utf8');
  });
});
Aerodynamika
  • Look for the mime header - have a look here for example http://stackoverflow.com/questions/9110091/base64-encoded-images-in-email-signatures – mplungjan Nov 04 '14 at 15:22
  • @mplungjan any other suggestions? i can't get that info. – Aerodynamika Nov 04 '14 at 15:46
  • Why not? Are you getting them as separate attachments? – mplungjan Nov 04 '14 at 15:58
  • @mplungjan no, i'm not. – Aerodynamika Nov 04 '14 at 16:01
  • See if the strings are fixed length and have no space in them before you test for B64 compliance – mplungjan Nov 04 '14 at 16:01
  • Please post some example of data. – mplungjan Nov 04 '14 at 16:01
  • 0KHQvdCw0YfQsNC70LAg0YHQvtC30LTQsNC10YLRgdGPINGA0LXRiNC10YLQutCwINC/0YDQvtGB 0YLRgNCw0L3RgdGC0LLQsCDQstC+0L7QsdGA0LDQttC10L3QuNGPLiZuYnNwOzxkaXY+PGJyPjwv ZGl2PjxkaXY+0JfQsNGC0LXQvCDQvdCwINC90LXQtSDQvdCw0L3QuNC30YvQstCw0LXRgtGB0Y8g 0LLRi9C80YvRgdC10LsuPC9kaXY+PGRpdj48 – Aerodynamika Nov 04 '14 at 16:03
  • Where is the data coming from? You're exporting *from* google (e.g. via Google Takeout) or ? – mscdex Nov 04 '14 at 16:25
  • @mscdex from google mail via imap – Aerodynamika Nov 04 '14 at 16:26
  • If you are using IMAP, the server provides you with encoding information. Perhaps you could show the code you're using to transfer the messages? – mscdex Nov 04 '14 at 16:28
  • i'm using the code from here https://github.com/mscdex/node-imap but i keep only the message text, i.e. msg.on('body', function(stream, info) { stream.on('data', function(chunk) { count += chunk.length; buffer += chunk.toString('utf8');}} – Aerodynamika Nov 04 '14 at 16:29
  • If you set `struct: true` in your `fetch()` options, you will receive message structure information too, which includes (among other things) the `encoding` of the various parts of the message. The fetched `struct` is available on the object passed to a message's `attributes` event handler(s). – mscdex Nov 04 '14 at 16:34
  • @mscdex i've done that, but the problem is that the buffer variable is set in msg.on('body') and the attributes are checked after in msg.once('attributes') and they don't see each other's data... – Aerodynamika Nov 04 '14 at 16:56
  • For each message you should be able to have your `buffer` and an `attrs` variable that you set inside your `attributes` event handler. Then on the message's `end` event, both variables (your buffered data and the message attributes) should be set. – mscdex Nov 04 '14 at 17:02
  • @mscdex strange, because they are not being passed on to msg.once('end', function() {}); – Aerodynamika Nov 04 '14 at 17:29
  • They're not passed in the handler, your variables are accessible in that `end` handler because of how scoping works in javascript. – mscdex Nov 04 '14 at 17:30
  • @mscdex so how do i access them? i can't access them from inside that end handler? – Aerodynamika Nov 04 '14 at 17:36
  • Why do I have to update your question with your code - you have 800 rep, you must have been around here before... – mplungjan Nov 04 '14 at 17:43
  • @mplungjan i honestly don't know! – Aerodynamika Nov 04 '14 at 17:47
  • @mplungjan and before marking the post as duplicate maybe you should read it, as it is not at all a duplicate. – Aerodynamika Nov 04 '14 at 17:47
  • Well as initially stated it is an exact duplicate. With the missing information from your comments that you should have added to the question to get it reopened, it now could be "How do I get the meta data from struct messages" which you find under struct here https://github.com/mscdex/node-imap#user-content-data-types – mplungjan Nov 04 '14 at 17:57
  • @mplungjan ok it still doesn't work – Aerodynamika Nov 04 '14 at 18:14
  • Please update your question with what you tried and the result – mplungjan Nov 04 '14 at 18:15
  • @mplungjan i fixed it so will post the right answer, as it was a combination of all the advice above. – Aerodynamika Nov 04 '14 at 19:16
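
The scoping point raised in the comments can be sketched roughly as follows. This is only an illustration, not the library's own example: `collectMessage` is a hypothetical helper, and the `fakeMsg` / `fakeStream` emitters merely stand in for the per-message object and body stream that node-imap provides. The point is that `buffer` and `attrs` are declared once per message, filled by separate handlers, and both are visible inside the `end` handler through the closure.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: collects one message's body text and attributes,
// then hands both to `onDone` when the message emits 'end'.
function collectMessage(msg, onDone) {
  let buffer = '';  // filled by the 'body' stream
  let attrs = null; // filled by the 'attributes' event

  msg.on('body', function (stream, info) {
    stream.on('data', function (chunk) {
      buffer += chunk.toString('utf8');
    });
  });

  msg.once('attributes', function (a) {
    attrs = a;
  });

  msg.once('end', function () {
    // Nothing is passed to this handler; both variables are
    // reachable here because they live in the enclosing scope.
    onDone(buffer, attrs);
  });
}

// Simulated message (assumption: stands in for a fetched node-imap message).
const fakeMsg = new EventEmitter();
const fakeStream = new EventEmitter();
let result = null;

collectMessage(fakeMsg, function (body, attrs) {
  result = { body: body, encoding: attrs.struct[0].encoding };
});

fakeMsg.emit('body', fakeStream, {});
fakeStream.emit('data', Buffer.from('aGVsbG8='));
fakeMsg.emit('attributes', {
  struct: [{ encoding: 'BASE64', params: { charset: 'utf-8' } }]
});
fakeMsg.emit('end');
```

With a real connection, the same helper would presumably be called once per message inside the fetch's `message` handler, with `struct: true` set in the `fetch()` options as suggested above.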

1 Answer


The problem was that the data I received through the node-imap module for Node.js carried two different encoding-related fields in the message structure:

attrs.struct[0].params.charset

was set to utf-8 for all messages, while

attrs.struct[0].encoding

was set to BASE64 for the messages whose body was base64-encoded.

So I stored both values in variables visible to the msg.once('end') handler, checked whether attrs.struct[0].encoding was base64, and converted such strings from base64 to utf-8:

statement = Buffer.from(email, 'base64').toString('utf8');

The rest are decoded as quoted-printable with mimelib:

statement = mimelib.decodeQuotedPrintable(email);
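
Putting the two branches together, the decision could look something like the sketch below. It is not the exact code from my app: `decodeBody` is a hypothetical helper name, and `decodeQuotedPrintable` here is a small hand-rolled stand-in for the mimelib call so that the snippet runs without extra dependencies. It assumes the body text is UTF-8, as `params.charset` reported for these messages.

```javascript
// Minimal quoted-printable decoder (hand-rolled stand-in for mimelib).
function decodeQuotedPrintable(text) {
  // Soft line breaks ("=" at end of line) join the lines back together.
  const joined = text.replace(/=\r?\n/g, '');
  const bytes = [];
  for (let i = 0; i < joined.length; i++) {
    const hex = joined.slice(i + 1, i + 3);
    if (joined[i] === '=' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16)); // "=XX" encodes one raw byte
      i += 2;
    } else {
      // Quoted-printable input is expected to be ASCII already.
      bytes.push(joined.charCodeAt(i) & 0xff);
    }
  }
  return Buffer.from(bytes).toString('utf8');
}

// Hypothetical helper: choose a decoder from the part's transfer
// encoding, i.e. the value found in attrs.struct[0].encoding.
function decodeBody(raw, encoding) {
  switch (String(encoding || '').toLowerCase()) {
    case 'base64':
      return Buffer.from(raw, 'base64').toString('utf8');
    case 'quoted-printable':
      return decodeQuotedPrintable(raw);
    default:
      return raw; // 7bit, 8bit, binary: leave as received
  }
}
```

Usage would then be along the lines of `statement = decodeBody(email, attrs.struct[0].encoding);` inside the `end` handler. Comparing the encoding case-insensitively avoids depending on whether the server reports `BASE64` or `base64`.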
Aerodynamika