
I'm working on a JavaScript program that's parsing data from an API. Part of the data is coming back in the format '\t' etc and I need to parse it as tab (in this example) not '\t'.

In PHP, there's a wonderful function called stripcslashes() which does this for me, but I can't find a JavaScript equivalent.

Does anyone know of such a function in JavaScript? Or a way of parsing '\t' as a tab?

Lee Taylor
MrSimonEmms
  • `replace(/[\s\\]+/g,'')` will do the trick? – alexbusu Feb 14 '13 at 15:53
  • It doesn't. That just strips the slashes out - I want to parse the \t as a TAB character – MrSimonEmms Feb 14 '13 at 15:57
  • how about `replace(/([\\]{2})/g,'\')` ? :) – alexbusu Feb 14 '13 at 16:20
  • @AlexanderV.B.: that's not going to work, because the string has already been parsed, which means your replace won't find a match. The string expressed as `\\t` is stored internally as two characters: "\" and "t". – Martijn Feb 14 '13 at 16:43
  • I understand, but with the example above I tried to replace "\\" with "\", so "\\t" will become "\t"... – alexbusu Feb 14 '13 at 16:47
  • @AlexanderV.B.: apparently you _don’t_ understand: the conversion from `\t` to TAB only occurs when parsing a string literal. You want to remove backslashes **after** the string has been parsed, when it won't be parsed again... – Martijn Feb 14 '13 at 16:54

2 Answers

JavaScript's string literals use those same backslash escapes, so one option is to have the JavaScript parser itself render the string:

var escapedString = '\\t' // actually contains the characters \ and t
,   renderedString = (new Function('return "' + escapedString + '"'))()
;
console.log(renderedString); // "   " <= this is the tab character

Note that this leaves your code wide open for injection attacks, and is very unsafe, especially if the escapedString is coming from an external source!

It’s (almost) as bad as using eval!
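If the escapes in your data happen to be the ones JSON allows (\t, \n, \r, \b, \f, \\, \/, \" and \uXXXX), a less risky variant of the same idea is to let JSON.parse do the parsing, since it never executes code. This is only a sketch: the function name unescapeViaJson is made up for illustration, and returning the input unchanged on failure is my own choice, not something your API requires.

```javascript
// Hypothetical helper: renders JSON-style escapes without evaluating code.
function unescapeViaJson(input) {
    try {
        // Wrap the raw text in quotes so it parses as a JSON string.
        return JSON.parse('"' + input + '"');
    } catch (e) {
        // Not valid JSON string content (e.g. a bare quote or \x41):
        // assumption: leave the input untouched rather than fail.
        return input;
    }
}

console.log(unescapeViaJson('a\\tb') === 'a\tb'); // true
```

Anything JSON does not recognise, such as \x41 or \0, makes the parse fail, so this only helps if the API sticks to JSON escapes.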

Update: the safe way

A better way would probably be using a regular expression:

var escapedString = '\\t' // actually contains the characters \ and t
,   renderedString = escapedString.replace(/\\(.)/g, function (match, char) {
        return {'t': '\t', 'r': '\r', 'n': '\n', '0': '\0'}[char] || match;
    })
;
console.log(renderedString); // "   " <= this is the tab character

Here you’d be using the replace method to track down all the backslashes (followed by any character) in the given string, and call the anonymous function every time a match is found. The function's returned value will determine what that match will be replaced by.

The anonymous function contains a JS object (in this case used as an associative array), which contains all the recognized escape characters. The character following the backslash (here called char), will be looked up in that associative array, and JS will return the corresponding value.

The final part of the line, || match, ensures that if the char is not part of the associative array, the unchanged match is used (which includes the backslash), leaving that part of the original string unchanged.

Of course, this does mean that you have to specify all the possible escapes in advance.
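For what it's worth, here is a sketch of a fuller lookup, roughly in the spirit of stripcslashes(). The name unescapeCStyle is hypothetical, and two behaviours are assumptions on my part rather than anything confirmed for your API: hex (\xHH) and octal (\NNN) escapes are decoded, and an unrecognised escape simply loses its backslash (unlike the snippet above, which keeps it).

```javascript
// Hypothetical helper: decodes C-style escapes from an already-parsed string.
function unescapeCStyle(input) {
    // Single-letter escapes; extend this table to match your API.
    var simple = { a: '\x07', b: '\b', f: '\f', n: '\n', r: '\r', t: '\t', v: '\v' };

    // Backslash followed by a hex escape, an octal escape, or any one character.
    return input.replace(/\\(x[0-9A-Fa-f]{1,2}|[0-7]{1,3}|[\s\S])/g, function (match, seq) {
        if (seq.charAt(0) === 'x' && seq.length > 1) {
            return String.fromCharCode(parseInt(seq.slice(1), 16)); // \xHH
        }
        if (/^[0-7]+$/.test(seq)) {
            return String.fromCharCode(parseInt(seq, 8)); // \NNN, including \0
        }
        if (Object.prototype.hasOwnProperty.call(simple, seq)) {
            return simple[seq];
        }
        return seq; // assumption: unknown escape, drop only the backslash
    });
}

console.log(unescapeCStyle('a\\tb') === 'a\tb'); // true
```

Because the regex consumes a doubled backslash as one escape, '\\\\t' (backslash, backslash, t) comes out as a literal backslash followed by t, not as a tab.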

Second update: the way to go

It just occurred to me that the first method is unsafe (there might be malicious input), and the second method is uncertain (you don't really know which escapes you need to provide for); perhaps both methods could be combined!

var escapedString = '\\t' // actually contains the characters \ and t
,   renderedString = escapedString.replace(/\\./g, function (match) {
        return (new Function('return "' + match + '"'))() || match;
    })
;
console.log(renderedString); // "   " <= this is the tab character

Of course, this solution is slow. Parsing and building a new function is costly, and it has to be done for every backslash escape found in the escaped string. If performance becomes an issue, you could add some kind of caching mechanism, which remembers each function once it's been created.
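A caching mechanism like that could look something like the sketch below. The names escapeCache, renderEscape and unescapeCached are made up for illustration, and it caches the rendered result rather than the function itself, which is a small variation on what I described. It also catches the SyntaxError that the parser throws for escapes that are incomplete on their own, such as \x or \u, and leaves those untouched.

```javascript
// Hypothetical cache: maps each two-character escape to its rendered value.
var escapeCache = {};

function renderEscape(match) {
    if (!Object.prototype.hasOwnProperty.call(escapeCache, match)) {
        var rendered;
        try {
            // match is always a backslash plus exactly one character.
            rendered = (new Function('return "' + match + '"'))();
        } catch (e) {
            // e.g. "\x" or "\u" alone is not a valid literal: keep as is.
            rendered = match;
        }
        escapeCache[match] = rendered;
    }
    return escapeCache[match];
}

function unescapeCached(input) {
    return input.replace(/\\./g, renderEscape);
}

console.log(unescapeCached('a\\tb\\nc') === 'a\tb\nc'); // true
```

The function is now built at most once per distinct escape, so a long string with thousands of \t sequences only pays the parsing cost the first time.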

Then again, you could also look up the API you need to work with, try and find its documentation or contact its makers, and get a definitive list of escapes it can produce. :-0

Martijn
  • I was trying to avoid using that regex solution, as I have to provide the list. Hey ho. Thanks – MrSimonEmms Feb 14 '13 at 16:37
  • @RiggerTheGeek: also, do bear in mind that I don’t know if Javascript escapes the *exact same set* as C does (or whatever your API uses) – Martijn Feb 14 '13 at 16:49
  • I'll test it on the API in the morning (written in Java, outputting as JSON for the record) and accept it once I've got it working – MrSimonEmms Feb 14 '13 at 21:55

If you are using jQuery, the html() and text() functions should do what you want.

var parsed = $("<div/>").html('\thello').text();

http://api.jquery.com/html/
http://api.jquery.com/text/

olan
  • Not what I'm looking for. I'm not using jQuery (not my choice), but we already have the '\t' parsed as a tab character. I'm getting the \t from the API, which is coming through as the equivalent of `var delimiter = '\\t';` – MrSimonEmms Feb 14 '13 at 16:05