309

I am encoding a string that will be passed in a URL (via GET). But if I use escape, encodeURI or encodeURIComponent, & will be replaced with %26amp%3B, but I want it to be replaced with %26. What am I doing wrong?

Sebastian Simon
dododedodonl
  • 1
    Where does the string come from? Can you post the code you have so far? – Andy E Aug 22 '10 at 13:56
  • 1
    `&amp;` is the proper way to escape the ampersand in an HTML context... where is your source coming from, and what's the destination? It may be better to do this server-side, for example. – Nick Craver Aug 22 '10 at 13:59
  • I grab something from the HTML body (and that is HTML encoded (so, there is &amp; I realize now)) and I have to pass it in a URL... So, I need to decode the HTML (but how?) and then encode the string (with encodeURIComponent)... – dododedodonl Aug 22 '10 at 14:04
  • 2
    found it... I used in jquery .html(), not .text()... stupid (A) – dododedodonl Aug 22 '10 at 14:07
  • 1
    jQuery's *.html()* maps to the *innerHTML* property, so the issue is as I said in my answer :-) – Andy E Aug 22 '10 at 14:19

4 Answers

448

Without seeing your code, it's hard to answer other than a stab in the dark. I would guess that the string you're passing to encodeURIComponent(), which is the correct method to use, is coming from the result of accessing the innerHTML property. The solution is to get the innerText/textContent property value instead:

var str, 
    el = document.getElementById("myUrl");

if ("textContent" in el)
    str = encodeURIComponent(el.textContent);
else
    str = encodeURIComponent(el.innerText);

If that isn't the case, you can use the replace() method to replace the HTML entity:

encodeURIComponent(str.replace(/&amp;/g, "&"));
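
As a rough illustration of the replace() approach when the string contains more than one kind of entity, here is a minimal sketch. The helper name decodeBasicEntities and the sample string are hypothetical, and only a handful of common named entities are handled; reading textContent, as suggested above, remains the more robust route in a browser.

```javascript
// Hypothetical helper: decodes only a few common HTML entities.
// It is not a complete HTML decoder.
function decodeBasicEntities(html) {
  return html
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    // "&amp;" goes last so that "&amp;lt;" ends up as "&lt;" rather than "<".
    .replace(/&amp;/g, "&");
}

// Sample value of the kind innerHTML would return for the text "Tom & Jerry".
var fromInnerHTML = "Tom &amp; Jerry";
var decoded = decodeBasicEntities(fromInnerHTML);
var encoded = encodeURIComponent(decoded);

console.log(decoded); // Tom & Jerry
console.log(encoded); // Tom%20%26%20Jerry
```

Decoding "&amp;" last is the one design choice that matters here: doing it first would decode already-escaped text twice.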
BuZZ-dEE
Andy E
103

If you did literally this:

encodeURIComponent('&')

Then the result is %26; you can test it here. Make sure the string you are encoding is just & and not &amp; to begin with... otherwise it is encoding correctly, which is likely the case. If you need a different result for some reason, you can do a .replace(/&amp;/g,'&') before the encoding.
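
To make the difference concrete, this small check (variable names are only illustrative) encodes a bare ampersand, the HTML entity, and the entity after the replace step:

```javascript
// A bare ampersand encodes to %26.
var plain = encodeURIComponent("&");

// The five-character HTML entity is encoded character by character.
var entity = encodeURIComponent("&amp;");

// Replacing the entity first gives the wanted result.
var fixed = encodeURIComponent("&amp;".replace(/&amp;/g, "&"));

console.log(plain);  // %26
console.log(entity); // %26amp%3B
console.log(fixed);  // %26
```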

Nick Craver
8

There are HTML and URI encodings. &amp; is & encoded in HTML, while %26 is & in URI encoding.

So before URI encoding your string you might want to HTML decode and then URI encode it :)

var div = document.createElement('div');
div.innerHTML = '&amp;AndOtherHTMLEncodedStuff';
    
var htmlDecoded = div.firstChild.nodeValue;
console.log('htmlDecoded: '+htmlDecoded);
    
var urlEncoded = encodeURIComponent(htmlDecoded);
console.log('urlEncoded: '+urlEncoded);

result %26AndOtherHTMLEncodedStuff

Hope this saves you some time

Matas Vaitkevicius
-1

Just to be clear, you should never be using encodeURI() and encodeURIComponent(). If you disagree, just look at their results...

console.log(encodeURIComponent('@#$%^&*'));

Input: @#$%^&*.

Output: %40%23%24%25%5E%26*.

That's not right, is it? * did not get converted! I hope you're not using this as a server-side cleansing function, because * will not be treated as input but as commands, i.e., imagine deleting a user's alleged file with rm *. Well, I hope you're not using encodeURI() or encodeURIComponent()!

TLDR: You actually want fixedEncodeURIComponent() and fixedEncodeURI().

MDN encodeURI() Documentation...

function fixedEncodeURI(str) {
   return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']');
}

MDN encodeURIComponent() Documentation...

function fixedEncodeURIComponent(str) {
 return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
   return '%' + c.charCodeAt(0).toString(16);
 });
}

With these functions, use fixedEncodeURI() to encode a complete URL, leaving its delimiters intact, whereas fixedEncodeURIComponent() encodes a single URL piece, including any delimiter characters inside it; or, more simply, fixedEncodeURI() will not encode +@?=:#;,$& (as & and + are common URL operators), but fixedEncodeURIComponent() will.
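
For comparison, here is a small sketch that runs the built-in functions and the two wrappers above on the same input. The sample strings are made up for illustration, and the wrappers are copied as posted in this answer, so the extra escapes come out in lowercase hex.

```javascript
// Wrappers as posted in this answer.
function fixedEncodeURI(str) {
  return encodeURI(str).replace(/%5B/g, "[").replace(/%5D/g, "]");
}

function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    return "%" + c.charCodeAt(0).toString(16);
  });
}

// Illustrative input containing delimiters, a space and an asterisk.
var input = "a&b=c d*";

console.log(encodeURI(input));               // a&b=c%20d*
console.log(encodeURIComponent(input));      // a%26b%3Dc%20d*
console.log(fixedEncodeURIComponent(input)); // a%26b%3Dc%20d%2a

// encodeURI escapes square brackets; the wrapper restores them.
console.log(fixedEncodeURI("http://[::1]/a b")); // http://[::1]/a%20b
```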

Mathieu
HoldOffHunger
  • 2
    Where did you get the idea that `encodeURIComponent` was supposed to “cleanse” anything? It encodes text to preserve its meaning in a URI. It’s not for passing user input directly to `system("rm")`. What? – Ry- Feb 10 '22 at 18:13
  • You can actually use the results of these functions with anything. That's just an example. You can run this server-side with NodeJS. And then run server-side commands with URL inputs. It's designed this way, no? – HoldOffHunger Feb 10 '22 at 18:15
  • If you wanted to assemble user input into a shell command and not have globs expanded, the function to use would be an `escapeshellarg`-style one. Also, you should literally never assemble user input into a shell command. – Ry- Feb 10 '22 at 18:21
  • That was just an example. You may need to run user-input as args somewhere. This is the proper way to do the escaping with JS. – HoldOffHunger Feb 10 '22 at 18:21
  • No, it’s really not. This is a dangerous misunderstanding of the purpose of encoding. – Ry- Feb 10 '22 at 18:23
  • How about I completely remove anything shell-related from my example? And instead do a simple for loop over the chars available in JS, showing that I can't use encodeURI() to access the `%2a.html` file? – HoldOffHunger Feb 10 '22 at 18:29