
Using HTML5 (or less preferably JavaScript) is it possible to limit the maximum length of an input to a particular number of bytes?

I realise that I can limit to a number of characters with:

<input type="text" maxlength="4" />

But that's not good enough because I can input up to four two-byte chars in it.

Obviously I am validating this server-side, but I would like this on the browser-side too.

Edit: Just to be clear, I do wish to be able to support UTF-8. Sorry @elclanrs.

meshy
  • clientside+serverside ajax? – Alessandro Gabrielli Jun 05 '13 at 21:02
  • Or maybe just limit the characters on `keydown` with something like `/[\w\s]+/` which will be all 1byte. – elclanrs Jun 05 '13 at 21:05
  • The user can input up to four characters, each of which can be up to four bytes long when UTF-8 is used. I don’t see the point of this question. Why would you limit input to a certain amount of bytes? The data will be processed as *characters* anyway. – Jukka K. Korpela Jun 06 '13 at 03:58
  • @JukkaK.Korpela Thank you for reminding me that non-ASCII chars can be more than 4 bytes. The data *will not be processed as characters* on the server. It is an unavoidable limitation that I must use a particular number of bytes. – meshy Jun 06 '13 at 12:56
  • @AlessandroGabrielli I'm not really enamoured with that solution, as I wish to keep traffic to a minimum, but it may come to that. – meshy Jun 06 '13 at 12:57

4 Answers


This script has a couple of minor UX glitches that can be cleaned up, but it accomplished the basic task outlined when I tested it in Chrome:

<input id=myinp />


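<script> // bind handlers to input:
   // look the element up explicitly instead of relying on the implicit id global:
   var myinp=document.getElementById('myinp');
   myinp.onkeypress=myinp.onblur=myinp.onpaste= function vld(e){
     var inp=e.target;
     // count bytes used in text: each %XX escape is exactly one UTF-8 byte,
     // so the quantifier must be {2}; a wider one such as {2,6} also swallows
     // digits that follow an escape and under-counts:
     if( encodeURIComponent(inp.value).replace(/%[A-F\d]{2}/g, 'U').length > 4){
        // if too many bytes, try to reject:
        e.preventDefault();
        inp.value=inp.val||inp.value;
        return false;
     }
     // backup last known good value:
    inp.val=inp.value;
   }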

</script>
dandavis
  • This looks great, thank you! According to this: http://stackoverflow.com/questions/5290182/how-many-bytes-takes-one-unicode-character UTF-8 may have up to 6 bytes for any character. A colleague explained to me that this will work only with one and two-byte chars. Do you see a way of making it work with more? – meshy Jun 06 '13 at 13:10
  • 2 bytes fits all major web-used languages AFAIK (i'd love a contra if somone has one). i modified my answer to support 6-char escape codes, but i think 4 chars should be enough for anyone, but i'm no linguist.... – dandavis Jun 06 '13 at 15:47
  • This is prone to under-count. For the text: 'I am a 19 char text', this method counts 17 bytes, which is incorrect. You could use `encodeURIComponent(inp.value.replace(/\d/g,'X')).replace(/%[A-F\d]{2,6}/g, 'U')` instead. – jlhonora Nov 03 '16 at 23:08
  • To count bytes, this may be useful : `new Blob([str]).size` .. found [here](https://stackoverflow.com/a/52254083/2628312) – Jirka Justra May 21 '22 at 21:18

From my own experimenting, I found that this works really well:

function limit_input(n) { // n = number of bytes
  return function(e) {
    const is_clipboard = e instanceof ClipboardEvent;
    if(is_clipboard && e.type != "paste") {
      return;
    }
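    const t = e.target;
    // text about to be inserted: the pasted text or the typed key
    const added = is_clipboard ? e.clipboardData.getData("text") : e.key;
    // value as it would look afterwards, with any selected text replaced;
    // measuring this avoids subtracting a character count from a byte count
    const new_val = t.value.slice(0, t.selectionStart) + added +
                    t.value.slice(t.selectionEnd);
    if(new TextEncoder().encode(new_val).byteLength > n) {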
      if(e.target.value == "" && is_clipboard) {
        const old = e.target.placeholder;
        e.target.placeholder = "Text too long to paste!";
        setTimeout(function() {
          e.target.placeholder = old;
        }, 1000);
      }
      e.preventDefault();
    }
  };
}

let el = document.getElementById("your_input");

el.onkeypress = el.onpaste = limit_input(4);

I started out with dandavis' answer and kept on improving it to adapt to all situations. I still don't think this is perfect, and it's still using the deprecated onkeypress handler, but nothing else worked better than this.

You can delete the part of the code that changes placeholder to say the text is too long to paste (delete the whole if, keep only e.preventDefault() in). It's just something I added myself to notify the user why the input is still empty after they try pasting something in. That way they won't blame me for writing faulty code and I won't have to answer a horde of complaints.
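Since the code above still depends on the deprecated keypress handler, here is a minimal sketch of an alternative that reacts to the `input` event and trims the value after the fact. The helper name `truncateToBytes` is made up for illustration, and the element id `your_input` is assumed from the snippet above. Note the different trade-off: this cuts characters off the end of the value instead of rejecting the keystroke, so it may not suit every form.

```javascript
// Hypothetical helper: returns the longest prefix of `str` whose
// UTF-8 encoding fits within `maxBytes`.
function truncateToBytes(str, maxBytes) {
  const encoder = new TextEncoder();
  let used = 0;
  let out = "";
  // for...of walks the string by code point, so a surrogate pair
  // (for example an emoji) is never split in half.
  for (const ch of str) {
    const size = encoder.encode(ch).byteLength;
    if (used + size > maxBytes) break;
    used += size;
    out += ch;
  }
  return out;
}

// Browser-only wiring; assumes an element with id "your_input" exists.
if (typeof document !== "undefined") {
  const el = document.getElementById("your_input");
  if (el) {
    el.addEventListener("input", function () {
      const trimmed = truncateToBytes(el.value, 4);
      if (trimmed !== el.value) el.value = trimmed;
    });
  }
}
```

Because the `input` event also fires for paste, drag-and-drop and IME input, one listener covers those cases, at the cost of the caret jumping to the end whenever the value gets trimmed.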


Create a function that returns the byte length of a string, then validate the input according to your requirements.

Here's an example that returns the byte length of a string:

function getStringByteLength(str) {
  str = typeof(str) === 'string' ? str : '';
  const byteSize = new Blob([str]).size;
  return byteSize;
}

Here's a fiddle that has a working example of calling the function above to validate two different text inputs.
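As a rough sketch of the "validate the input" half, the snippet below reports an over-long value through the browser's built-in constraint validation. The names `validateByteLimit` and `your_input` and the message wording are assumptions for illustration, and `TextEncoder` is swapped in for `Blob` only so the counting part also runs outside a browser; both should report the same UTF-8 byte length.

```javascript
// Byte length of a string when encoded as UTF-8.
function getByteLength(str) {
  return new TextEncoder().encode(str).byteLength;
}

// Hypothetical validator: returns "" when the value fits,
// otherwise a human-readable error message.
function validateByteLimit(value, maxBytes) {
  const bytes = getByteLength(value);
  return bytes <= maxBytes
    ? ""
    : "Too long: " + bytes + " bytes used, " + maxBytes + " allowed.";
}

// Browser-only wiring; assumes an element with id "your_input" exists.
if (typeof document !== "undefined") {
  const input = document.getElementById("your_input");
  if (input) {
    input.addEventListener("input", function () {
      // An empty message marks the field valid again.
      input.setCustomValidity(validateByteLimit(input.value, 4));
    });
  }
}
```

With this wiring the user can still type past the limit, but the field is flagged as invalid and the form will not submit until the value fits.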

Jed

If estimating isn't good enough, I'd filter out all the non-single-byte chars and count them.
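One way to read this suggestion, sketched under the assumption that the server expects UTF-8: walk the string by code point and add 1, 2, 3 or 4 bytes depending on the range the code point falls in. The function name `utf8ByteLength` is made up for illustration.

```javascript
// Hypothetical helper: UTF-8 byte length computed from code point ranges.
function utf8ByteLength(str) {
  let bytes = 0;
  // for...of iterates by code point, so astral characters count once.
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) bytes += 1;         // ASCII
    else if (cp < 0x800) bytes += 2;   // e.g. Latin letters with accents
    else if (cp < 0x10000) bytes += 3; // rest of the Basic Multilingual Plane
    else bytes += 4;                   // astral planes, e.g. most emoji
  }
  return bytes;
}
```

A check such as `utf8ByteLength(input.value) > 4` could then replace the character-based `maxlength` test.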

Jonathan