
I'm working on a script to create metrics for online author identification. One of the things I came across in the literature is counting the frequency of each letter (how many a's, how many b's, etc.), independent of upper or lower case. Since I don't want to write a separate statement for each letter, I'm trying to do it in a loop, but I can't figure it out. The best I have been able to come up with is converting the ASCII letter code into hex, and then...hopefully a miracle happens.

So far, I've got

element = id.toLowerCase();
var hex = 0;
for (k=97; k<122; k++){
    hex = k.toString(16); //gets me to hex
    letter = element.replace(/[^\hex]/g, "")//remove everything but the current letter I'm looking for
    return letter.length // the length of the resulting string is how many times the letter came up
}   

but of course, when I do that, it interprets hex as the letters h e x, not the hex code for the letter I want.

j08691
bigbenbt
  • This is not the best approach; look at @ElliotBonneville's answer. But to answer the specific question, if you want to build a regex from variable components, use `new RegExp(string)` instead of a regex literal: `var hexRegex = new RegExp("[^\\x" + hex + "]", "g");` – Mark Reed Apr 23 '12 at 19:03
  • @MarkReed: You might need to [escape](http://stackoverflow.com/questions/3561493/is-there-a-regexp-escape-function-in-javascript/3561711#3561711) your string if you want to do that with general characters. – hugomg Apr 23 '12 at 19:15
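The `new RegExp(string)` idea from the comments can be sketched roughly as follows. This is only an illustration, not the asker's final code: `countWithRegExp` is a hypothetical name, the letter is built with `String.fromCharCode` rather than a hex escape, and the loop runs to 122 inclusive so that `z` is counted. Plain letters need no escaping inside a character class, so no escape helper is used here.

```javascript
// Hypothetical helper: counts a-z by stripping every other character
// with a regex built at runtime for each letter.
function countWithRegExp(text) {
    var lower = text.toLowerCase();
    var counts = {};
    for (var k = 97; k <= 122; k++) {          // char codes for 'a' through 'z'
        var letter = String.fromCharCode(k);
        // Build the pattern from a string, since a regex literal
        // cannot contain a variable.
        var notLetter = new RegExp("[^" + letter + "]", "g");
        counts[letter] = lower.replace(notLetter, "").length;
    }
    return counts;
}

var result = countWithRegExp("Banana");
console.log(result.a); // 3
console.log(result.n); // 2
```

Note that this scans the whole string once per letter, so it does 26 passes where a single loop over the characters does one.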

1 Answer


Not sure why you'd want to convert to hex, but you could loop through the string's characters and keep track of how many times each one has appeared with an object used as a hash:

var element = id.toLowerCase();
var keys = {};

for(var i = 0, len = element.length; i<len; i++) {
    if(keys[element.charAt(i)]) keys[element.charAt(i)]++;
    else keys[element.charAt(i)] = 1;
}

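You could use an array indexed by character code to do the same thing, but an object used as a hash is more direct here, since the keys are the characters themselves.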
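The loop above counts every character, including spaces and punctuation. If only the letters a to z matter, as in the question, one possible variation is to filter inside the loop. This is a minimal sketch; `countLetters` is a hypothetical name, and it assumes plain ASCII letters are all that need counting.

```javascript
// Hypothetical helper: same hash-based counting, restricted to a-z.
function countLetters(text) {
    var counts = {};
    var lower = text.toLowerCase();
    for (var i = 0; i < lower.length; i++) {
        var ch = lower.charAt(i);
        // Skip anything that is not a lowercase ASCII letter.
        if (ch >= "a" && ch <= "z") {
            counts[ch] = (counts[ch] || 0) + 1;
        }
    }
    return counts;
}

var sample = countLetters("Hello, World!");
console.log(sample.l); // 3
console.log(sample.o); // 2
```

Letters that never appear are simply absent from the object, so read them with something like `sample.z || 0`.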

Elliot Bonneville
  • You can also use `element.charCodeAt()` if you feel like using numeric codes. – hugomg Apr 23 '12 at 19:01
  • There's really no reason to as you'd just have to convert back to see which character is which, but yes, that's easily doable. – Elliot Bonneville Apr 23 '12 at 19:06
  • Wow. That's much slicker than what I was trying. I'm coming at this from a MATLAB and Mathematica background, so going this way never occurred to me. Thanks. – bigbenbt Apr 24 '12 at 18:30
  • @bigbenbt: If this answer is useful to you, please mark it as accepted with the checkmark to the right of it. =) – Elliot Bonneville Apr 25 '12 at 01:55
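For comparison, the `charCodeAt` route mentioned in the comments might look something like the sketch below. `countByCharCode` is a hypothetical name, and the code assumes ASCII input where `a` has code 97. The result is a 26-element array in which index 0 holds the count for `a`, index 1 for `b`, and so on.

```javascript
// Hypothetical helper: counts a-z into a fixed 26-slot array
// using numeric character codes instead of object keys.
function countByCharCode(text) {
    var counts = [];
    for (var n = 0; n < 26; n++) counts.push(0);

    var lower = text.toLowerCase();
    for (var i = 0; i < lower.length; i++) {
        var index = lower.charCodeAt(i) - 97;   // 'a' maps to 0, 'z' to 25
        if (index >= 0 && index < 26) counts[index]++;
    }
    return counts;
}

var freq = countByCharCode("Abba");
console.log(freq[0]); // 2
console.log(freq[1]); // 2
```

As noted in the reply above, the index has to be converted back with `String.fromCharCode(index + 97)` to see which letter a count belongs to.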