9

I have various string comparison and diff algorithms, but at some point, before I apply them I want to find if two strings have at least one character in common. This way I will be able to skip more complex functions. So I need a very fast function in JavaScript that will find if string A and string B has at least one common character.

First I was thinking to create a character map for a string A, and then check every character in a string B against that map until something is found. But then I realized that if both strings are huge and they have a common first character, this will be inefficient to create a full map for string A.

UPDATE: someone answered with using indexOf(), this confuses me. Maybe the phrase "have character in common" means the same as "string is a substring of another"? Let me give an example of what I want:

For example JavaScript and Stop and stay have a character S in common. Other example would be please look right and break the ice they have a character k in common.

exebook
  • 32,014
  • 33
  • 141
  • 226

9 Answers9

5

Simplest method would be to loop through each letter in one string and see if the other contains any of the single letters.

You can't really get more efficient than to go over each letter unless the strings are sorted in alphabetical order to start with.

function anythingInCommon(a, b){
    if( b.length < a.length )
        return anythingInCommon(b, a)

    for( var i = 0, len = a.length; i < len; i++ ) 
        if(b.indexOf(a[i]) != -1)
            return true;
  
    return false
}

console.log(
  anythingInCommon("aaaaaaaaaaaaaabbbbbbccccc", "xc"),
  anythingInCommon("aaaaaaaaaaaaaabbbbbbccccc", "x")
)
anythingInCommon('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa','cddabddde')

Fiddler: http://jsfiddle.net/LRxGK/4/

vsync
  • 118,978
  • 58
  • 307
  • 400
JensB
  • 6,663
  • 2
  • 55
  • 94
  • 2
    +1 for keeping it simple. You could also do a check on `stra` and `strb` and iterate over whichever is shortest – maček Dec 19 '13 at 07:29
  • 1
    It would also be more effective to loop all the letters in the alphabet and check both strings for every letter if the strings are long enough – krs Dec 19 '13 at 07:35
  • @maček Updated with your idea of looping over the shortest string. – JensB Dec 19 '13 at 07:36
  • @krs hmm very true... all depends on how long the strings are and if their charset is always known (ie always english or some other language..) – JensB Dec 19 '13 at 07:39
  • 2
    Rather than using the `temp` variable, you could rewrite the `if` as `if (strb.length < stra.length) return anythingInCommon(strb, stra)`. This just recalls the function with the args in the opposite order. This way, if either string is a large one, we wouldn't waste memory by duplicating strings with `temp`. – maček Dec 19 '13 at 15:01
3

You can create a cache of checked characters if you're worried about the (likely minimal) overhead of searching the same character twice. Something like this function may suit the bill:

var common_char = function(str1, str2) {
    var map = {};
    return Array.prototype.some.call(str1, function(c) {
        if( !map[c] && str2.indexOf(c) >= 0 ) {//character c not checked and in str2
            return true;
        }
        map[c] = true;
    });
}
megawac
  • 10,953
  • 5
  • 40
  • 61
2

one way ,

actually extending your map approach

create two bit Arrays for your character set, say for a-z you create a bit array of 26 , and whatever character you encounter you set the flag to 1. so you read string A one character at a time and lookup in string B's flagArray to see if the corresponding bit is on,(else set this bit as 'on' in string A's flagArray), in the same iteration, do it for string B's current character and A's flagarray, if none of them matches, then set relevant bits on for the current character in both the bitarrays

gaurav5430
  • 12,934
  • 6
  • 54
  • 111
1

Strings works like arrays, what we want is to mimick a set.intersect in javascript to get common letters of both containers, as JS doesnt have native sets you can check this answer:

Simplest code for array intersection in javascript

Adding on the other answers, for a fast way to do it quick and dirty if the strings are already sorted:

  1. check the first letter in each sorted string, if they are equal. exit truthy

  2. pop the lowest of the two first characters

  3. if any of the strings are empty, return false

  4. goto 1.

Community
  • 1
  • 1
krs
  • 4,096
  • 19
  • 22
  • Not sure we can assume the strings are sorted, but if they are this would probably be best. If they arent sorted we would have to deal with that first, and that could cost quite a bit too. – JensB Dec 19 '13 at 07:42
  • 1
    yeah, as others already gave the answers for unsorted strings i added one for sorted – krs Dec 19 '13 at 07:43
1

Build a suffix tree or use dynamic programming may solve this
See the Longest common substring problem, it may be helpfull.

Arthur
  • 54
  • 2
1

Everybody is gonna yell at me for using a regexp but it makes life SOOOOOO simple

var str1 = 'Stop and stay'
,   str2 = 'JavaScript'.replace(/(.)(?=.*\1)|\s/g, "")
// replace(): removes space & repeated characters

,   reg = new RegExp("["+ str2 +"]", 'g')
// global flag to return a list of matched characters
;

console.log(str1.match(reg));
// array('S', 't', 'a', 'p', 't', 'a')

JSFIDDLE

Jay Harris
  • 4,201
  • 17
  • 21
1

This answer draws on gaurav5430 and megawac's answers.

This algorithm constructs a set that contains every character that has appeared in either of the strings.

  • It runs in O(n)
  • It does not require the strings to be sorted
  • It examines both strings at the same time
  • It will terminate as soon as it reaches a point where any character exists in both strings, even if that character is the first character.
  • Here is a JSFiddle example.
function common_char(left, right) {
  var left_map = {};
  var right_map = {};
  var index = 0;

  while (index < left.length || index < right.length) {

    // Check left array
    if (index < left.length) {
      var c = left[index];

      left_map[c] = true;

      // Check if it exists in the other map
      if (right_map[c]) {
        return true;
      }
    }    

    // Check right array
    if (index < right.length) {
      var c = right[index];

      right_map[c] = true;

      // Check if it exists in the other map
      if (left_map[c]) {
        return true;
      }
    }

   index++;
  }

  return false;
}
Gustav Bertram
  • 14,591
  • 3
  • 40
  • 65
0

If you're searching for elements in containers where the elements can contain arbitrary values, then as the containers are not sorted, you're unlikely to get this done faster than doing a linear search O(N^2), e.g. by using two nested for loops to iterate over the elements in the strings. As you've undoubtedly noticed, this will be rather slow in JavaScript.

However, the characters are typically part of a limited set (e.g. 128 chars for ASCII). So it may be worth while building up two maps, one for each of the strings, and to insert characters from the strings into the respective maps as you're doing the search. That way, you can check if a character has been previously seen in either strings map before comparing the next character for a match. Depending on your typical string length, this may be faster then doing a simple linear search.

Pre-sorting your strings fully before searching (e.g. by using maps) may also still be an option if you will be comparing the same string against other strings multiple times, because then the initial cost of sorting will be amortised over subsequent searching.

Rob
  • 1,350
  • 12
  • 21
0

Came up with my own solution, the idea is to compare against the largest map at the moment. It is similar to Gustav Bertram approach some what, but adaptively selects which character to compare next, from string A or from string B.

function haveCommonChar(a, b) {
    var mapa = [], mapb = [], mappeda = 0, mappedb = 0, x = 0, y = 0
    while(x < a.length && y < b.length) { // smart part, while both strings still have more chars
        if (mappeda >= mappedb) { //compare against the largest map
            // one way to speed up this part even further, will be to work in chunks of fixed size larger than 1(now it's 1)
            var c = b.charCodeAt(y++)
            if (mapa[c] === 1) return true
            if (mapb[c] !== 1) mapb[c] = 1, mappedb++
        } else {
            var c = a.charCodeAt(x++)
            if (mapb[c] === 1) return true
            if (mapa[c] !== 1) mapa[c] = 1, mappeda++
        }
    }
    //quickly finish the remaining chars
    while(x < a.length) if (mapb[a.charCodeAt(x++)] !== undefined) return true
    while(y < b.length) if (mapa[b.charCodeAt(y++)] !== undefined) return true
    return false
}
exebook
  • 32,014
  • 33
  • 141
  • 226