4

If there are two sets -

set1 - [tag, boLD, Link]
set2 - [BOLd, TAG, Badge, foo]

What could be the efficient algorithm for making pairs of elements like -

pairs = [tag, TAG], [boLD, BOLd], [Link, null], [null, Badge], [null, foo]

Notice the pairing is on the basis of case-insensitive names.

I want to avoid O(N^2), that looks up all elements in set1 iteratively, and look in the element in set2.

EDIT: I think if we can use Ternary Search Tries, to make symbol table implementation, where keys are elements from set1, and values from set2. set2 remaining elements could be dealt at last.

unknown_boundaries
  • 1,482
  • 3
  • 25
  • 47

3 Answers3

3

You can do it in O(n), if you use some data structure that supports O(1) get operations - for example HashMap.

    HashMap<String, String> set1 = new HashMap<>();
    HashMap<String, String> set2 = new HashMap<>();
    class Pair{
        String str1;
        String str2;

        Pair(String s1, String s2){
            str1 = s1;
            str2 = s2;
        }
    }
    Set <Pair> pairs = new HashSet<>();
    set1.put("tag", "tag");
    set1.put("bold", "boLD");
    set1.put("link", "Link");
    set2.put("tag", "TAG");
    set2.put("bold", "BOLd");
    set2.put("badge", "Badge");
    set2.put("foo", "foo");

    for (String s : set1.keySet()){
        if (set2.containsKey(s))
            pairs.add(new Pair(set1.get(s), set2.get(s)));
        else
            pairs.add(new Pair(set1.get(s), null));
    }

    for (String s : set2.keySet()){
        if (!set1.containsKey(s))
            pairs.add(new Pair(null, set2.get(s)));
    }

    for(Pair p : pairs)
        System.out.println(p.str1 + " " + p.str2);
Grisha Weintraub
  • 7,803
  • 1
  • 25
  • 45
1

You could just create a HashMap and iterate over both arrays. For the key, you may use the lowercase representation of each string. This should give you linear runtime complexity and is very easy to implement.

timcbaoth
  • 669
  • 1
  • 7
  • 12
0

A simple solution (given in Python, too lazy to write Java now) which is linear in the lengths of the lists would be the following:

map1 = {}
for i,e in enumerate(set1):
    s = e.lower()
    map1[s] = i

map2 = {}
for i,e in enumerate(set2):
    s = e.lower()
    map2[s] = i

pairs = []
for i,e in enumerate(set1):
    s = e.lower()
    if s in map2:
        elem = (e, set2[map2[s]])
    else:
        elem = (e, None)
    pairs.append(elem)

for i,e in enumerate(set2):
    s = e.lower()
    if s not in map1:
        pairs.append((None, e))

I'm also assuming there are no duplicates in the input, i.e. either it matches 0 or 1 words from the other set and there is not the same word with different cases in each of the sets, as I'm not sure what you would expect your output to be then.

It's probably not the most efficient way to go about it, but seems fine overall as it's still O(n).

Martin Krämer
  • 567
  • 3
  • 17