6

Check if strings in a list can be formed by concatenation of elements in the same list

For example:

Input List -

{ best,  rockstar,   star,  guide,  bestguide, rock }

Output :-

rockstar -> rock, star

bestguide -> best, guide

Here "rockstar" can be formed using rock and star. Similarly "bestguide" can be formed by joining "best" and "guide".

Solution so far I have- Create all the combinations of string by joining each other(2 string together, 3 string together and so on) and store in a Map.

map structure could be as following

Map<String, List<String>>

{rockstar : [rock, star], ....}

Now check just traverse original list and check in the map. If it's found then it's one of possible solution.

Looking for a better solution with better time/space complexity

nagendra547
  • 5,672
  • 3
  • 29
  • 43

6 Answers6

3

I think one standard approach would probably be to construct a trie from the dictionary. Then for each candidate, walk the trie and when a matching path reaches the end (marking a smaller word), continue from the top of the trie again with the remaining suffix of the candidate. We may need a few backtracking trials per candidate if similar matches exist; but in a dictionary of only 10,000, unless the data is degenerate, these should hopefully be few on average.

גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61
1

Here is the brute force approach. We can first form a list of the original terms, and then double-iterate that list to generate all combination possibilities. For each combination which is also already contained within the original list, we print that combination to the console.

String[] terms = new String[] { "best",  "rockstar",   "star",  "guide",  "bestguide", "rock" };
List<String> list = Arrays.asList(terms);
Set<String> set = new HashSet<String>(list);
for (int i=0; i < list.size()-1; ++i) {
    for (int j=i+1; j < list.size(); ++j) {
        if (set.contains(list.get(i) + list.get(j))) {
            System.out.println(list.get(i) + list.get(j) + " -> " + list.get(i) + ", " + list.get(j));
        }
        if (set.contains(list.get(j) + list.get(i))) {
            System.out.println(list.get(j) + list.get(i) + " -> " + list.get(j) + ", " + list.get(i));
        }
    }
}

This prints:

bestguide -> best, guide
rockstar -> rock, star
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • Thank you for your solution. However This solution is O(n* n * n). You are using two loops (O(n* n) )and then searching in list that's another O(n) and so total O(n* n* n). My suggested solution is O(n* n). – nagendra547 Nov 08 '19 at 03:04
  • @nagendra547 and the code only checking if a string could be formed by two others or not, triples were not considered. – TaQuangTu Nov 08 '19 at 03:07
  • @nagendra547 Then use a `HashSet` to do the lookup for each possible match. This avoids having to search the list, at the possible cost of using a bit more storage space. – Tim Biegeleisen Nov 08 '19 at 03:07
  • 1
    @nagendra547 your solution would have O(2^n) complexity, not O(n^2), since you stated, "Create all the combinations of string by joining each other(2 string together, 3 string together and so on)." – גלעד ברקן Nov 08 '19 at 03:11
  • @TimBiegeleisen Yes. that's fair reason. HashSet solution becomes similar to mine – nagendra547 Nov 08 '19 at 05:14
  • @TaQuangTu Triplet can be also used. Even 4 words can be also used. I have not mentioned that only two words can be used. – nagendra547 Nov 08 '19 at 05:14
1

First, sorry for my bad English. I have a naive way, you should try it:

step 1: Sort the list in descending order of lengths of elements

step 2: In turn (from left to right of sorted list), add elements one by one to a tree under following rules:

  • Each node of the tree contains a string, the root of the tree contains nothing

  • String in each parent node contains strings in its child nodes.

enter image description here

  • step 3: Get results: if length of string in a node equals sum of lengths of strings in child nodes then we get a desired result.
TaQuangTu
  • 2,155
  • 2
  • 16
  • 30
0
  1. Use AC automation and add all strings in the set to it.

  2. Match all strings in the set with the automation and record matching point.

  3. Use dynamic programming to concatenate matching points.

Worst-case time complexity: O(n*(sum of lengths))

n comes from multiple length options to be decided on the DP process. Imagine a string set {a, aa, aaa, aaaa, ..., a^n}.

Learn AC automation here: link

tigertang
  • 445
  • 1
  • 6
  • 18
0

This is a subset sum problem. The standard solution is dynamic programming, but usually you will find solutions for integers: Subset Sum algorithm

Adapted here this would give something like:

static List<String> substrings(String s) {
    List<String> l = new ArrayList<String>();
    for(int end=1; end < s.length()+1; ++end) {
        for(int start=0; start < end; ++start) {
            l.add(s.substring(start, end));
        }
    }
    return l;
}

static boolean isInConcatenations(String target, List<String> list) {
    Set<String> set = new HashSet<String>();
    List<String> ss = substrings(target);
    set.add("");
    for (String s: list) {
        if (s == target) continue; // do not use directly 'target' if it's in the list
        Set<String> prev = Set.copyOf(set);
        for (String sub: ss) {
            if ((sub.startsWith(s) && prev.contains(sub.substring(s.length(), sub.length()))) ||
                (sub.endsWith(s) && prev.contains(sub.substring(0, sub.length()-s.length()))) ) {
                set.add(sub);
            }
        }
    }
    return set.contains(target);
}

Here substrings(s) returns the List of all non empty substrings of a string.

Complexity is roughly length(list) * length(target)**2

One Lyner
  • 1,964
  • 1
  • 6
  • 8
0

Thanks for sharing this funny exercise.

Using Java 8+ and Streams, that's the best approach to iterate list and processing small or large data sets.

Keep in mind that you can use the method :

  • inputList.stream() of list to steam a list
  • inputList.parallelStream() if your list do not contains synchronized object and not calling any synchronized method (which not allow the parallelism).

Here a nice post on DZone to understand the Stream API performance https://dzone.com/articles/java-performance-for-looping-vs-streaming

            final String input = "best,rockstar,star,guide,bestguide,rock,fake,rockfaller";

        // Start to finding input pairs
        List<String> inputList = Arrays.asList(input.split(","));
        List<String> combi = inputList.stream()
                .filter(s -> input.contains(s) && input.lastIndexOf(s) != input.indexOf(s))
                .collect(Collectors.toList());

        // Build ouput
        final HashMap<String, List<String>> output = new HashMap<>();
        inputList.stream()
                // Remove pair words 
                .filter(s -> !combi.contains(s)) 
                .filter(s -> combi.stream().anyMatch(pair -> s.startsWith(pair) || s.endsWith(pair)) )
                .forEach( s -> {
                    List<String> result = combi.stream()
                            .filter(pair -> s.startsWith(pair) || s.endsWith(pair))
                            // Sort the output result
                            .sorted((s1, s2) ->  s.startsWith(s1) ? 0 : 1)
                            .collect(Collectors.toList());
                    Collections.sort(result);

                    if(result.size() > 1)
                    {
                        output.put(s, result);
                    }
                });

        System.out.println(output);

And this is the output when you print the HashMap result

{bestguide=[best, guide], rockstar=[rock, star]}

bdzzaid
  • 838
  • 8
  • 15