Java Program For Shortest Unique Prefix Problem

Shortest Unique Prefix is an interview question which you can find in websites like GeeksForGeeks & InterviewBit. It is asked in Google interviews. The problem is as follows:

Find shortest unique prefix to represent each word in the list.
Input: [zebra, dog, duck, dove]
Unique Prefix of each word is as below:
zebra = z
dog = dog
duck = du
dove = dov
Output: {z, dog, du, dov}

There are few different ways to solve this problem. We will use prefix tree or trie data structure. The solution would be intuitive & easy to explain if we use trie here.

Let’s take a look at the diagram above. We are using a modified version of trie here. Root node doesn’t contain any character. It is just starting point of the trie tree. All other nodes represent a character. Nodes at first level represent first characters of the input words. Second level nodes represent second characters of the input words. Each node also has a count field which represents total number of words which have common prefix till that level.

For example, “d” character is present at first position in three words “dog”, “duck” & “dove”. So we have (d, 3) node as left child of root. (d, 3) node has two children. Left child is (o, 2) node because both “dog” & “dove” words have “o” character in second position. Common prefix for “dog” and “dove” is “do”. Right child of (d, 3) node is (u, 1) node because only “duck” has “u” character in second position.

Once we construct the trie, it is easy to find shortest unique prefixes. We just need to take a word, start from the first letter & do a depth-first traversal in the trie tree till we find a node where letter count is 1. That path would be the shortest unique prefix for this word. If letter count is more than 1, then we have a common prefix till that path. We already discussed the same above.

Worst time complexity of this solution would be O(m * n) where m is average number of characters in a word & n is total number of words. For trie construction, we would need to traverse m * n characters. Also finding unique prefix for a word might take m node traversals if only last character is unique. So for n words, finding unique prefix would take m * n. That’s why worst time complexity is 2( m * n) which is O(m * n). Extra space usage is optimized as common characters at a particular level will occupy a single node only.

Here is the complete Java solution of this InterviewBit problem.

public class Solution {
    public ArrayList<String> prefix(ArrayList<String> A) {
        TrieNode root = new TrieNode();
        for(String s : A){
            root.insert(s, root);
        }
        ArrayList<String> results = new ArrayList<String>();
        for(String s : A){
            List<Character> path = new ArrayList<Character>();
            getPrefixes(root, results, path, s, 0);
        }
        return results;
    }
    
    private void getPrefixes(TrieNode node, List<String> results, List<Character> path,
    String word, int index){
        TrieNode curr = node.children.get(word.charAt(index));
        path.add(curr.value);
        if(curr.count == 1){
            StringBuilder sb = new StringBuilder();
            for (Character c: path) {
                sb.append(c);
            }
            results.add(sb.toString());
        } else {
            getPrefixes(curr, results, path, word, index + 1);
        }
    }
    
    static class TrieNode {
        Map<Character, TrieNode> children = new HashMap<Character, TrieNode>();
        int count = 0;
        char value;
        
        public void insert(String word, TrieNode root){
            TrieNode curr = root;
            for(char c : word.toCharArray()){
                TrieNode child = null;
                if(curr.children.containsKey(c)){
                    child = curr.children.get(c);
                    child.count = child.count + 1;
                } else {
                    child = new TrieNode();
                    child.count = child.count + 1;
                    child.value = c;
                    curr.children.put(c, child);
                }
                curr = child;
            }
        }
    }
}

Leave a Comment