Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character '#'
). For each character they type except '#', you need to return the top 3 historical hot sentences that have prefix the same as the part of sentence already typed. Here are the specific rules:
Your job is to implement the following functions:
The constructor function:
AutocompleteSystem(String[] sentences, int[] times):
This is the constructor. The input is historical data. Sentences
is a string array consists of previously typed sentences. Times
is the corresponding times a sentence has been typed. Your system should record these historical data.
Now, the user wants to input a new sentence. The following function will provide the next character the user types:
List<String> input(char c):
The input c
is the next character typed by the user. The character will only be lower-case letters ('a'
to 'z'
), blank space (' '
) or a special character ('#'
). Also, the previously typed sentence should be recorded in your system. The output will be the top 3 historical hot sentences that have prefix the same as the part of sentence already typed.
Example:
Operation: AutocompleteSystem(["i love you", "island","ironman", "i love leetcode"], [5,3,2,2])
The system have already tracked down the following sentences and their corresponding times:"i love you"
: 5
times"island"
: 3
times"ironman"
: 2
times"i love leetcode"
: 2
times
Now, the user begins another search:
Operation: input('i')
Output: ["i love you", "island","i love leetcode"]
Explanation:
There are four sentences that have prefix "i"
. Among them, "ironman" and "i love leetcode" have same hot degree. Since ' '
has ASCII code 32 and 'r'
has ASCII code 114, "i love leetcode" should be in front of "ironman". Also we only need to output top 3 hot sentences, so "ironman" will be ignored.
Operation: input(' ')
Output: ["i love you","i love leetcode"]
Explanation:
There are only two sentences that have prefix "i "
.
Operation: input('a')
Output: []
Explanation:
There are no sentences that have prefix "i a"
.
Operation: input('#')
Output: []
Explanation:
The user finished the input, the sentence "i a"
should be saved as a historical sentence in system. And the following input will be counted as a new search.
Note:
----------------------------------------------------------------------
Trie树很容易想到,写的过程中有几点反应慢的地方:
1. DefaultDict除了能放default类型的元素,放其他类型也没问题,判断一下就好
2. 某个子节点的频率更新了,从根到这个子节点整个路径上的频率都需要更新,可以遍历的时候把路径保存下来。
以下是完整的codes:
from collections import defaultdict
class AutocompleteSystem:
def __init__(self, sentences, times):
self.accu = ""
l = len(sentences)
add_child = lambda:defaultdict(add_child)
self.trie = defaultdict(add_child)
self.rt = self.trie
for i in range(l):
self.add_node([-times[i],sentences[i]])
def add_node(self, freq):
def update(top1, freq1):
l = len(top1)
for i in range(l):
if (top1[i][1] == freq1[1]):
top1[i][0] = freq1[0] #bug2: top1[i][0] += freq1[0],最下面节点已经是new_freq,上面直接update就好
top1.sort()
return top1
if (l == 3 and top1[2] > freq1):
top1[2] = list(freq1)
elif (l < 3):
top1.append(list(freq1))
top1.sort()
return top1
rt,path = self.trie,[self.trie]
for ch in freq[1]:
rt = rt[ch]
path.append(rt)
newfreq = [freq[0]+(rt["$"][0] if "$" in rt else 0), freq[1]]
rt["$"] = newfreq
for node in path:
node["top"] = update(node["top"] if "top" in node else [],newfreq)
def input(self, c):
if (c == '#'):
self.add_node([-1, self.accu])
self.accu = ""
self.rt = self.trie
return []
else:
self.accu += c #bug1: forget
self.rt = self.rt[c]
top = self.rt['top'] if 'top' in self.rt else []
return [lst[1] for lst in top]