🔗 🔴 1255. Maximum Score Words Formed by Letters 1882

tags: `Weekly Contest 162` `回溯(Backtracking)` `剪枝(Pruning)` `子集型回溯` `位運算(Bit Manipulation)` `狀態壓縮` `雜湊表(Hash Table)`

題意

給定一個字串陣列 $words$ 表示單字列表、一個字元陣列 $letters$ 表示可使用的字母(可能會有重複字母)、和一個長度為 $26$ 的整數陣列 $score$ 表示每個字母的分數，其中字母 皆為小寫英文字母。

返回在給定的單字列表中，選擇其任何有效子集合，所能獲得的最大分數。每個單字 $words[i]$ 只能使用一次，每個字母 $letters[i]$ 也只能使用一次。

限制

$1 \leq words.length \leq 14$
$1 \leq words[i].length \leq 15$
$1 \leq letters.length \leq 100$

思路

注意題目的數據範圍， $1 \leq n = words.length \leq 14$ ，枚舉所有可能的子集的時間複雜度為 $O(2^n)$ ，因此可以使用回溯法或直接枚舉所有可能的子集解決。

方法一：回溯(Backtracking) + 剪枝(Pruning)

建立一個雜湊表 $cnt$ 來保存 $letters$ 中每個字母的數量，即每個字母的可用次數。此外，為方便起見，將每個字母的分數也用雜湊表 $scores$ 保存、或是直接以字元的 ASCII 碼減去 $'a'$ 作為索引。

使用 $cur$ 保存當前子集中的分數， $ans$ 保存最大分數，在遞迴時分別考慮每個單字 $words[i]$ 選或不選兩種情況：

若不選 $words[i]$ ，則直接遞迴到下一個單字。
若選 $words[i]$ $w or d s [i]$ ，則先在 $cnt$ $c n t$ 中減去 $words[i]$ $w or d s [i]$ 的字母數量，若不能選 $words[i]$ $w or d s [i]$ 則直接恢復，若能選則是先遞迴後恢復。
- 若發生不能選 $words[i]$ 的情況，則顯然選擇 $words[i]$ 和之後的單字都是不合法的，因此不用考慮這些情況，此作法即為剪枝。

遞迴終點為 $i = n$ 時，即所有單字都考慮過了，此時更新答案。在遍歷完所有路徑後，最後返回 $ans$ 即可。

複雜度分析

時間複雜度 $O(m + L \cdot 2^n)$ ，其中 $n$ 為 $words$ 的長度， $m$ 為 $letters$ 的長度， $L$ 為 $words$ 中最長單字的長度。
空間複雜度 $O(n + \Sigma)$ ，其中 $\Sigma$ 為字元集大小，本題中為 $26$ 。

class Solution:
    def maxScoreWords(self, words: List[str], letters: List[str], score: List[int]) -> int:
        n = len(words)
        cnt = Counter(letters) # 每個字母可使用的次數
        scores = {chr(i + 97): score[i] for i in range(26)} # 每個字母的分數
        ans, cur = 0, 0
        def dfs(i: int) -> None:
            nonlocal ans, cur
            if i == n:
                ans = max(ans, cur) # 更新答案
                return
            dfs(i + 1) # 不選 words[i]
            # 先減去 words[i] 的字母數量，若不能選 words[i] 則直接恢復，若能選則是遞迴後恢復
            for ch in words[i]:
                cnt[ch] -= 1
                cur += scores[ch]
            if all(cnt[ch] >= 0 for ch in cnt): # 判斷是否可以選 words[i]
                dfs(i + 1) # 選 words[i]
            for ch in words[i]:
                cnt[ch] += 1
                cur -= scores[ch]
        dfs(0)
        return ans

class Solution {
public:
    int maxScoreWords(vector<string>& words, vector<char>& letters, vector<int>& score) {
        int n = words.size();
        vector<int> cnt(26);
        for (char ch : letters) cnt[ch - 'a']++;
        int ans = 0;
        function<void(int, int)> dfs = [&](int i, int cur) {
            if (i == n) {
                ans = max(ans, cur);
                return;
            }
            dfs(i + 1, cur); // 不選 words[i]
            for (char ch : words[i]) cnt[ch - 'a']--;
            bool flag = true;
            for (char ch : words[i]) {
                if (cnt[ch - 'a'] < 0) {
                    flag = false;
                    break;
                }
                cur += score[ch - 'a'];
            }
            if (flag) dfs(i + 1, cur); // 選 words[i]
            for (char ch : words[i]) cnt[ch - 'a']++;
        };
        dfs(0, 0);
        return ans;
    }
};

方法二：狀態壓縮，枚舉所有可能的子集

對於子集型回溯的問題，同樣可以嘗試使用位運算來枚舉所有可能的子集。

枚舉所有可能的子集 $i$ ，若 i & (1 << j) ，則代表 $words[j]$ 在當前子集中。在枚舉每個子集時，建立一個雜湊表或長度為 $26$ 的陣列 $tmp$ 來保存當前子集中需要的字母數量，同樣使用 $cur$ 保存當前子集的分數。

若在添加 $words[j]$ 所需要的字母數量到 $tmp$ 後，出現 $tmp[ch] > cnt[ch]$ 的情況，則這個子集不合法，直接跳過。
若所有 $words[j]$ 都能選，則更新答案。

在 Python 中的 for...else 語法可以用來判斷 for 迴圈是否正常結束，若正常結束則執行 else 語句，這樣可以省去一個 flag 變數。

狀態壓縮的作法，相較回溯+剪枝的優缺點如下：

會考慮每個可能的子集，例如若 $00001111$ 不合法，則在回溯中不會考慮 $XXXX1111$ 的情況，但在枚舉所有可能的子集的做法中，還是會將其再考慮一次。
此外，在回溯中， $cnt$ 可以被重複利用，而在枚舉所有可能的子集的做法中，需要每次都建立一個新的 $tmp$ 來保存當前子集需要的字母數量。
但狀態壓縮的作法在簡潔性上更勝一籌，且在本題的數據範圍下，枚舉所有可能的子集的時間複雜度也是可以接受的。

複雜度分析

時間複雜度 $O(m + (S + \Sigma) \cdot 2^n)$ ，其中 $n$ 為 $words$ 的長度， $m$ 為 $letters$ 的長度， $S$ 為 $words$ 中所有單字的總長度， $\Sigma$ 為字元集大小，本題中為 $26$ 。
空間複雜度 $O(\Sigma)$ 。

class Solution:
    def maxScoreWords(self, words: List[str], letters: List[str], score: List[int]) -> int:
        n = len(words)
        cnt = Counter(letters) # 每個字母可使用的次數
        scores = {chr(i + 97): score[i] for i in range(26)} # 每個字母的分數
        ans = 0
        for i in range(1, 1 << n): # 枚舉所有可能的子集
            tmp = Counter() # 當前子集需要的字母數量
            cur = 0 # 當前子集的分數
            for j in range(n):
                if i & (1 << j): # words[j] 在當前子集中
                    for ch in words[j]:
                        tmp[ch] += 1
                        cur += scores[ch]
                    if any(tmp[ch] > cnt[ch] for ch in tmp): # 這個子集不合法。這樣寫是為了不用 flag 變數
                        break
            else: # 若所有 words[j] 都能選，則更新答案
                ans = max(ans, cur)
        return ans

class Solution {
public:
    int maxScoreWords(vector<string>& words, vector<char>& letters, vector<int>& score) {
        int n = words.size();
        vector<int> cnt(26); // 每個字母可使用的次數
        for (char ch : letters) cnt[ch - 'a']++;
        int ans = 0;
        for (int i = 1; i < (1 << n); i++) { // 枚舉所有可能的子集
            vector<int> tmp(26); // 當前子集需要的字母數量
            int cur = 0; // 當前子集的分數
            bool flag = true;
            for (int j = 0; j < n && flag; j++) {
                if (i & (1 << j)) { // words[j] 在當前子集中
                    for (char ch : words[j]) {
                        tmp[ch - 'a']++;
                        if (tmp[ch - 'a'] > cnt[ch - 'a']) { // 當前子集需要的字母數量超過可用的字母數量，不合法
                            flag = false;
                            break;
                        }
                        cur += score[ch - 'a'];
                    }
                }
            }
            if (flag) ans = max(ans, cur); // 若所有 words[j] 都能選，則更新答案
        }
        return ans;
    }
};