3333: Find-the-Original-Typed-String-II
Hard
table of contents
For this question, first if we denote word
to
have n
distinct sections of letters
(e.g. aabbbbccdd
has 4 distinct sections of letters), then
we know that at a bare minimum, the possible original string must be
at least of length n
.
First off for this question, realise to calculate the total possible original strings, it just requires us to multiply together the frequencies of each segment.
Thus, it makes sense to pre-calculate the frequency
of each segment and store it into a freq
array (to use
later). For example, the freq
array for string
aabbbbccdd
would be [2, 4, 2, 2]
.
Then, given this freq
array, it allows us to formulate a
dp algorithm. To define the recurrence relationship,
let us iterate through freq
using index i
,
where 0 <= i < freq.size()
, and define
dp_(i)[j]
as the number of words that can be formed of
length j
using only the first i
sections of
letters, where 0 <= j < k
.
Now, if we iterate to index i+1
in the recurrence
relation, we are adding freq[i+1]
of the letter, let us
call it a
, into the pool. This means to calculate
dp_(i+1)[j]
, its the same as summing up
dp_(i)[j-1]
, dp_(i)[j-2]
, … and
dp_(i)[j-freq[i+1]]
(let us assume for simplicity that
j > freq[i+1]
). This is because we can simply:
- add one
a
to every unique word of lengthj-1
indp(i)[j-1]
to get a unique word of lengthj
- add two
a
to every unique word of lengthj-2
indp(i)[j-2]
to get a unique word of lengthj
- …
- add
freq[i+1]
a
to every unique word of lengthj-freq[i+1]
indp(i)[j-freq[i+1]]
to get a unique word of lengthj
For example, using this algorithm on aabbbbccdd
, for
k = 6
, on each iteration. Initially our array should look
like:
dp(-1) = [1, 0, 0, 0, 0, 0]
since we not “added” any letters into the pool (adding
-1
for clarity).
After iterating through the entirety of k
, we can simply
just delete from the number of total possible original
strings, (which we can just calculate by multiplying together the
frequencies of each segment) the number of possible strings
that are smaller than k
, which should just
be the sum of all the elements of dp
(since it is of length
k
, it contains the count of possible strings of length
0
to length k-1
).
code
class Solution {
private:
long long MOD = 1e9+7;
public:
int possibleStringCount(string word, int k) {
<int> freq;
vectorint currFreq = 1;
for (int i = 1; i < word.size(); ++i) {
if (word[i] == word[i-1]) {
++currFreq;
} else {
.push_back(currFreq);
freq= 1;
currFreq }
}
.push_back(currFreq);
freq
long long ans = 1;
for (int i = 0; i < freq.size(); ++i) {
= (ans * freq[i]) % MOD;
ans }
if (freq.size() > k) {
return ans;
}
<int> dp(k, 0);
vector[0] = 1;
dp
for (int i = 0; i < freq.size(); ++i) {
<int> newDp(k);
vectorlong long counter = 0;
for (int j = 0; j < k; ++j) {
if (j > 0) {
= (counter + dp[j-1]) % MOD;
counter }
if (j > freq[i]) {
= (counter - dp[j-freq[i]-1] + MOD) % MOD;
counter }
[j] = counter;
newDp}
= newDp;
dp }
for (int i = freq.size(); i < k; ++i) {
= (ans - dp[i] + MOD) % MOD;
ans }
return ans;
}
};