哈希

哈希

例题 - Finding Periods
CSES - 普及+/提高
例题 - 请努力完成本题,再阅读以下内容。解答

哈希(hashing)也叫散列,是用来快速判断数据(例如字符串)之间是否相同的方法。在哈希中,我们不是一个个比较字节(字符),而是比较两个串的哈希值。

字符串ss的多项式哈希如下:

(s[0]An1+s[1]An2+...+s[n])modM(\texttt{s}[0]A^{n-1}+\texttt{s}[1]A^{n-2}+...+\texttt{s}[n]) \mod M

常数的常见值:A=911382323A=911382323, M=972663749M=972663749,或者更容易记忆的:A=69420A=69420, M=109+9M=10^9+9

有时我们需要快速计算SS的任意子串的哈希,这时可以预处理h[]\texttt{h}[]p[]\texttt{p}[]

h[0]=s[0]h[k]=(h[k1]A+s[k])modMp[0]=1p[k]=p[k1]AmodM\texttt{h}[0] = \texttt{s}[0] \\ \texttt{h}[k] = (\texttt{h}[k-1]A + \texttt{s}[k]) \mod M \\ \texttt{p}[0] = 1 \\ \texttt{p}[k] = \texttt{p}[k-1]A \mod M

S[a:b]S[a:b]的哈希值为:

(h[b]p[ba+1]h[a1])modM(\texttt{h}[b] - \texttt{p}[b-a+1] \cdot \texttt{h}[a-1]) \mod M
资源
CPH26.3 - String Hashing
OI-Wiki字符串哈希

题解 - Finding Periods

我们可以逐一测试所有可能的周期(1..n1 .. n),那么总共需要的子字符串次数是n(1+12+13+...+1n)n(1+\frac{1}{2} + \frac{1}{3} + ... + \frac{1}{n}), 括号内是一个调和级数, 所以总比较次数大小规模为 nlnnn \ln n,如果比较子字符串相等的操作是O(1)\mathcal{O}(1)的(通过哈希达到),则整个算法为O(nlogn)\mathcal{O}(n \log n)

下面算法中加了一步倍增的优化(len *= 2),复杂度更低一些,不加这个优化也是可以的。

#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef pair<int,int> pi;
 
int n;
const int MAXN = 1000005;
char s[MAXN];
int M = 972663749;
int A = 911382323;
int h[MAXN], p[MAXN];
 
int hsh(int a,int b) {     // [a,b]
    int res;
    if (a == 0) res = h[b];
    else res = (h[b] - (ll)h[a-1]*p[b-a+1] + (ll)h[a-1]*M) % M;
    return res;
}
 
int main() {
    scanf("%s", s);
    n = strlen(s);
    for (int i = 0; i < n; i++) {
        h[i] = i ? ((ll)h[i-1] * A + s[i]) % M : s[0];
        p[i] = i ? ((ll)p[i-1]*A)%M : 1;
    }
    for (int i = 1; i < n; i++) {   // test i as period
        bool ok = true;
        for (int len = i; len < n; len *= 2) {
            int len2 = min(n-len, len);
            if (hsh(0, len2-1) != hsh(len, len+len2-1)) {
                ok = false;
                break;
            }
        }
        if (ok) printf("%d ", i);
    }
    printf("%d", n);
 
    return 0;
}
习题
P4656CEOIPalindromic Partitions2017普及+/提高解答
P3667USACO GoldLights Out2016普及+/提高解答
CF427DCFPalindromic Characteristics2017普及+/提高解答
CFCFCheck Transmission普及+/提高解答
P4824USACO SilverCensoring2015提高+/省选-解答
P3667USACO GoldBovine Genomics2017提高+/省选-解答
P7538COCIOsmosmjerka2016提高+/省选-解答