A hash-related POJ problem: POJ1200 Crazy Search

1. Problem description

    The problem statement is pasted here for reference; the original page is http://poj.org/problem?id=1200. It reads as follows:

    Description:
    Many people like to solve hard puzzles some of which may lead them to madness. One such puzzle could be finding a hidden prime number in a given text. Such number could be the number of different substrings of a given size that exist in the text. As you soon will discover, you really need the help of a computer and a good algorithm to solve such a puzzle. 
    Your task is to write a program that given the size, N, of the substring, the number of different characters that may occur in the text, NC, and the text itself, determines the number of different substrings of size N that appear in the text. 

    As an example, consider N=3, NC=4 and the text "daababac". The different substrings of size 3 that can be found in this text are: "daa"; "aab"; "aba"; "bab"; "bac". Therefore, the answer should be 5. 
    Input:

    The first line of input consists of two numbers, N and NC, separated by exactly one space. This is followed by the text where the search takes place. You may assume that the maximum number of substrings formed by the possible set of characters does not exceed 16 Millions.
    Output:

    The program should output just an integer corresponding to the number of different substrings of size N found in the given text.
    Sample Input:
    3 4
    daababac
    Sample Output:
    5
    Hint:

    Huge input,scanf is recommended.

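    In short: given a string, count how many distinct substrings of length N it contains, knowing that at most NC different characters occur in it. The approach used here treats NC as a radix: each character is assigned an integer digit, each length-N substring is read as an N-digit number in base NC, and that number is used directly as the index into a hash table of "seen" flags. Because the problem guarantees that the number of possible substrings does not exceed 16 million, every such value is below 16000000, so a table of 16000000 entries is enough, and two different substrings never map to the same index.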
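The character-to-integer step can be sketched as below. This is a minimal illustration rather than the post's own code: `build_digit_map` is a hypothetical helper name, and unlike the post (which numbers characters from 1 and subtracts 1 later) it assigns 0-based digits directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): gives each distinct
 * character of `text` a digit 0, 1, 2, ... in order of first appearance.
 * digit_of[ch] stays -1 for characters that never occur.
 * Returns the number of distinct characters found. */
int build_digit_map(const char *text, int digit_of[256])
{
    int next = 0;
    size_t i;

    assert(text != NULL);
    for (i = 0; i < 256; i++)
        digit_of[i] = -1;

    for (i = 0; text[i] != '\0'; i++) {
        /* index through unsigned char so the subscript is never negative */
        unsigned char ch = (unsigned char)text[i];
        if (digit_of[ch] < 0)
            digit_of[ch] = next++;
    }
    return next;
}
```

For the sample text "daababac" this assigns d=0, a=1, b=2, c=3 and returns 4, which matches NC in the sample input.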

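    For example, mapping the characters of "daababac" from the problem to integers in order of first appearance (d=1, a=2, b=3, c=4) gives 12232324, and the substrings are then converted as follows:
    daa->122->011 (subtract 1 from each digit, because base-4 digits run from 0 to 3)->5 (the base-4 number converted to decimal, used as the hash index of this substring);
    aab->223->112->22;
    aba->232->121->25;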
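The conversions above can be checked with a short sketch. It is an illustration under assumptions, not the post's code: `substring_value` is a hypothetical helper, and the alphabet string "dabc" simply encodes the first-appearance order of the sample text, so a character's digit is its position in that string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reads the first n characters of s as an n-digit
 * number whose base is strlen(alphabet). The digit of a character is its
 * 0-based position in `alphabet`. */
unsigned int substring_value(const char *s, int n, const char *alphabet)
{
    unsigned int base = (unsigned int)strlen(alphabet);
    unsigned int value = 0;
    int i;

    for (i = 0; i < n; i++) {
        const char *pos = strchr(alphabet, s[i]);
        /* every character must be a real member of the alphabet */
        assert(s[i] != '\0' && pos != NULL);
        value = value * base + (unsigned int)(pos - alphabet);
    }
    return value;
}
```

With this mapping, "daa", "aab" and "aba" evaluate to 5, 22 and 25, the same indices as in the hand calculation.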
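    The time complexity is O(n), because the value of each substring is derived from the previous one in constant time. The implementation is as follows: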

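#include <stdio.h>
#include <string.h>
#define mem(a) memset(a,0,sizeof(a))

unsigned int hash[16000000+5];   /* hash[v]==1 once the substring with value v has been seen */
unsigned int c[256];             /* c[ch]: 1-based digit of character ch, 0 = not assigned yet */
char str[1000000];

int main()
{
    int len,base;
    while(scanf("%d%d",&len,&base)==2)
    {
        mem(str);
        mem(c);
        mem(hash);
        scanf("%s",str);
        int num=0;
        int i,j=0,length=strlen(str),tp=1;
        /* no substring of length len exists if the text is too short */
        if(length<len)
        {
            printf("0\n");
            continue;
        }
        /* assign digits 1..base to the characters in order of first appearance */
        for(i=0;i<length;i++)
        {
            unsigned char ch=(unsigned char)str[i];
            if(c[ch]==0)c[ch]=++j;
            if(j==base)break;
        }
        /* value of the first window; afterwards tp = base^len */
        for(i=0;i<len;i++)
        {
            num=num*base+c[(unsigned char)str[i]]-1;
            tp*=base;
        }
        tp/=base;                /* tp = base^(len-1), the weight of the leading digit */
        hash[num]=1;
        int count=1;
        /* slide the window: drop the leading digit, shift by one digit, append the new one */
        for(i=1;i<=length-len;i++)
        {
            num=(num-(c[(unsigned char)str[i-1]]-1)*tp)*base+c[(unsigned char)str[i+len-1]]-1;
            if(!hash[num])
            {
                hash[num]=1;
                count++;
            }
        }
        printf("%d\n",count);
    }
    return 0;
}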
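The sliding update in the main loop can be isolated as a small function to make the arithmetic easier to follow. This is a sketch only: `roll_window` and its parameter names are hypothetical, digits are assumed to be 0-based, and `high` is assumed to equal base^(N-1).

```c
#include <assert.h>

/* Hypothetical helper: given the value of the current N-digit window,
 * removes its leading digit `outgoing`, shifts the remaining digits up by
 * one position, and appends `incoming` as the new last digit.
 * `high` must be base^(N-1), the weight of the leading digit. */
unsigned int roll_window(unsigned int value, unsigned int outgoing,
                         unsigned int incoming, unsigned int base,
                         unsigned int high)
{
    assert(outgoing < base && incoming < base);
    return (value - outgoing * high) * base + incoming;
}
```

With base 4 and N=3 (so high is 16), rolling "daa" (value 5) by dropping d (digit 0) and appending b (digit 2) gives (5 - 0*16)*4 + 2 = 22, the value of "aab".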
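  The takeaway from this problem: when processing strings, you can exploit the fact that the set of characters is finite and that characters can be mapped to integers. This turns character processing into arithmetic on integers, which makes the problem easier to solve.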