Etaoin Shrdlu

Time Limit: 1000MS   Memory Limit: 65536K
Total Submissions: 1202   Accepted: 636

Description

The relative frequency of characters in natural language texts is very important for cryptography. However, the statistics vary for different languages. Here are the top 9 characters sorted by their relative frequencies for several common languages: 
English: ETAOINSHR
German:  ENIRSATUD
French:  EAISTNRUL
Spanish: EAOSNRILD
Italian: EAIONLRTS
Finnish: AITNESLOK

Just as important as the relative frequencies of single characters are those of pairs of adjacent characters, so-called digrams. Given several text samples, calculate the digrams with the top relative frequencies.
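To make the notion of a digram concrete, here is a minimal sketch of the counting step. The helper name countDigrams is an assumption made for illustration and is not part of the problem statement; it simply slides a two-character window over the text, so overlapping pairs such as the three "!!" inside "!!!!" are all counted.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper (name assumed for illustration): count every pair of
// adjacent characters in `text`. A text of length L yields L-1 digrams.
std::map<std::string, int> countDigrams(const std::string& text) {
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        ++counts[text.substr(i, 2)];  // two-character window starting at i
    }
    return counts;
}
```

A std::map keeps its keys in lexicographical order, which is convenient later when ties have to be broken by ASCII order.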

Input

The input contains several test cases. Each starts with a number n on a separate line, denoting the number of lines of the test case. The input is terminated by n=0. Otherwise, 1<=n<=64, and there follow n lines, each with a maximal length of 80 characters. The concatenation of these n lines, where the end-of-line characters are omitted, gives the text sample you have to examine. The text sample will contain printable ASCII characters only.
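The input format above could be read roughly as follows. This is only a sketch: the function name readSample and its signature are assumptions, and it reads whole lines throughout so that leading spaces in the sample lines are preserved.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (name assumed): read one test case from `in` into `text`.
// Returns false on the terminating 0, on a malformed count line, or at end of input.
bool readSample(std::istream& in, std::string& text) {
    std::string line;
    if (!std::getline(in, line)) return false;
    int n = 0;
    std::istringstream(line) >> n;  // n stays 0 if the line holds no number
    if (n <= 0) return false;
    text.clear();
    for (int i = 0; i < n && std::getline(in, line); ++i) {
        text += line;  // getline has already dropped the end-of-line character
    }
    return true;
}
```

Reading the count with getline as well avoids the usual pitfall of a leftover newline after a formatted read.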

Output

For each test case generate 5 lines containing the top 5 digrams together with their absolute and relative frequencies. Output the latter rounded to a precision of 6 decimal places. If two digrams should have the same frequency, sort them in (ASCII) lexicographical order. Output a blank line after each test case.
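The ordering and formatting rules can be sketched like this. The helper topDigrams and its parameters are assumptions for illustration; it expects a count table keyed by digram and the total number of digrams in the sample. Because std::map iterates in lexicographical order, a stable sort by descending count is enough to get the required tie-breaking.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> Entry;

// Descending by absolute frequency; ties are left to the stable sort.
static bool byCountDesc(const Entry& a, const Entry& b) {
    return a.second > b.second;
}

// Hypothetical helper (name assumed): format the top `k` digrams as output lines.
std::vector<std::string> topDigrams(const std::map<std::string, int>& counts,
                                    int total, std::size_t k = 5) {
    // Copying from the map gives entries already in lexicographical order.
    std::vector<Entry> v(counts.begin(), counts.end());
    std::stable_sort(v.begin(), v.end(), byCountDesc);
    std::vector<std::string> lines;
    for (std::size_t i = 0; i < v.size() && i < k; ++i) {
        char buf[64];
        // Relative frequency = count / total number of digrams, 6 decimals.
        std::snprintf(buf, sizeof buf, "%s %d %.6f", v[i].first.c_str(),
                      v[i].second, static_cast<double>(v[i].second) / total);
        lines.push_back(buf);
    }
    return lines;
}
```

For the first sample the text has 42 characters and therefore 41 digrams, so a digram that occurs 3 times has relative frequency 3/41, which rounds to 0.073171.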

Sample Input

2
Take a look at this!!
!!siht ta kool a ekaT
5
P=NP
 Authors: A. Cookie, N. D. Fortune, L. Shalom
 Abstract: We give a PTAS algorithm for MaxSAT and apply the PCP-Theorem [3]
 Let F be a set of clauses. The following PTAS algorithm gives an optimal
 assignment for F:
0

Sample Output

 a 3 0.073171
!! 3 0.073171
a  3 0.073171
 t 2 0.048780
oo 2 0.048780

 a 8 0.037209
or 7 0.032558
.  5 0.023256
e  5 0.023256
al 4 0.018605

Source

/**
A very simple string problem, solved with a map and a struct.
**/
#include <map>
#include <cstdio>
#include <string>
#include <iostream>
#include <algorithm>
using namespace std;

// Upper bound on the number of distinct digrams: at most 64 lines of 80 characters.
#define maxn (64*80+1)
int n;
struct Node
{
    string s1;
    int c;
}mstr[maxn];
string str,s;
map<string,int>m;
int sum;
int num;

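// Order by count, highest first; break ties by ASCII lexicographical order.
bool cmp(const Node& a,const Node& b)
{
    if(a.c==b.c)
        return a.s1<b.s1;
    return a.c>b.c;
}
// Count all digrams of the concatenated sample s and print the top 5.
void deal()
{
    // A text of length L contains L-1 digrams.
    // The cast avoids unsigned wrap-around when the sample is empty.
    sum=(int)s.length()-1;
    for(int i=0;i<sum;i++)
    {
        string s0=s.substr(i,2);
        // On the first occurrence, remember the digram so it can be sorted later.
        if(++m[s0]==1)
            mstr[num++].s1=s0;
    }
    // Copy the final counts from the map into the sortable array.
    for(int i=0;i<num;i++)
        mstr[i].c=m[mstr[i].s1];

    sort(mstr,mstr+num,cmp);

    // Print at most 5 entries: digram, absolute frequency, relative frequency.
    for(int i=0;i<(5<=num?5:num);i++)
        printf("%s %d %.6f\n",mstr[i].s1.c_str(),mstr[i].c,mstr[i].c/(sum*1.0));
    // Blank line after each test case.
    printf("\n");
}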

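int main()
{
    // Each test case starts with the line count n; n=0 terminates the input.
    while(scanf("%d",&n)==1&&n)
    {
        m.clear();
        s.clear();
        sum=0;
        num=0;
        // Discard the rest of the count line, including the newline.
        getline(cin,str);
        // Concatenate the n lines; getline drops the end-of-line characters.
        for(int i=0;i<n;i++)
        {
            getline(cin,str);
            s+=str;
        }
        deal();
    }
    return 0;
}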

