Solving a Sentence Similarity Problem with a Trie

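[Problem description] You are given a paragraph made up of N sentences. The i-th sentence has length L[i] and contains W[i] words.
Sentences contain no characters other than letters and spaces ( ).
Within each sentence, the words are separated by single spaces ( ). A sentence never contains consecutive spaces.
M queries follow. Each query is a sentence, and for each one you must find the sentence in the paragraph that shares the most words with it. A repeated word is counted only once, and matching is case-insensitive.
The input guarantees that the answer exists and is unique.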
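The scoring rule above (distinct words, compared case-insensitively) can be sketched with plain hash sets before any trie is involved. This is only a minimal illustration under stated assumptions: the class name `SharedWords` and the helpers `distinctWords` and `sharedCount` are made up for this sketch and are not part of the solution further down.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class SharedWords {
    // Lowercase every word and collect the distinct ones; empty tokens are skipped.
    static Set<String> distinctWords(String sentence) {
        Set<String> words = new HashSet<>();
        for (String token : sentence.split(" ")) {
            String word = token.toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Number of distinct words that occur in both sentences.
    static int sharedCount(String a, String b) {
        Set<String> common = distinctWords(a);
        common.retainAll(distinctWords(b));
        return common.size();
    }

    public static void main(String[] args) {
        String sentence = "The transition from one state to the next is not necessarily deterministic";
        // "next", "to", "the" and "transition" all occur in the sentence.
        System.out.println(sharedCount("Next to the transition", sentence)); // prints 4
    }
}
```

A trie replaces the hash set of the paragraph sentence in the solution below; the counting rule stays the same.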
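Input format
The first line contains two integers, N and M.
Each of the next N+M lines contains one sentence.
The first N lines are the sentences of the paragraph; the remaining M lines are the queries.
Output format
Output M lines, one for the result of each query.
Sample input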
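As a rough sketch of how this format could be read, the snippet below splits the input into the paragraph and the queries. The class `InputSketch` and its `parse` method are hypothetical names used only for illustration; the sentences are kept in their original case because the output has to print them unchanged.

```java
import java.util.Scanner;

class InputSketch {
    final String[] paragraph; // the first N sentences
    final String[] queries;   // the last M sentences

    InputSketch(String[] paragraph, String[] queries) {
        this.paragraph = paragraph;
        this.queries = queries;
    }

    // Reads "N M" from the first line, then N paragraph lines and M query lines.
    static InputSketch parse(Scanner in) {
        int n = in.nextInt();
        int m = in.nextInt();
        in.nextLine(); // consume the rest of the "N M" line
        String[] paragraph = new String[n];
        String[] queries = new String[m];
        for (int i = 0; i < n; i++) {
            paragraph[i] = in.nextLine();
        }
        for (int i = 0; i < m; i++) {
            queries[i] = in.nextLine();
        }
        return new InputSketch(paragraph, queries);
    }

    public static void main(String[] args) {
        // A tiny made-up input: two paragraph sentences and one query.
        String text = "2 1\nHello world\nGoodbye world\nhello there\n";
        InputSketch input = parse(new Scanner(text));
        System.out.println(input.paragraph.length + " sentences, " + input.queries.length + " query");
    }
}
```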
6 3
An algorithm is an effective method that can be expressed within a finite amount of space and time
Starting from an initial state and initial input the instructions describe a computation
That when executed proceeds through a finite number of successive states
Eventually producing output and terminating at a final ending state
The transition from one state to the next is not necessarily deterministic
Some algorithms known as randomized algorithms incorporate random input
Next to the transition
Wormhole, infinite time and space
The transition from one state to the next is not necessarily deterministic
Sample output
The transition from one state to the next is not necessarily deterministic
An algorithm is an effective method that can be expressed within a finite amount of space and time
The transition from one state to the next is not necessarily deterministic

package com.kai.util;

import java.util.Arrays;
import java.util.HashSet;
import java.util.Scanner;

/**
 * Created by Administrator on 2017/8/4.
 */
// Reads the whole input from standard input.
class MyRead{
    public static String[] s1; // the N sentences of the paragraph
    public static String[] s2; // the M query sentences
    public static void read(){
        Scanner sc=new Scanner(System.in);
        int n=sc.nextInt();
        int m=sc.nextInt();
        sc.nextLine(); // skip the rest of the "N M" line
        s1=new String[n];
        s2=new String[m];
        // Keep the original case: the expected output prints the sentences unchanged.
        // Case-insensitive matching is handled inside the trie.
        for(int i=0;i<n;i++){
            s1[i]=sc.nextLine();
        }
        for(int i=0;i<m;i++){
            s2[i]=sc.nextLine();
        }
    }

}
public class TrieTree {
    Node root=new Node();

    private  class Node{
        private Node[] child=new Node[26]; // one slot per letter 'a'..'z'
        private  int count;                // how many inserted words end at this node
    }
    // Inserts a word, ignoring case and skipping every character that is not a letter a-z.
    public  void addTrieNode(String s){
        Node current=root;

        for(int index=0;index<s.length();index++){
            char c=Character.toLowerCase(s.charAt(index));
            if(c<'a'||c>'z'){
                continue; // e.g. the comma in the sample query word "Wormhole,"
            }
            if(current.child[c-'a']==null){
                current.child[c-'a']=new Node();
            }
            current=current.child[c-'a'];
        }
        if(current!=root){
            current.count ++; // mark the end of the word
        }
    }
    // Returns how many times the word was inserted, or 0 if it is not in the trie.
    public  int findTrie(String s){
        Node current=root;
        for(int index=0;index<s.length();index++){
            char c=Character.toLowerCase(s.charAt(index));
            if(c<'a'||c>'z'){
                continue;
            }
            if(current.child[c-'a']==null){
                return 0;
            }
            current=current.child[c-'a'];
        }
        // A word without any letters never leaves the root and matches nothing.
        return current==root?0:current.count;
    }


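    public static void main(String[] args) {
        MyRead.read();
        int queries=MyRead.s2.length;
        // r[i][j] = number of distinct words of query j found in paragraph sentence i
        int[][] r=new int [MyRead.s1.length][queries];
        for(int i=0;i<MyRead.s1.length;i++){
            // Build one trie per paragraph sentence.
            TrieTree trieTree=new TrieTree();
            String[] str=MyRead.s1[i].split(" ");
            for(int j=0;j<str.length;j++){
                trieTree.addTrieNode(str[j]);
            }
            for(int j=0;j<queries;j++){
                int count =0;
                // A repeated query word must be counted only once.
                HashSet<String> seen=new HashSet<String>();
                String[] str2=MyRead.s2[j].split(" ");
                for(int k=0;k<str2.length;k++){
                    // Compare in lowercase and drop non-letters such as the comma in "Wormhole,".
                    String word=str2[k].toLowerCase().replaceAll("[^a-z]","");
                    if(!word.isEmpty()&&seen.add(word)&&trieTree.findTrie(word)>0){
                        count++;
                    }
                }
                r[i][j]=count;
            }

        }
        // For each query, print the paragraph sentence with the highest count.
        for(int j=0;j<queries;j++){
            int max=0;
            int index=0;
            for(int i=0;i<MyRead.s1.length;i++){
                if(r[i][j]>max){
                    max=r[i][j];
                    index=i;
                }
            }
            System.out.println(MyRead.s1[index]);
        }
    }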
}
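To make the role of the end-of-word marker easier to see, here is a stripped-down trie on its own. It is a minimal sketch, not the class above: `MiniTrie`, `insert` and `contains` are illustrative names, and the code assumes every word is non-empty and consists only of the lowercase letters a-z.

```java
class MiniTrie {
    private static final class Node {
        final Node[] child = new Node[26]; // one slot per letter 'a'..'z'
        boolean endOfWord;                 // true if an inserted word ends here
    }

    private final Node root = new Node();

    // Walks down the trie, creating missing nodes, and marks the last node.
    void insert(String word) {
        Node current = root;
        for (char c : word.toCharArray()) {
            int slot = c - 'a';
            if (current.child[slot] == null) {
                current.child[slot] = new Node();
            }
            current = current.child[slot];
        }
        current.endOfWord = true;
    }

    // True only if the whole word was inserted; a mere prefix does not count.
    boolean contains(String word) {
        Node current = root;
        for (char c : word.toCharArray()) {
            current = current.child[c - 'a'];
            if (current == null) {
                return false;
            }
        }
        return current.endOfWord;
    }

    public static void main(String[] args) {
        MiniTrie trie = new MiniTrie();
        trie.insert("finite");
        System.out.println(trie.contains("finite"));   // prints true
        System.out.println(trie.contains("fin"));      // prints false: only a prefix
        System.out.println(trie.contains("infinite")); // prints false: never inserted
    }
}
```

The last two lookups show why the sample query word "infinite" does not match the word "finite" in the first sentence of the paragraph.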