Terminology Extraction

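My advisor recently asked me to look into this, so I hurried to search online. There is so little material, sob......

I looked at how the Translated website implements this technology. It currently supports only English, Italian and French, with no Chinese....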

Introduction

Terminology is the set of terms that identify a specific topic. Terminology extraction is the process of identifying those terms in a text.

The idea is to compare the frequency of words in a given document with their frequency in the language. Words which appear very frequently in the document but rarely in the language are probably terms.
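To make the idea concrete, here is a minimal sketch in Python. It is not Translated's implementation: the function names, the regex tokenizer and the add-one smoothing are all my own assumptions, and the tiny "reference" string stands in for a real general-language corpus. Each word is scored by the ratio of its relative frequency in the document to its smoothed relative frequency in the reference text.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphabetic tokens are good enough for a demo.
    return re.findall(r"[a-z]+", text.lower())


def term_scores(document, reference, smoothing=1.0):
    """Score each document word by document frequency / reference frequency.

    Hypothetical helper: the smoothing keeps words that never occur in the
    reference text from causing a division by zero.
    """
    doc_counts = Counter(tokenize(document))
    ref_counts = Counter(tokenize(reference))
    doc_total = sum(doc_counts.values())
    ref_total = sum(ref_counts.values())
    vocab_size = len(set(doc_counts) | set(ref_counts))

    scores = {}
    for word, count in doc_counts.items():
        doc_freq = count / doc_total
        ref_freq = (ref_counts[word] + smoothing) / (ref_total + smoothing * vocab_size)
        scores[word] = doc_freq / ref_freq
    return scores


document = "the kernel driver loads the kernel module"
reference = "the cat sat on the mat and the dog sat on the rug"
scores = term_scores(document, reference)
for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{word:8s} {score:.2f}")
```

In this toy input "kernel" gets the highest score, because it is frequent in the document and absent from the reference text, while "the" scores low because it is common in both.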

Technology

It applies Poisson statistics, Maximum Likelihood Estimation and Inverse Document Frequency (Latent Semantic Analysis) to compare the frequency of words in a given document with their frequency in a generic corpus of 100 million words per language. It uses a probabilistic part-of-speech tagger to take into account the probability that a particular sequence could be a term. It creates n-grams of words by minimising the relative entropy.
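The description above gives no formulas, so the following is only my guess at how the Poisson part could work, not Translated's actual method. Assume a word's rate in the general corpus (its maximum likelihood estimate, count divided by corpus size) tells us how often to expect it in a document of a given length. A Poisson model then gives the probability of seeing the word at least as often as we actually did, and the negative log of that probability can serve as a "surprise" score. The names poisson_tail and surprise are hypothetical.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed upward from the k-th term."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    # Compute P(X = k) in log space to avoid overflow in lam**k / k!.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    if term == 0.0 and k < lam:
        # The k-th term underflowed far below the mean: the tail is ~1.
        return 1.0
    total = 0.0
    i = k
    while term > 0.0 and term > total * 1e-16:
        total += term
        i += 1
        term *= lam / i  # P(X = i) from P(X = i - 1)
    return min(total, 1.0)


def surprise(observed, corpus_rate, doc_length):
    """-log P(count >= observed), given the word's rate in the general corpus."""
    expected = corpus_rate * doc_length  # Poisson mean for this document
    p = poisson_tail(observed, expected)
    return math.inf if p == 0.0 else -math.log(p)


# A word with corpus rate 1 per 10,000 words, in a 1,000-word document:
print(surprise(1, 1e-4, 1000))   # one occurrence: only mildly surprising
print(surprise(10, 1e-4, 1000))  # ten occurrences: very surprising
```

The higher the surprise, the more likely the word is a term of the document rather than ordinary vocabulary. A real system would combine this with the part-of-speech and n-gram steps, which this sketch leaves out.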

Why have we developed this?

Translated has developed this technology to help its translators to be aware of the difficulties in a document and to simplify the process of creating glossaries.

We also use it to improve search results in traditional search engines (e.g. Google) by giving a better estimate of how relevant a keyword is to a document.
