Maximum Entropy Model Literature Reading Guide

The Maximum Entropy Model is a machine learning method that has been applied with good results in many areas of natural language processing, such as part-of-speech tagging, Chinese word segmentation, sentence boundary detection, shallow parsing, and text categorization. The manual for Dr. Le Zhang's maximum entropy modeling toolkit contains a well-written "Further Reading" section, which is reproduced here as a reading guide to the maximum entropy model literature.
  Unlike the "Statistical Machine Translation Literature Reading Guide", I am still working through the Maximum Entropy Model myself and am hardly qualified to comment, so I will keep my own remarks brief. These papers are easy to find on Google, though most are fairly long (30-odd pages), and two are doctoral dissertations running over 100 pages. I hope this does not scare off beginners; after all, the classics reward careful, repeated study!

Maximum Entropy Model Tutorial Reading

  This section lists some recommended papers for your further reference.

1. A Maximum Entropy Approach to Natural Language Processing [Berger et al., 1996]
  (Must read) The canonical paper on applying the maxent technique to natural language processing. It describes maxent in detail and presents an incremental feature selection algorithm for building a maxent model step by step, along with several examples from statistical machine translation.
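  For reference, the conditional model the paper derives is the familiar log-linear form (written here in LaTeX notation):

    p_\lambda(y \mid x) = \frac{1}{Z_\lambda(x)} \exp\Big( \sum_i \lambda_i f_i(x, y) \Big),
    \qquad
    Z_\lambda(x) = \sum_{y'} \exp\Big( \sum_i \lambda_i f_i(x, y') \Big)

  where the f_i are binary feature functions and the weights \lambda_i are trained so that each feature's expectation under the model matches its empirical expectation on the training data.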

2. Inducing Features of Random Fields [Della Pietra et al., 1997]
  (Must read) This paper deals with a more general framework, random fields, and proposes the Improved Iterative Scaling (IIS) algorithm for estimating the parameters of random fields. It supplies the theoretical background for random fields (and hence for maxent models). A greedy field induction method is presented for automatically constructing a detailed random field from a set of atomic features, and a word morphology application for English is developed.

3. Adaptive Statistical Language Modeling: A Maximum Entropy Approach [Rosenfeld, 1996]
  This paper applies the ME technique to the statistical language modeling task. More specifically, it builds a conditional maximum entropy model that incorporates traditional N-gram, distant N-gram, and trigger-pair features, and reports a significant perplexity reduction over a baseline trigram model. Later, Rosenfeld and his group proposed a Whole Sentence Exponential Model to overcome the computational bottleneck of the conditional ME model.
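  As a toy illustration of the trigger-pair idea (the word pair and function name below are my own hypothetical example, not Rosenfeld's notation), a trigger feature fires when a triggering word anywhere in the history makes another word more likely:

    def trigger_feature(history, w, trigger='stocks', target='bonds'):
        """Hypothetical trigger-pair feature 'stocks -> bonds': fires when
        the trigger word appeared anywhere in the history and the
        predicted word is the target."""
        return 1.0 if trigger in history and w == target else 0.0

    # trigger_feature(['the', 'stocks', 'fell'], 'bonds')  -> 1.0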

4. Maximum Entropy Models for Natural Language Ambiguity Resolution [Ratnaparkhi, 1998]
  This dissertation discusses in detail the application of maxent models to various natural language disambiguation tasks. Several problems are attacked within the ME framework: sentence boundary detection, part-of-speech tagging, shallow parsing, and text categorization. Comparisons with other machine learning techniques (Naive Bayes, Transformation-Based Learning, Decision Trees, etc.) are given.
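  To make the flavor of these models concrete, here is a minimal sketch of Ratnaparkhi-style contextual predicates for POS tagging; the specific predicates and string encoding are my own illustrative choices, not the dissertation's exact feature set:

    def pos_features(words, i, tag):
        """Hypothetical Ratnaparkhi-style features for POS tagging: each
        (contextual predicate, tag) pair becomes one binary maxent feature."""
        w = words[i]
        prev = words[i - 1] if i > 0 else '<BOS>'
        predicates = [
            'word=' + w,
            'prev_word=' + prev,
            'suffix3=' + w[-3:],
            'has_digit=' + str(any(c.isdigit() for c in w)),
        ]
        return [p + '&tag=' + tag for p in predicates]

    # pos_features(['the', 'dog', 'runs'], 0, 'DT')
    # -> ['word=the&tag=DT', 'prev_word=<BOS>&tag=DT',
    #     'suffix3=the&tag=DT', 'has_digit=False&tag=DT']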

5. The Improved Iterative Scaling Algorithm: A Gentle Introduction [Berger, 1997]
  This paper describes the IIS algorithm in detail. The exposition is easier to follow than [Della Pietra et al., 1997], which uses heavier mathematical notation.
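  For orientation: when the feature sum f#(x, y) = sum_i f_i(x, y) equals the same constant C for every (x, y) (which can be forced by adding a "slack" feature), the IIS update reduces to the closed form delta_i = (1/C) log(E~[f_i] / E_lambda[f_i]), i.e. to GIS. Below is a minimal NumPy sketch of that special case; the array layout F[x, y, i] and all names are my own assumptions for illustration:

    import numpy as np

    def gis(F, p_tilde_x, p_tilde_xy, n_iter=200):
        """Iterative scaling for a conditional maxent model in the
        constant-feature-sum case (IIS's closed-form special case).
        F[x, y, i] is the value of feature i on (context x, label y);
        p_tilde_x and p_tilde_xy are the empirical distributions."""
        C = F.sum(axis=2).max()
        assert np.allclose(F.sum(axis=2), C), "pad with a slack feature first"
        lam = np.zeros(F.shape[2])
        emp = np.einsum('xy,xyi->i', p_tilde_xy, F)   # empirical E~[f_i]
        for _ in range(n_iter):
            score = np.einsum('i,xyi->xy', lam, F)
            p = np.exp(score - score.max(axis=1, keepdims=True))
            p /= p.sum(axis=1, keepdims=True)          # model p(y|x)
            model = np.einsum('x,xy,xyi->i', p_tilde_x, p, F)  # E_lam[f_i]
            lam += np.log(emp / model) / C             # closed-form update
        return lam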

6. Stochastic Attribute-Value Grammars [Abney, 1997]
  Abney applied the Improved Iterative Scaling algorithm to parameter estimation for attribute-value grammars, whose parameters cannot be correctly estimated by the ERF method (though that method works for PCFGs). Random fields are the model of choice here, with general Metropolis-Hastings sampling used to compute feature expectations under the newly constructed model.
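  The sampling step can be sketched generically. The following minimal Metropolis-Hastings estimator of a feature expectation is my own illustrative version (the function names and the symmetric-proposal assumption are mine, not Abney's exact procedure):

    import numpy as np

    def mh_expectation(f, log_unnorm, propose, x0, n_samples=10000, burn_in=1000):
        """Estimate E[f(x)] under a distribution known only up to its
        normalizer (log_unnorm), via Metropolis-Hastings with a symmetric
        proposal -- the situation when Z is intractable."""
        rng = np.random.default_rng(0)
        x, total, count = x0, 0.0, 0
        for t in range(n_samples):
            x_new = propose(x, rng)
            # Accept with probability min(1, p(x_new)/p(x)); Z cancels out.
            if np.log(rng.random()) < log_unnorm(x_new) - log_unnorm(x):
                x = x_new
            if t >= burn_in:
                total += f(x)
                count += 1
        return total / count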

7. A comparison of algorithms for maximum entropy parameter estimation [Malouf, 2003]
  Four iterative parameter estimation algorithms were compared on several NLP tasks. L-BFGS was observed to be the most effective parameter estimation method for maximum entropy models, much better than IIS and GIS. [Wallach, 2002] reported similar results for parameter estimation of Conditional Random Fields.
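  As a concrete companion to this result, here is a minimal sketch of fitting a conditional maxent model with SciPy's off-the-shelf L-BFGS routine; the tiny arrays (2 contexts, 2 labels, 3 features) and their values are invented purely for illustration:

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical toy problem: F[x, y, i] = feature i on (context x, label y).
    F = np.array([[[1., 0., 1.], [0., 1., 1.]],
                  [[1., 1., 0.], [0., 0., 1.]]])
    p_tilde_xy = np.array([[0.3, 0.2], [0.1, 0.4]])   # empirical p~(x, y)
    p_tilde_x = p_tilde_xy.sum(axis=1)                # empirical p~(x)

    def nll_and_grad(lam):
        """Negative conditional log-likelihood and its gradient
        (model expectations minus empirical expectations)."""
        score = np.einsum('i,xyi->xy', lam, F)
        logZ = np.log(np.exp(score).sum(axis=1))      # log Z(x)
        nll = -(p_tilde_xy * (score - logZ[:, None])).sum()
        p = np.exp(score - logZ[:, None])             # model p(y|x)
        emp = np.einsum('xy,xyi->i', p_tilde_xy, F)
        model = np.einsum('x,xy,xyi->i', p_tilde_x, p, F)
        return nll, model - emp

    res = minimize(nll_and_grad, x0=np.zeros(F.shape[2]),
                   jac=True, method='L-BFGS-B')
    print(res.x)   # fitted feature weights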

Appendix:
Dr. Le Zhang's maximum entropy modeling toolkit:
 http://homepages.inf.ed.ac.uk/lzhang10/maxent_toolkit.html
Two reference pages on maximum entropy models; the latter is also a reading list, though an older one:
 1. MaxEnt and Exponential Models
 2. A maxent reading list

Note: when reprinting, please credit the source "我愛自然語言處理" (52nlp): www.52nlp.cn

Permalink: http://www.52nlp.cn/maximum-entropy-model-tutorial-reading
