一、Abstract
Unlike most existing representation models that either use no structure or rely on pre-specified structures, we propose a reinforcement learning (RL) method to learn sentence representation by discovering optimized structures automatically.
二、Introduction
- Mainstream models:
1、Bag-of-words representation models
2、Sequence representation models
3、Structured representation models (tree-structured LSTM)
4、Attention-based methods
- The paper proposes two structured representation models:
1、information distilled LSTM (ID-LSTM)
2、hierarchical structured LSTM (HS-LSTM)
三、Methodology
1、Overview
- Policy Network:
- Samples an action at each state
- Two models: Information Distilled LSTM, Hierarchically Structured LSTM
- Structured Representation Model: transforms the sampled action sequence into a sentence representation
- Classification Network: provides reward signals
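The interplay of the three components amounts to one REINFORCE training step: the policy network samples one action per word, the classifier's score on the resulting representation arrives as a single delayed scalar reward, and the policy parameters are updated with the log-derivative trick. A minimal NumPy sketch of that loop is below; the dimensions, the Bernoulli policy parameterization, and the stand-in reward value are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8          # state dimension (illustrative)
T = 6          # sentence length = number of actions
W = rng.normal(scale=0.1, size=D)   # policy parameters (toy)
b = 0.0

def policy(state):
    """Probability of taking action 1 (e.g. Retain / End) in this state."""
    return 1.0 / (1.0 + np.exp(-(state @ W + b)))

def rollout(states):
    """Sample one binary action per word; no reward until the sequence ends."""
    actions, probs = [], []
    for s in states:
        p = policy(s)
        actions.append(int(rng.random() < p))
        probs.append(p)
    return np.array(actions), np.array(probs)

def reinforce_grad(states, actions, probs, reward):
    """REINFORCE: sum of grad log pi(a_t|s_t) scaled by the delayed reward."""
    gW = np.zeros_like(W)
    gb = 0.0
    for s, a, p in zip(states, actions, probs):
        # For a Bernoulli policy with logit z = W.s + b: d/dz log pi = a - p
        gW += reward * (a - p) * s
        gb += reward * (a - p)
    return gW, gb

states = rng.normal(size=(T, D))         # stand-in for LSTM-derived states
actions, probs = rollout(states)
reward = -0.3   # stand-in for the classifier's log P(class | representation)
gW, gb = reinforce_grad(states, actions, probs, reward)
W += 0.1 * gW   # gradient ascent on expected reward
b += 0.1 * gb
```

In the paper the classification network is trained jointly, so the reward signal sharpens as the classifier improves; here it is frozen into a constant only to keep the sketch self-contained.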
2、ID-LSTM
Target: Distill the most important words and remove irrelevant words
Action: {Retain, Delete}
Policy:
States:
Rewards: (delayed reward)
NB: L’ denotes the number of deleted words
Objective Function: REINFORCE algorithm and policy gradient methods
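The Retain/Delete mechanics can be sketched directly: a Delete action skips the recurrent update so the state is carried over unchanged, while Retain feeds the word into the cell as usual. In this sketch a plain tanh RNN stands in for the LSTM cell (an assumption to keep the code short); the weights and inputs are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # hidden / embedding size (illustrative)
Wx = rng.normal(scale=0.3, size=(D, D))
Wh = rng.normal(scale=0.3, size=(D, D))

def id_rnn(words, actions):
    """ID-LSTM-style transition: Delete (0) leaves the hidden state
    untouched; Retain (1) applies the normal recurrent update.
    A tanh RNN stands in for the LSTM cell."""
    h = np.zeros(D)
    for x, a in zip(words, actions):
        if a == 1:                       # Retain: update with this word
            h = np.tanh(Wx @ x + Wh @ h)
        # a == 0 (Delete): state carried over, word contributes nothing
    return h

words = rng.normal(size=(5, D))
keep_all = id_rnn(words, [1, 1, 1, 1, 1])  # behaves like a vanilla RNN
drop_mid = id_rnn(words, [1, 1, 0, 1, 1])  # word 3 distilled away
```

Because deleted words leave no trace in the state, the delayed reward (which, per the note above, involves the deletion count L’) is the only pressure deciding which words are worth keeping.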
3、HS-LSTM
Target: Build a structured representation by discovering hierarchical structures in a sentence
Action: {Inside, End}
Indicates whether a word is inside a phrase or at the end of one (a phrase here means a substructure or segment)
Policy:
States:
Reward: (delayed reward)
NB: A unimodal function of the number of phrases (a good phrase structure should contain neither too many nor too few phrases)
Objective Function: REINFORCE algorithm and policy gradient methods
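The two-level composition implied by the Inside/End actions can be sketched as follows: a word-level recurrence runs within each phrase and is reset after every End action, and its final state for the phrase is passed up to a phrase-level recurrence, whose last state is the sentence representation. As in the previous sketch, plain tanh RNNs stand in for the two LSTMs, and all weights are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4  # hidden size (illustrative)
Ww_x = rng.normal(scale=0.3, size=(D, D))  # word-level weights
Ww_h = rng.normal(scale=0.3, size=(D, D))
Wp_x = rng.normal(scale=0.3, size=(D, D))  # phrase-level weights
Wp_h = rng.normal(scale=0.3, size=(D, D))

def hs_rnn(words, actions):
    """HS-LSTM-style composition: the word-level RNN runs inside a phrase
    and is reset after an End (1) action; its last state becomes the
    phrase's input to the phrase-level RNN."""
    h_word = np.zeros(D)
    h_phrase = np.zeros(D)
    n_phrases = 0
    for x, a in zip(words, actions):
        h_word = np.tanh(Ww_x @ x + Ww_h @ h_word)
        if a == 1:  # End: close the phrase and pass it up
            h_phrase = np.tanh(Wp_x @ h_word + Wp_h @ h_phrase)
            h_word = np.zeros(D)
            n_phrases += 1
    return h_phrase, n_phrases

words = rng.normal(size=(6, D))
rep, k = hs_rnn(words, [0, 0, 1, 0, 0, 1])  # segmentation: two 3-word phrases
```

The unimodal reward term noted above acts on `n_phrases`: segmentations that collapse to one phrase or split every word into its own phrase are both penalized, steering the policy toward phrase counts in between.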