Sequence Models - Natural Language Processing & Word Embeddings

Word Embeddings

Word Representation

  • 1-hot representation: the inner product of any two distinct one-hot vectors is \(0\), so it captures no notion of similarity between words
  • Featurized representation: word embedding, i.e. a dense vector of learned features for each word (see the sketch below)
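
To make the contrast concrete, here is a minimal numpy sketch; the 3-D "embeddings" are made-up values for illustration, not trained ones:

```python
import numpy as np

vocab = ["king", "queen", "apple"]

# 1-hot: each word is a standard basis vector, so every pairwise
# inner product between distinct words is 0 (no similarity signal).
one_hot = np.eye(len(vocab))
print(one_hot[0] @ one_hot[1])  # 0.0 for every distinct pair

# Featurized: each word is a dense vector of features, so related
# words can have a large inner product. Values are invented here.
emb = {
    "king":  np.array([0.9, 0.1, 0.0]),
    "queen": np.array([0.8, 0.2, 0.0]),
    "apple": np.array([0.0, 0.1, 0.9]),
}
print(emb["king"] @ emb["queen"])  # large: related words
print(emb["king"] @ emb["apple"])  # small: unrelated words
```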

Visualizing word embeddings

(figure: t-SNE visualization of word embeddings)

The t-SNE algorithm maps the \(300\mathrm{D}\) embeddings down to \(2\mathrm{D}\) for visualization.

In the resulting 2-D plot, concepts that feel like they should be related tend to land close together.
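
A minimal sketch of the reduction step, using scikit-learn's TSNE with a random stand-in for a trained \(50 \times 300\) embedding matrix:

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for trained embeddings: 50 words x 300 dimensions.
rng = np.random.default_rng(0)
E = rng.normal(size=(50, 300))

# t-SNE maps 300-D -> 2-D; it preserves local neighborhoods,
# so related words tend to end up near each other in the plot.
coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(E)
print(coords.shape)  # (50, 2), ready for a scatter plot
```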

Using word embeddings

Named entity recognition example

(figure: named entity recognition example)

The labeled training set for the new task can be much smaller, and the embeddings, learned from a much larger corpus, allow you to carry out transfer learning.

Transfer learning and word embeddings

  • Learn word embeddings from a large text corpus. (\(1 - 100\mathrm B\) words)

    (or download a pre-trained embedding online.)

  • Transfer embedding to new task with smaller training set.

    (say, 100k words)

  • Optional: Continue to fine-tune the word embeddings with new data. (A sketch of the whole recipe follows below.)
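
The notes don't prescribe a framework; here is one possible sketch of this recipe in PyTorch, with a random tensor standing in for a downloaded pre-trained matrix. `freeze=True` implements step 2 (keep the embeddings fixed) and un-freezing implements the optional step 3:

```python
import torch
import torch.nn as nn

# Stand-in for a pre-trained embedding matrix: 10,000 words x 300 dims
# (in practice, loaded from e.g. GloVe or word2vec files).
E = torch.randn(10_000, 300)

# Step 2: transfer the embedding into the new task's model.
# freeze=True keeps the embeddings fixed while the small labeled
# set trains only the task-specific layers on top.
embedding = nn.Embedding.from_pretrained(E, freeze=True)

# Step 3 (optional): with enough new data, un-freeze to fine-tune.
embedding.weight.requires_grad = True

word_ids = torch.tensor([42, 7, 1234])   # indices into the vocabulary
vectors = embedding(word_ids)            # shape: (3, 300)
```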

Properties of Word Embeddings

Analogies

\(\text{Man} \to \text{Woman} \;\text{as}\; \text{King} \to \;?\)

\(e_{\text{man}} - e_{\text{woman}} \approx \begin{bmatrix} -2 \\ 0 \\ 0 \\ 0 \end{bmatrix} \approx e_{\text{king}} - e_{\text{queen}}\)

\(e_? \approx e_\text{king} - e_\text{man} + e_\text{woman} \approx e_{\text{queen}}\)

Find the word \(w\) that satisfies \(\arg\max_w \text{sim}(e_w, e_\text{king} - e_\text{man} + e_\text{woman})\)

  • Cosine similarity

    \[\text{sim}(u, v) = \frac{u^{T}v}{||u||_2 ||v||_2} \]
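
Putting the analogy search and the cosine similarity together, a toy sketch in numpy; the 4-D embeddings are invented for illustration, with "gender" on the first axis:

```python
import numpy as np

def cosine_sim(u, v):
    """sim(u, v) = u^T v / (||u||_2 ||v||_2)."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def analogy(a, b, c, emb):
    """Return the w maximizing sim(e_w, e_b - e_a + e_c), excluding a, b, c."""
    target = emb[b] - emb[a] + emb[c]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine_sim(emb[w], target))

# Invented 4-D embeddings; the first component encodes gender.
emb = {
    "man":   np.array([-1.00, 0.01, 0.03, 0.09]),
    "woman": np.array([ 1.00, 0.02, 0.02, 0.01]),
    "king":  np.array([-0.95, 0.93, 0.70, 0.02]),
    "queen": np.array([ 0.97, 0.95, 0.69, 0.01]),
    "apple": np.array([ 0.00, -0.50, 0.10, 0.90]),
}
print(analogy("man", "woman", "king", emb))  # queen
```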

Embedding Matrix

(figure: embedding matrix)
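
The embedding matrix \(E\) (say \(300 \times 10{,}000\) for a 10,000-word vocabulary) stores one embedding per column, so multiplying by the one-hot vector \(o_j\) picks out column \(j\): \(E\, o_j = e_j\). A numpy sketch; the index 6257 is just an arbitrary example:

```python
import numpy as np

vocab_size, dim = 10_000, 300
E = np.random.randn(dim, vocab_size)   # embedding matrix, 300 x 10,000

j = 6257                               # hypothetical word index
o_j = np.zeros(vocab_size)
o_j[j] = 1.0                           # one-hot vector o_j

# E @ o_j selects column j of E: the embedding e_j for word j.
e_j = E @ o_j
assert np.allclose(e_j, E[:, j])
```

In practice the matrix-vector product is wasteful, so implementations use a direct lookup of column (or row) \(j\) instead of multiplying by a one-hot vector.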