CS224n learning part2

  1. How do we represent the meaning of a word?
    Definition: meaning (Webster dictionary)
  • the idea that is represented by a word, phrase, etc.
  • and so on.

The most common linguistic way of thinking of meaning:

  • signifier <====> signified (idea or thing) = denotation
  1. Main idea of word2vec
    Predict between every word and its context words.
    Two algorithms
  • Skip-grams (SG)
    • Predict context words given target (position independent)
  • Continuous Bag of Words (CBOW)
    • Predict target word from bag of words context
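To make the Skip-gram/CBOW distinction concrete, here is a minimal sketch (not the original word2vec code) of how the two algorithms carve a toy corpus into training pairs, assuming a window radius of m = 1:

```python
# Toy corpus and window radius (both made up for illustration).
corpus = "the quick brown fox".split()
m = 1

skipgram_pairs = []  # Skip-gram: (center word -> one context word) per pair
cbow_pairs = []      # CBOW: (bag of context words -> center word)
for i, center in enumerate(corpus):
    # All words within m positions of the center, excluding the center itself.
    context = [corpus[j]
               for j in range(max(0, i - m), min(len(corpus), i + m + 1))
               if j != i]
    skipgram_pairs.extend((center, c) for c in context)
    cbow_pairs.append((context, center))

print(skipgram_pairs[:3])  # [('the', 'quick'), ('quick', 'the'), ('quick', 'brown')]
print(cbow_pairs[1])       # (['the', 'brown'], 'quick')
```

Skip-gram predicts each context word independently from the center; CBOW predicts the center from the whole bag of context words at once.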

Two (moderately efficient) training methods

  • Hierarchical softmax
  • Negative sampling

Skip-gram Prediction

Details of Word2Vec
Predict surrounding words in a window of radius m of every word.
For p(w_{t+j} | w_t) the simplest first formulation is a softmax over the vocabulary:

p(o | c) = exp(u_o · v_c) / Σ_w exp(u_w · v_c)

where o is the outside (or output) word index, c is the center word index, and v_c and u_o are the "center" and "outside" vectors of indices c and o. Every word gets two vectors (one would also work, but two makes the math simpler).
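The softmax above can be sketched in a few lines of NumPy; the vocabulary of 4 words and all vector values here are made up for illustration:

```python
import numpy as np

# U holds one "outside" vector u_w per vocabulary word; v_c is the
# "center" vector of the center word c (toy values, not trained).
U = np.array([[0.1, 0.4],
              [0.3, -0.2],
              [-0.5, 0.1],
              [0.2, 0.2]])
v_c = np.array([0.25, -0.1])

scores = U @ v_c                            # u_w . v_c for every word w
p = np.exp(scores) / np.exp(scores).sum()   # softmax: p(o | c) for each o

print(p, p.sum())  # a probability distribution summing to 1
```

The denominator sums over the whole vocabulary, which is exactly the cost that hierarchical softmax and negative sampling are designed to avoid.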

Sentence embedding
Compute sentence similarity using the inner product.

S1: Mexico wishes to guarantee citizen's safety.
S2: Mexico wishes to avoid more violence.
Score: 4/5

S1: Iranians Vote in Presidential Election
S2: Keita Wins Mali Presidential Election
Score: 0.4/5

Use as features for sentence classification, e.g. semantic sentiment analysis.
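The inner-product scoring above can be illustrated with three hypothetical sentence embeddings (the 3-d vectors are made up, not real word2vec output):

```python
import numpy as np

# Made-up sentence embeddings standing in for the example sentences.
s1 = np.array([0.8, 0.1, 0.3])   # e.g. "Mexico wishes to guarantee citizen's safety."
s2 = np.array([0.7, 0.2, 0.4])   # e.g. "Mexico wishes to avoid more violence."
s3 = np.array([-0.2, 0.9, 0.1])  # an unrelated sentence

# Higher inner product = more similar sentences.
print(float(s1 @ s2), float(s1 @ s3))
print(float(s1 @ s2) > float(s1 @ s3))  # True: s1 is closer to s2
```

In practice the inner product is often normalized by the vector lengths (cosine similarity) so that sentence length does not dominate the score.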

From Bag-of-words to Complex models

  • Bag-of-words BoW
    v("natural language processing") = 1/3(v("natural") + v("language") + v("processing"))
  • Recurrent neural networks, recursive neural networks, convolutional neural networks…
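The bag-of-words formula above is just an average of word vectors; a minimal sketch with made-up 2-d embeddings:

```python
import numpy as np

# Toy word vectors (made up for illustration, not trained embeddings).
emb = {
    "natural":    np.array([1.0, 0.0]),
    "language":   np.array([0.0, 1.0]),
    "processing": np.array([0.5, 0.5]),
}

def bow_embedding(sentence):
    # v(sentence) = average of the word vectors, as in the formula above.
    return np.mean([emb[w] for w in sentence.split()], axis=0)

v = bow_embedding("natural language processing")
print(v)  # [0.5 0.5] = 1/3 * (v("natural") + v("language") + v("processing"))
```

BoW ignores word order entirely, which is the limitation the recurrent, recursive, and convolutional models listed above address.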

CS224n learning part1

This is the introduction lecture of NLP (Natural Language Processing), an introduction to NLP and deep learning.

What’s deep learning?

Deep learning is a subfield of machine learning.

Most machine learning methods work well because of human-designed representations and input features. Machine learning then becomes just optimizing weights to best make a final prediction.

Representation learning attempts to automatically learn good features or representations.

Deep learning algorithms attempt to learn (multiple levels of) representation and an output.

Deep NLP = Deep Learning + NLP
Combine the ideas and goals of NLP with representation learning and deep learning methods to solve them.

Several big improvements in NLP in recent years across different:

  • levels: speech, words, syntax, semantics.
  • tools: parts of speech, entities, parsing.
  • applications: machine translation, sentiment analysis, dialogue agents, question answering.

Conclusion: Representation for all levels? Vectors