English syntax (part two)

Context-free grammars (CFGs)

Context-free grammars are a simple model of syntax that lets us describe some of the syntactic structures of English. They are also frequently used to describe the syntax of programming languages, where the notation is known as Backus-Naur form (BNF).

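One way to understand this: "context-free" means that a sentence only has to conform to the structural rules of the grammar; the influence of the surrounding context can be set aside.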

eg. I run lake.

  • Informal description of the definition
  1. Terminals: English words.
  2. Nonterminals: categories of constituents.
  3. Rules: how certain constituents are put together to form bigger constituents.
  4. Start symbol: S is a special nonterminal that represents a complete string, i.e. a sentence.
  • Example CFG
  1. S ⇒ NP VP : I want a morning flight.
  2. NP ⇒ Pronoun | ProperNoun | Det Nominal : I | Boston | a flight
  3. Nominal ⇒ Nominal Noun | Noun : morning flight | flights
  4. VP ⇒ Verb | Verb NP | VP PP : want a flight | leave Boston at night | leaving on Thursday
  5. PP ⇒ Preposition NP : from Boston
  6. Pronoun ⇒ I | you | he
  7. ProperNoun ⇒ Boston | Paris
  8. Det ⇒ the | a | an
  • The arrow ⇒ can be read as “can be expanded as”, or “the left-hand side (LHS) can be rewritten as the right-hand side (RHS)”.

  • The vertical bar “|” separates alternatives.

  • The recursion in the grammar

  1. Part of a Nominal can itself be a Nominal, and this is a form of direct recursion.
  2. Recursion can also be indirect.
    e.g. In an extended CFG, a VP could be part of an NP, while an NP could be part of a VP.
  • In the lexical rules (6 to 8 above), each RHS alternative consists of a single terminal (i.e. a word).
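The informal definition above can be written down as a small data structure. The following is a minimal sketch, not a standard representation: the dictionary layout and the names `GRAMMAR`, `terminals` and `directly_recursive` are choices made for this illustration, and the lexical rules for Noun, Verb and Preposition are assumptions added because the notes do not list them.

```python
# Each nonterminal (LHS) maps to its list of alternatives; each alternative (RHS) is a tuple.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Pronoun",), ("ProperNoun",), ("Det", "Nominal")],
    "Nominal": [("Nominal", "Noun"), ("Noun",)],
    "VP": [("Verb",), ("Verb", "NP"), ("VP", "PP")],
    "PP": [("Preposition", "NP")],
    "Pronoun": [("I",), ("you",), ("he",)],
    "ProperNoun": [("Boston",), ("Paris",)],
    "Det": [("the",), ("a",), ("an",)],
    # Assumed lexical rules (not in the notes), so that every category expands to words.
    "Noun": [("flight",), ("flights",), ("morning",)],
    "Verb": [("want",), ("prefer",), ("leave",)],
    "Preposition": [("from",), ("on",), ("at",)],
}
START = "S"

# Nonterminals are the symbols that have rules; terminals are all remaining symbols.
nonterminals = set(GRAMMAR)
terminals = {sym for alts in GRAMMAR.values() for rhs in alts for sym in rhs} - nonterminals

# Direct recursion: the LHS occurs in one of its own RHS alternatives.
directly_recursive = {lhs for lhs, alts in GRAMMAR.items() if any(lhs in rhs for rhs in alts)}

print(sorted(directly_recursive))  # ['Nominal', 'VP']
```

With this layout the vertical bar of the notes corresponds to the list of alternatives, and the check at the end finds the two directly recursive categories of the example grammar.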

Some sentence types in English

  • Declaratives: A plane left : S ⇒ NP VP
  • Imperatives : Leave! : S ⇒ VP
  • Yes/No questions : Did the plane leave? : S ⇒ Aux NP VP
  • Wh subject questions : Which flights serve breakfast? : S ⇒ Wh-NP VP
  • Wh non-subject questions : Which flight did you book? : S ⇒ Wh-NP Aux NP VP

Meaning (applications) of CFGs

  • Generating strings
  • Accepting/rejecting strings
  • Assigning structure to accepted strings
    The third is called parsing: taking a string (and a grammar) and computing the structure of the string according to the grammar. This structure is called the parse tree, or simply the parse.
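As an illustration of the second use, accepting or rejecting strings, here is a minimal recognizer sketch. It is one possible approach (a memoized search over word spans), not the method of the slides. The function name `accepts` is hypothetical, and the Noun, Verb and Preposition rules are assumed lexical entries that the notes do not list.

```python
from functools import lru_cache

GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Pronoun",), ("ProperNoun",), ("Det", "Nominal")],
    "Nominal": [("Nominal", "Noun"), ("Noun",)],
    "VP": [("Verb",), ("Verb", "NP"), ("VP", "PP")],
    "PP": [("Preposition", "NP")],
    "Pronoun": [("I",), ("you",), ("he",)],
    "ProperNoun": [("Boston",), ("Paris",)],
    "Det": [("the",), ("a",), ("an",)],
    # Assumed lexical rules, not part of the notes.
    "Noun": [("flight",), ("flights",), ("morning",)],
    "Verb": [("want",), ("prefer",), ("leave",)],
    "Preposition": [("from",), ("on",), ("at",)],
}


def accepts(sentence, start="S"):
    """Return True if the grammar can derive the whitespace-separated sentence."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Can `symbol` derive exactly the non-empty span words[i:j]?
        if symbol not in GRAMMAR:  # terminal: must match a single word
            return j == i + 1 and words[i] == symbol
        return any(matches(rhs, i, j) for rhs in GRAMMAR[symbol])

    def matches(rhs, i, j):
        # Can the symbol sequence `rhs` derive words[i:j]?
        if len(rhs) == 1:
            return derives(rhs[0], i, j)
        # No rule derives the empty string, so each symbol covers at least one word.
        # Splits therefore always shrink the span, which keeps left-recursive
        # rules such as Nominal -> Nominal Noun from looping.
        return any(
            derives(rhs[0], i, k) and matches(rhs[1:], k, j)
            for k in range(i + 1, j)
        )

    return len(words) > 0 and derives(start, 0, len(words))


print(accepts("I want a morning flight"))  # True
print(accepts("flight a want I"))          # False
```

The recognizer only answers yes or no; a parser would additionally record which rules and split points succeeded.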

Derivations

A derivation is a sequence of rule applications, starting from the start symbol (normally named S), used to derive a sentence (a string of terminals).
eg.
S ⇒ NP VP ⇒ Pronoun VP ⇒ I VP ⇒ I Verb NP ⇒
I prefer NP ⇒
I prefer Det Nominal ⇒
I prefer a Nominal ⇒
I prefer a Nominal Noun ⇒ I prefer a Noun Noun ⇒
I prefer a morning Noun ⇒ I prefer a morning flight

Grammaticality

  • If a sentence has at least one derivation, it is said to be grammatical.
  • The set of sentences that can be derived by a given CFG is called a context-free language.
  • English and other natural languages are too intricate for this: there is no CFG that generates all and only the sentences of English. But insofar as English (or any natural language) can be described by a formal grammar, it seems to be roughly context-free.

Spoken language

  • It is difficult to capture informal spoken language, due to speech disfluencies, i.e. phenomena such as repairs, use of fillers (eg. uh) etc.
    eg. He was wearing a black — uh, I mean a blue, a blue shirt.
  • We regard such issues as mostly separate from the syntax of written language.

Left-most derivations

  • There may be several derivations of the same sentence that differ only in the order in which nonterminals are rewritten. We can avoid this redundancy by requiring that it is always the left-most nonterminal that is rewritten.
    eg. NP VP ⇒ Pronoun VP ⇒ Pronoun Verb NP
  • When a derivation is left-most, we write $\Rightarrow_{LM}$.
  • A left-most derivation of input $w$ using the rule sequence $d = \pi_1 \cdots \pi_m$, where $\pi_i = (A_i \to \alpha_i)$, is denoted $S \Rightarrow^{d}_{LM} w$.
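A left-most derivation can be replayed mechanically from its rule sequence d. The sketch below is an illustration under assumptions: the function name `leftmost_derivation` and the list-of-pairs encoding of d are hypothetical, and the rule sequence is the one behind the "I prefer a morning flight" derivation in the Derivations section.

```python
NONTERMINALS = {
    "S", "NP", "VP", "PP", "Nominal",
    "Pronoun", "ProperNoun", "Det", "Noun", "Verb", "Preposition",
}


def leftmost_derivation(rules, start="S"):
    """Apply each rule (lhs, rhs) to the left-most nonterminal; return every sentential form."""
    form = [start]
    steps = [" ".join(form)]
    for lhs, rhs in rules:
        # Find the position of the left-most nonterminal, if there is one.
        pos = next((i for i, sym in enumerate(form) if sym in NONTERMINALS), None)
        if pos is None:
            raise ValueError("no nonterminal left to rewrite")
        if form[pos] != lhs:
            raise ValueError(f"left-most nonterminal is {form[pos]}, not {lhs}")
        form[pos:pos + 1] = rhs  # replace the nonterminal by the rule's RHS
        steps.append(" ".join(form))
    return steps


# d = pi_1 ... pi_m for "I prefer a morning flight"
d = [
    ("S", ["NP", "VP"]),
    ("NP", ["Pronoun"]),
    ("Pronoun", ["I"]),
    ("VP", ["Verb", "NP"]),
    ("Verb", ["prefer"]),
    ("NP", ["Det", "Nominal"]),
    ("Det", ["a"]),
    ("Nominal", ["Nominal", "Noun"]),
    ("Nominal", ["Noun"]),
    ("Noun", ["morning"]),
    ("Noun", ["flight"]),
]

steps = leftmost_derivation(d)
print(" ⇒ ".join(steps))
```

Because the position to rewrite is always determined, the rule sequence alone fixes the whole derivation. Supplying the rules in any other order raises an error here.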

Parse trees

Parse trees and left-most derivations are closely related concepts: each parse tree corresponds to exactly one left-most derivation, so the two terms are sometimes used interchangeably.
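The correspondence between a left-most derivation and a parse tree can be made concrete: reading the rules of a left-most derivation in order is the same as visiting the tree top-down, left to right. The sketch below rebuilds a tree from such a rule sequence. The names `build_tree` and `leaves`, and the `(label, children)` tuple encoding of a tree, are assumptions made for this illustration.

```python
NONTERMINALS = {
    "S", "NP", "VP", "PP", "Nominal",
    "Pronoun", "ProperNoun", "Det", "Noun", "Verb", "Preposition",
}


def build_tree(rules, start="S"):
    """Rebuild the parse tree from a left-most rule sequence [(lhs, rhs), ...]."""
    remaining = iter(rules)

    def expand(symbol):
        if symbol not in NONTERMINALS:
            return symbol  # a terminal becomes a leaf
        lhs, rhs = next(remaining)  # next unused rule must rewrite this symbol
        if lhs != symbol:
            raise ValueError(f"expected a rule for {symbol}, got one for {lhs}")
        # Children are expanded left to right, matching left-most rewriting.
        return (symbol, [expand(child) for child in rhs])

    return expand(start)


def leaves(tree):
    """Return the words at the leaves of the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [word for child in children for word in leaves(child)]


d = [
    ("S", ["NP", "VP"]),
    ("NP", ["Pronoun"]),
    ("Pronoun", ["I"]),
    ("VP", ["Verb", "NP"]),
    ("Verb", ["prefer"]),
    ("NP", ["Det", "Nominal"]),
    ("Det", ["a"]),
    ("Nominal", ["Nominal", "Noun"]),
    ("Nominal", ["Noun"]),
    ("Noun", ["morning"]),
    ("Noun", ["flight"]),
]

tree = build_tree(d)
print(leaves(tree))  # ['I', 'prefer', 'a', 'morning', 'flight']
```

The leaves of the tree, read left to right, give back the derived sentence.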

Bracketed notation

  • Bracketed notation is a linear notation that marks which sequences of words form a constituent of which category.
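As a small illustration, a parse tree can be turned into bracketed notation by a recursive walk. This is a sketch under assumptions: the `(label, children)` tuple encoding and the function name `bracket` are hypothetical, and the tree is the one for "I prefer a morning flight" according to the example grammar.

```python
def bracket(tree):
    """Render a (label, children) tree as a bracketed string; leaves are plain words."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    return "[" + label + " " + " ".join(bracket(child) for child in children) + "]"


tree = (
    "S",
    [
        ("NP", [("Pronoun", ["I"])]),
        (
            "VP",
            [
                ("Verb", ["prefer"]),
                (
                    "NP",
                    [
                        ("Det", ["a"]),
                        (
                            "Nominal",
                            [
                                ("Nominal", [("Noun", ["morning"])]),
                                ("Noun", ["flight"]),
                            ],
                        ),
                    ],
                ),
            ],
        ),
    ],
)

print(bracket(tree))
# [S [NP [Pronoun I]] [VP [Verb prefer] [NP [Det a] [Nominal [Nominal [Noun morning]] [Noun flight]]]]]
```

Each pair of brackets encloses one constituent, with the category written directly after the opening bracket.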

References

The NLP slides of University of St Andrews
