# Coreference Resolution
"Niall Ferguson is prolific, well-paid and a snappy dresser. Stephen Moss hated him - at least until he spent an hour being charmed in the historian's Oxford study."
**referent**
- a real-world entity that some piece of text (or speech) refers to. In the example above, the actual professor is the referent.
**referring expressions**
- bits of language a speaker uses to perform reference. In the example: "Niall Ferguson", "he", "him".
**antecedent**
- the text that initially evokes a referent. In the example: "Niall Ferguson".
**anaphora**
- the phenomenon of referring to an antecedent.
**cataphora**
- the pronoun appears before the expression that identifies its referent (rare): "Since she lost her dog, Kim bought another."
## Pronoun resolution
Identifying the referents of pronouns.
Anaphora resolution: generally only consider cases which refer to antecedent noun phrases.
## Algorithms for coreference resolution
Usually framed as a supervised classification problem:
- instances: potential pronoun/antecedent pairings
- class is TRUE/FALSE
- training data labelled with correct pairings
- candidate antecedents are all NPs in the current sentence and the preceding 5 sentences (excluding pleonastic pronouns)
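
The instance-generation step above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `Mention` class, its fields and the `candidate_pairs` function are hypothetical names, and the mentions are assumed to have been identified and annotated by hand or by an earlier pipeline stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A noun phrase or pronoun (hypothetical structure for illustration)."""
    text: str
    sentence: int               # index of the sentence containing the mention
    is_pronoun: bool = False
    is_pleonastic: bool = False


def candidate_pairs(mentions, window=5):
    """Pair each referring pronoun with every NP in its own sentence
    and in the `window` preceding sentences."""
    pairs = []
    for pronoun in mentions:
        if not pronoun.is_pronoun or pronoun.is_pleonastic:
            continue
        for candidate in mentions:
            if candidate is pronoun or candidate.is_pleonastic:
                continue
            distance = pronoun.sentence - candidate.sentence
            if 0 <= distance <= window:
                pairs.append((pronoun, candidate))
    return pairs


# Hand-annotated mentions from the Niall Ferguson example.
mentions = [
    Mention("Niall Ferguson", 0),
    Mention("Stephen Moss", 1),
    Mention("him", 1, is_pronoun=True),
    Mention("he", 1, is_pronoun=True),
    Mention("the historian", 1),
]

for pronoun, candidate in candidate_pairs(mentions):
    print(pronoun.text, "<-", candidate.text)
```

Each pair would then be labelled TRUE or FALSE in the training data. Candidates that follow the pronoun in the same sentence are kept here so that cataphoric pairings remain possible.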
### Hard constraints: Pronoun agreement
- A little girl is at the door - see what she wants, please?
- My dog has hurt his foot - he is in a lot of pain.
- My dog has hurt his foot - it is in a lot of pain.
Complications:
- I don't know who the new lecturer will be, but I'm sure they'll make changes to the course.
- The team played really well, but now they are all very tired.
- Kim and Sandy are asleep: they are very tired.
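
One way to handle both the simple cases and the complications is to treat number and gender as sets of possible values and require a non-empty overlap. The sketch below assumes a tiny hand-written pronoun lexicon (`PRONOUNS`) and hand-assigned antecedent values; both are illustrative assumptions rather than a complete resource.

```python
# Possible number/gender values for each pronoun (illustrative lexicon).
PRONOUNS = {
    "he":   {"number": {"sg"}, "gender": {"masc"}},
    "she":  {"number": {"sg"}, "gender": {"fem"}},
    "it":   {"number": {"sg"}, "gender": {"neut"}},
    # "they" covers plural antecedents and singular ones of unknown gender.
    "they": {"number": {"sg", "pl"}, "gender": {"masc", "fem", "neut"}},
}


def agrees(pronoun, antecedent_number, antecedent_gender):
    """True if the pronoun is compatible with the antecedent.

    `antecedent_number` and `antecedent_gender` are sets of possible values,
    so an underspecified antecedent simply lists several options.
    """
    entry = PRONOUNS[pronoun.lower()]
    return bool(entry["number"] & antecedent_number) and bool(
        entry["gender"] & antecedent_gender
    )


ANY_GENDER = {"masc", "fem", "neut"}

print(agrees("she", {"sg"}, {"fem"}))             # a little girl
print(agrees("he", {"sg"}, ANY_GENDER))           # my dog
print(agrees("it", {"sg"}, ANY_GENDER))           # my dog
print(agrees("they", {"sg"}, {"masc", "fem"}))    # the new lecturer
print(agrees("they", {"sg", "pl"}, {"neut"}))     # the team
print(agrees("he", {"pl"}, {"masc", "fem"}))      # Kim and Sandy
```

Under these assumptions only the last call is rejected: a coordinated antecedent is plural, so a singular pronoun cannot pick it up.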
### Hard constraints: Reflexives
- John$_i$ cut himself$_i$ shaving. (himself = John; the subscripts mark coreference)
- \# John$_i$ cut him$_j$ shaving. ($i \neq j$: a very odd sentence)
Reflexive pronouns must be coreferential with a preceding argument of the same verb; non-reflexive pronouns cannot be.
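
As a filter on candidate pairs, this constraint can be sketched as below. The function name and the two boolean inputs are assumptions for illustration; in practice they would come from a parser that identifies the arguments of each verb.

```python
REFLEXIVES = {
    "myself", "yourself", "himself", "herself", "itself",
    "ourselves", "yourselves", "themselves",
}


def satisfies_reflexive_constraint(pronoun, same_verb, candidate_precedes):
    """Check the reflexive constraint for one pronoun/candidate pair.

    same_verb: the pronoun and the candidate are arguments of the same verb.
    candidate_precedes: the candidate comes before the pronoun.
    """
    coargument = same_verb and candidate_precedes
    if pronoun.lower() in REFLEXIVES:
        return coargument       # a reflexive needs a preceding co-argument
    return not coargument       # a plain pronoun must avoid one


# "John cut himself shaving." / "John cut him shaving."
print(satisfies_reflexive_constraint("himself", True, True))
print(satisfies_reflexive_constraint("him", True, True))
```

The first call accepts John as the antecedent of "himself"; the second rejects John as the antecedent of "him".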
### Hard constraints: Pleonastic pronouns
Pleonastic pronouns are semantically empty, and don't refer:
- It is snowing
- It is not easy to think of good examples.
- It is obvious that Kim snores.
- It bothers Sandy that Kim snores.
### Soft preferences: Salience
- Recency: More recent antecedents are preferred. They are more accessible. "Kim has a big car. Sandy has a smaller one. Lee likes to drive it."
- Grammatical role: Subjects > objects > everything else: "Fred went to the shopping centre with Bill. He bought a CD."
- Repeated mention: Entities that have been mentioned more frequently are preferred.
- Parallelism: Entities which share the same role as the pronoun in the same sort of sentence are preferred: "Bill went with Fred to the lecture. Kim went with him to the bar." (him = Fred)
- Coherence effects: The pronoun resolution may depend on the rhetorical / discourse relation that is inferred. "Bill likes Fred. He has a great sense of humour."
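
A hand-weighted salience score gives a feel for how these preferences combine. The weights and the dictionary layout below are arbitrary assumptions chosen for illustration, and coherence effects are not modelled at all.

```python
# Illustrative weights only: subjects > objects > everything else.
ROLE_WEIGHT = {"subject": 3.0, "object": 2.0, "other": 1.0}


def salience(candidate, pronoun_role):
    """Combine the soft preferences into one score (higher is better)."""
    score = ROLE_WEIGHT[candidate["role"]]          # grammatical role
    score -= 1.0 * candidate["sentence_distance"]   # recency
    score += 0.5 * (candidate["mentions"] - 1)      # repeated mention
    if candidate["role"] == pronoun_role:           # parallelism
        score += 1.0
    return score


# "Fred went to the shopping centre with Bill. He bought a CD."
candidates = [
    {"text": "Fred", "role": "subject", "sentence_distance": 1, "mentions": 1},
    {"text": "Bill", "role": "other", "sentence_distance": 1, "mentions": 1},
]

best = max(candidates, key=lambda c: salience(c, pronoun_role="subject"))
print(best["text"])
```

With these weights Fred scores 3.0 and Bill 0.0, so the subject is preferred as the antecedent of "He". A supervised classifier learns such weights from data instead of having them fixed by hand.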
### Features
- Cataphoric (binary): true if the pronoun comes before the antecedent.
- Number agreement (binary): true if the pronoun is compatible in number with the antecedent.
- Gender agreement (binary): true if the pronoun and the antecedent agree in gender.
- Same verb (binary): true if the pronoun and the candidate antecedent are arguments of the same verb.
- Sentence distance (discrete): {0, 1, 2, 3, ...}
- Grammatical role (discrete): {subject, object, other}. The role of the potential antecedent.
- Parallel (binary): true if the potential antecedent and the pronoun share the same grammatical role.
- Linguistic form (discrete): {proper, definite, indefinite, pronoun}
Niall Ferguson is prolific, well-paid and a snappy dresser. Stephen Moss hated him - at least until he spent an hour being charmed in the historian's Oxford study.
![[Pasted image 20201206141008.png]]
Any classifier can then be applied to the resulting feature vectors, e.g. an SVM or a random forest.
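
To make the feature list concrete, the sketch below computes the eight features for one pronoun/candidate pair. The dictionary keys, token positions and annotations are hand-assigned assumptions for illustration; a real system would derive them from a parser.

```python
def pair_features(pronoun, candidate):
    """Build the feature dictionary for one pronoun/candidate pair."""
    return {
        "cataphoric": pronoun["position"] < candidate["position"],
        "number_agreement": bool(pronoun["number"] & candidate["number"]),
        "gender_agreement": bool(pronoun["gender"] & candidate["gender"]),
        "same_verb": pronoun["verb"] is not None
        and pronoun["verb"] == candidate["verb"],
        "sentence_distance": abs(pronoun["sentence"] - candidate["sentence"]),
        "grammatical_role": candidate["role"],
        "parallel": pronoun["role"] == candidate["role"],
        "linguistic_form": candidate["form"],
    }


# Hand annotations for "him" and two candidates from the example text.
him = {"position": 12, "sentence": 1, "number": {"sg"}, "gender": {"masc"},
       "verb": "hated", "role": "object"}
ferguson = {"position": 0, "sentence": 0, "number": {"sg"}, "gender": {"masc"},
            "verb": "is", "role": "subject", "form": "proper"}
moss = {"position": 10, "sentence": 1, "number": {"sg"}, "gender": {"masc"},
        "verb": "hated", "role": "subject", "form": "proper"}

print(pair_features(him, ferguson))
print(pair_features(him, moss))
```

The two candidates differ in `same_verb` and `sentence_distance`: "Stephen Moss" is an argument of the same verb as "him", which is exactly the configuration the reflexive constraint rules out for a non-reflexive pronoun.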
### Problems with simple classification model
- Cannot implement 'repeated mention' effect.
- Cannot use information from previous links.
The task is not really pairwise: it needs a discourse model in which real-world entities correspond to clusters of referring expressions.
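
A simple step in that direction is to merge the positive pairwise links into clusters by taking their transitive closure. The union-find sketch below is one possible post-processing step, not the method described in these notes. The links are written by hand, and mentions are identified by their text, which assumes every mention string is unique.

```python
def cluster_mentions(mentions, links):
    """Group mentions into entity clusters from pairwise coreference links."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        parent[find(a)] = find(b)

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), []).append(m)
    return list(clusters.values())


mentions = ["Niall Ferguson", "Stephen Moss", "him", "he", "the historian"]
# Pairs assumed to have been classified TRUE.
links = [
    ("him", "Niall Ferguson"),
    ("the historian", "Niall Ferguson"),
    ("he", "Stephen Moss"),
]

for cluster in cluster_mentions(mentions, links):
    print(cluster)
```

This produces one cluster per entity, but it still makes each pairwise decision independently, so a single wrong link can merge two entities.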
End-to-end solution: [[Neural end-to-end coreference resolution]]