# Coreference Resolution

"Niall Ferguson is prolific, well-paid and a snappy dresser. Stephen Moss hated him - at least until he spent an hour being charmed in the historian's Oxford study."

**referent** - a real-world entity that some piece of text (or speech) refers to. In the example above, the actual professor is the referent.

**referring expressions** - bits of language used by a speaker to perform reference. In the example: "Niall Ferguson", "he", "him".

**antecedent** - the text that initially evokes a referent. In the example: "Niall Ferguson".

**anaphora** - the phenomenon of referring back to an antecedent.

**cataphora** - a pronoun appears before the expression that introduces its referent (rare): "Since she lost her dog, Kim bought another."

## Pronoun resolution

Identifying the referents of pronouns.

Anaphora resolution: generally only considers cases that refer to antecedent noun phrases.

## Algorithms for coreference resolution

Usually solved as a supervised classification problem:

- instances: potential pronoun/antecedent pairings
- class: TRUE/FALSE
- training data: labelled with the correct pairings
- candidate antecedents: all NPs in the current sentence and the preceding 5 sentences (excluding pleonastic pronouns)

### Hard constraints: Pronoun agreement

- A little girl is at the door - see what she wants, please?
- My dog has hurt his foot - he is in a lot of pain.
- My dog has hurt his foot - it is in a lot of pain.

Complications:

- I don't know who the new lecturer will be, but I'm sure they'll make changes to the course.
- The team played really well, but now they are all very tired.
- Kim and Sandy are asleep: they are very tired.

### Hard constraints: Reflexives

- John$_i$ cut himself$_i$ shaving. (himself = John; the subscripts indicate coreference)
- $\#$ John$_i$ cut him$_j$ shaving. ($i \neq j$: a very odd sentence)

Reflexive pronouns must be coreferential with a preceding argument of the same verb; non-reflexive pronouns cannot be.

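
The candidate window and the two hard constraints above (agreement and reflexives) can be sketched as a filter that runs before any classifier sees a pronoun/antecedent pair. This is a minimal illustration, not a reference implementation: the `Mention` class, its fields, and the function names are hypothetical, and it assumes that mentions arrive already annotated with number, gender and a document-unique verb id, and that pleonastic pronouns have been removed upstream. Unknown number or gender (`None`) is treated as compatible, which is one simple way to leave room for the "they" complications.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """Hypothetical container for a noun phrase or pronoun."""
    text: str
    sentence: int                  # index of the containing sentence
    position: int                  # token offset within that sentence
    number: Optional[str] = None   # "sg", "pl", or None if unknown
    gender: Optional[str] = None   # "f", "m", "n", or None if unknown
    verb: Optional[int] = None     # document-unique id of the governing verb
    reflexive: bool = False        # True for himself, herself, ...


def agrees(pronoun: Mention, candidate: Mention) -> bool:
    """Agreement constraint: known number and gender values must match."""
    for attr in ("number", "gender"):
        p, c = getattr(pronoun, attr), getattr(candidate, attr)
        if p is not None and c is not None and p != c:
            return False
    return True


def reflexive_ok(pronoun: Mention, candidate: Mention) -> bool:
    """Reflexives need a preceding argument of the same verb;
    non-reflexive pronouns must not corefer with one."""
    same_verb = pronoun.verb is not None and pronoun.verb == candidate.verb
    if pronoun.reflexive:
        precedes = (candidate.sentence, candidate.position) < (
            pronoun.sentence, pronoun.position)
        return same_verb and precedes
    return not same_verb


def candidate_antecedents(pronoun: Mention, mentions: List[Mention],
                          window: int = 5) -> List[Mention]:
    """Mentions in the current sentence and the preceding `window`
    sentences that survive both hard constraints."""
    kept = []
    for m in mentions:
        if m is pronoun:
            continue
        distance = pronoun.sentence - m.sentence
        if not 0 <= distance <= window:
            continue
        if agrees(pronoun, m) and reflexive_ok(pronoun, m):
            kept.append(m)
    return kept


# "John cut himself shaving." versus "John cut him shaving."
john = Mention("John", 0, 0, "sg", "m", verb=1)
himself = Mention("himself", 0, 2, "sg", "m", verb=1, reflexive=True)
him = Mention("him", 0, 2, "sg", "m", verb=1)

print([m.text for m in candidate_antecedents(himself, [john, himself])])  # ['John']
print([m.text for m in candidate_antecedents(him, [john, him])])          # []
```

Each surviving (pronoun, candidate) pair would then become one TRUE/FALSE training or test instance. Filtering first keeps the classifier from having to learn constraints that are effectively categorical.
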
### Hard constraints: Pleonastic pronouns

Pleonastic pronouns are semantically empty and don't refer:

- It is snowing.
- It is not easy to think of good examples.
- It is obvious that Kim snores.
- It bothers Sandy that Kim snores.

### Soft preferences: Salience

- Recency: More recent antecedents are preferred. They are more accessible. "Kim has a big car. Sandy has a smaller one. Lee likes to drive it."
- Grammatical role: Subjects > objects > everything else. "Fred went to the shopping centre with Bill. He bought a CD."
- Repeated mention: Entities that have been mentioned more frequently are preferred.
- Parallelism: Entities which share the same role as the pronoun in the same sort of sentence are preferred. "Bill went with Fred to the lecture. Kim went with him to the bar." (him = Fred)
- Coherence effects: The pronoun resolution may depend on the rhetorical / discourse relation that is inferred. "Bill likes Fred. He has a great sense of humour."

### Features

- **Cataphoric** - Binary: true if the pronoun comes before the antecedent.
- **Number agreement** - Binary: true if the pronoun is compatible in number with the antecedent.
- **Gender agreement** - Binary: true if the pronoun and the antecedent agree in gender.
- **Same verb** - Binary: true if the pronoun and the candidate antecedent are arguments of the same verb.
- **Sentence distance** - Discrete: {0, 1, 2, 3, ...}
- **Grammatical role** - Discrete: {subject, object, other}. The role of the potential antecedent.
- **Parallel** - Binary: true if the potential antecedent and the pronoun share the same grammatical role.
- **Linguistic form** - Discrete: {proper, definite, indefinite, pronoun}

"Niall Ferguson is prolific, well-paid and a snappy dresser. Stephen Moss hated him - at least until he spent an hour being charmed in the historian's Oxford study."

![[Pasted image 20201206141008.png]]

Apply any classifier, e.g. an SVM or a random forest.

### Problems with simple classification model

- Cannot implement the 'repeated mention' effect.
- Cannot use information from previous links.
- Not really pairwise: need a discourse model with real-world entities corresponding to clusters of referring expressions.

End-to-end solution: [[Neural end-to-end coreference resolution]]

---

## References