# Word Sense Disambiguation
Word meaning is needed for many applications, and the intended sense of a word depends on its context, for example: striped bass (a fish) vs. bass guitar (an instrument).
There are three main approaches:
1. Supervised learning
- Assume a predefined set of word senses, e.g. WordNet
- Need a large sense-tagged training corpus (difficult to construct)
- Don't really learn generalisations: a separate model is trained per word (see the sketch after this list)
2. Semi-supervised learning (Yarowsky, 1995)
- Bootstrap from a few hand-labelled seed examples
3. Unsupervised sense induction
- No pre-annotated senses are assumed; patterns of contextual use are learnt by clustering the contexts in which a word occurs
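For the supervised route, a minimal sketch (assuming scikit-learn; the tiny sense-tagged examples and the 'bass:fish'/'bass:music' sense labels are hypothetical stand-ins for a WordNet-annotated corpus):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical sense-tagged contexts for one target word, 'bass'
# (in practice drawn from a large hand-annotated corpus).
train_texts = [
    "caught a striped bass in the lake",
    "grilled the bass with lemon for dinner",
    "played the bass guitar on stage",
    "turned up the bass on the amplifier",
]
train_senses = ["bass:fish", "bass:fish", "bass:music", "bass:music"]

# One classifier per ambiguous word: bag-of-words context -> sense label.
# This per-word setup is why such systems generalise poorly across words.
clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(train_texts, train_senses)

print(clf.predict(["thumping bass from the amplifier"]))  # expect 'bass:music'
```

Note that the classifier is specific to 'bass'; every ambiguous word needs its own labelled data and its own model.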
## WSD by semi-supervised learning
An early approach, based on Yarowsky, David (1995), "Unsupervised Word Sense Disambiguation Rivaling Supervised Methods".
Example: disambiguating 'plant' (factory vs. vegetation)
1. Find contexts in the training corpus, i.e. examples containing the word 'plant'
2. Identify some seeds that disambiguate a few uses, e.g. 'plant life' (Sense A) vs. 'manufacturing plant' (Sense B); the seeds are picked by a human researcher
3. Train a decision-list classifier on the Sense A/Sense B examples, ranking each feature $f_i$ by its log-likelihood ratio: $\log\left(\frac{P(\text{Sense}_A \mid f_i)}{P(\text{Sense}_B \mid f_i)}\right)$
4. Apply the classifier to the training set and add the reliably classified examples to the A and B sets
5. Iterate steps 3 and 4 until convergence
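A minimal sketch of this bootstrapping loop in Python (the window-based bag-of-words features, the smoothing constant and the confidence threshold are illustrative assumptions, not Yarowsky's exact choices):

```python
import math
from collections import Counter

def context_features(sentence, target="plant", window=3):
    """Bag-of-words features within a small window of the target word."""
    tokens = sentence.lower().split()
    feats = set()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            feats.update(t for t in tokens[lo:hi] if t != target)
    return feats

def train_decision_list(labelled, alpha=0.1):
    """Rank features by the (smoothed) log-likelihood ratio
    log(P(Sense_A | f) / P(Sense_B | f)): the sign picks the sense,
    the magnitude gives the rule's rank in the decision list."""
    counts = {"A": Counter(), "B": Counter()}
    for sentence, sense in labelled:
        counts[sense].update(context_features(sentence))
    rules = []
    for f in set(counts["A"]) | set(counts["B"]):
        llr = math.log((counts["A"][f] + alpha) / (counts["B"][f] + alpha))
        rules.append((abs(llr), f, "A" if llr > 0 else "B"))
    return sorted(rules, reverse=True)

def classify(sentence, rules, threshold=1.0):
    """Apply the highest-ranked matching rule; abstain if its score
    falls below the reliability threshold."""
    feats = context_features(sentence)
    for score, f, sense in rules:
        if f in feats:
            return sense if score >= threshold else None
    return None

def bootstrap(seeds, unlabelled, max_iters=10):
    """Steps 3-5: train, label the reliable examples, retrain, repeat."""
    labelled = list(seeds)  # step 2: a few hand-picked (sentence, sense) pairs
    for _ in range(max_iters):
        rules = train_decision_list(labelled)        # step 3
        remaining = []
        for sentence in unlabelled:
            sense = classify(sentence, rules)
            if sense is not None:
                labelled.append((sentence, sense))   # step 4: reliable examples
            else:
                remaining.append(sentence)
        if len(remaining) == len(unlabelled):        # step 5: nothing new, stop
            break
        unlabelled = remaining
    return train_decision_list(labelled)
```

Here the seeds would be the handful of contexts containing 'plant life' (Sense A) and 'manufacturing plant' (Sense B); the loop then grows the two labelled sets until no new example clears the threshold.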
Yarowsky reported accuracy of 95%, but the experiments were nearly all on homonyms: the approach may not work as well for sense extension.
## Problems with WSD as supervised classification
1. Real performance is around $75\%$ (supervised)
2. Word senses must be predefined (not theoretically sound)
3. A very large training corpus is needed (difficult to annotate; human annotators often disagree)
4. A model is learnt per individual word, so there is no real generalisation
A better approach is unsupervised sense induction, although it is a very hard task.
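As a rough illustration of sense induction, the sketch below clusters bag-of-words context vectors with k-means (assuming scikit-learn; the four-sentence mini-corpus and the choice of two clusters are hypothetical):

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical contexts of the ambiguous word 'plant'.
contexts = [
    "the manufacturing plant closed after the strike",
    "workers at the assembly plant built engines",
    "plant life flourished along the river bank",
    "the plant needs water and sunlight to grow",
]

# Represent each context as a bag-of-words vector and cluster;
# each cluster is taken as one induced sense.
X = CountVectorizer(stop_words="english").fit_transform(contexts)
senses = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for sense, context in zip(senses, contexts):
    print(f"sense {sense}: {context}")
```

The hard parts this sketch glosses over include choosing the number of senses, representing contexts well (e.g. embeddings rather than raw counts), and evaluating induced senses without gold labels.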