
# Meta Learning

- A meta-learning model is trained over a variety of learning tasks and optimized for the best performance on a distribution of tasks, including potentially unseen tasks.
- In meta-learning, one dataset is treated as one data sample.
- Has shown many promising results in [[Computer Vision]], and has started to make its way into [[Natural Language Processing]].

## Approaches

Mainly categorized into three approaches:

### Model-based

- Makes no assumption about the form of $P_{\theta}(y \mid \mathbf{x})$
- Instead uses a model designed specifically for fast learning
- Commonly used architectures are RNNs and Neural Turing Machines (NTMs)

### Metric-based

- The main idea is to learn a good metric space in which new examples can be compared to examples already seen.
- $P_{\theta}(y \mid \mathbf{x})$ is modeled as $\sum_{(\mathbf{x}_{i}, y_{i}) \in S} k_{\theta}(\mathbf{x}, \mathbf{x}_{i})\, y_{i}$, where $S$ is the support set of labeled examples and $k_{\theta}$ is a learned kernel that measures the similarity between $\mathbf{x}$ and $\mathbf{x}_{i}$.
- A common approach is to train a siamese network using gradient descent and use a comparison scheme such as k-NN or k-Means.
- Other works are **Matching Networks** ([Vinyals et al., 2016](http://papers.nips.cc/paper/6385-matching-networks-for-one-shot-learning.pdf)), **Relation Network (RN)** ([Sung et al., 2018](http://openaccess.thecvf.com/content_cvpr_2018/papers_backup/Sung_Learning_to_Compare_CVPR_2018_paper.pdf)), **Prototypical Networks** ([Snell, Swersky & Zemel, 2017](http://papers.nips.cc/paper/6996-prototypical-networks-for-few-shot-learning.pdf))
- Works well for few-shot classification, but it is not known how well it works for regression or RL

### Optimization-based

- One network (the meta-learner) learns to update another network (the learner)
- In the LSTM Meta-Learner, an LSTM is used because it remembers how it previously updated the learner model (think of how momentum works)
- [[MAML - Model-Agnostic Meta-Learning]] disregards any specific model and is compatible with any model that learns through gradient descent

---

## References

1. Learning to Learn, Chelsea Finn, Jul 2017. https://bair.berkeley.edu/blog/2017/07/18/learning-to-learn/
2. Meta-Learning: Learning to Learn Fast. https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html
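
---

The kernel-weighted sum from the metric-based approach can be made concrete with a small sketch. It is illustrative only and does not reproduce any of the cited methods: the kernel here is assumed to be a softmax over negative squared Euclidean distances on raw inputs (a real model would first pass inputs through a learned embedding), and the names `kernel_predict`, `support`, and `num_classes` are hypothetical.

```python
import math


def kernel_predict(x, support, num_classes):
    """Approximate P(y | x) as a kernel-weighted sum over support labels.

    support is a list of (x_i, y_i) pairs, with y_i an integer class index.
    Assumption: the kernel is a softmax over negative squared distances.
    """
    # Squared Euclidean distance from the query to each support example.
    sq_dists = [sum((a - b) ** 2 for a, b in zip(x, x_i)) for x_i, _ in support]

    # Softmax over negative distances, shifted by the max for numerical stability.
    logits = [-d for d in sq_dists]
    max_logit = max(logits)
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)

    # Accumulate each kernel weight onto the class of its support example,
    # which is equivalent to summing weight * one_hot(y_i).
    probs = [0.0] * num_classes
    for weight, (_, y_i) in zip(exps, support):
        probs[y_i] += weight / total
    return probs


# A toy 2-way, 2-shot support set with two well-separated clusters.
support = [
    ([0.0, 0.0], 0),
    ([0.1, 0.0], 0),
    ([5.0, 5.0], 1),
    ([5.1, 5.0], 1),
]
probs = kernel_predict([0.2, 0.1], support, num_classes=2)
print(probs)
```

Because the weights are normalized, the output is a valid distribution over classes, and a query close to the class-0 cluster puts most of its mass on class 0.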