# Latent Variable Models
Latent variables are unobserved values that make it easier to explain the observed data. The main reasons we use latent variable models:
1. Some data is naturally unobserved, and latent variables let us learn from missing or inaccessible values.
2. We want to model the inverse process: infer what these unobserved "latents" are for a given observation.
3. More importantly, they enable us to leverage our prior knowledge when defining a model.
Popular latent variable models include the [[Gaussian Mixture Model]] and [[Variational Autoencoders]].
They are usually learned with approximate algorithms such as [[Expectation Maximization]].
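As a concrete illustration, here is a minimal sketch (using numpy, with made-up parameters) of a Gaussian mixture as a latent variable model: the latent $z$ picks a component, $x$ is drawn from that component's Gaussian, and the marginal likelihood sums the latent out.

```python
import numpy as np

# Tiny GMM: z ~ Categorical(pi), x | z=k ~ Normal(mu_k, sigma_k).
# Parameter values are illustrative only.
pi = np.array([0.3, 0.7])      # prior over the latent component z
mu = np.array([-2.0, 3.0])     # component means
sigma = np.array([0.5, 1.0])   # component standard deviations

def sample(n, seed=0):
    """Ancestral sampling: draw the latent z first, then x given z."""
    rng = np.random.default_rng(seed)
    z = rng.choice(len(pi), size=n, p=pi)
    x = rng.normal(mu[z], sigma[z])
    return x, z

def log_marginal_likelihood(x):
    """log p(x) = log sum_k pi_k N(x | mu_k, sigma_k); the latent z is summed out."""
    log_comp = (
        -0.5 * ((x[:, None] - mu) / sigma) ** 2
        - np.log(sigma * np.sqrt(2 * np.pi))
        + np.log(pi)
    )
    return np.logaddexp.reduce(log_comp, axis=1)

x, z = sample(5)
print(log_marginal_likelihood(x))
```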
## Advantages
1. By making convenient choices for latents, we can model much more complex $x$.
2. Without latents the number of parameters can explode, e.g. in regular [[Boltzmann Machines]] (not RBMs); see the parameter-count sketch below.
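To make the parameter-count advantage concrete, here is a rough comparison (with illustrative numbers, not from the source) between a full joint table over binary variables and a mixture model with a latent component indicator and conditionally independent observations:

```python
D, K = 20, 10  # D binary observed variables, K latent mixture components (illustrative)

# Fully general joint over D binary variables: one free probability per configuration.
params_full_joint = 2 ** D - 1

# Mixture model with latent z and x_i conditionally independent given z:
# (K - 1) mixing weights + K * D Bernoulli parameters.
params_mixture = (K - 1) + K * D

print(params_full_joint)  # 1048575
print(params_mixture)     # 209
```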
## Distributed representations
Latent variable models are closely related to the hypothetical notion of distributed representations:
- Distribute the 'representation' of our data over multiple neurons
- Each neuron models a distribution over concepts (colors, shapes, etc.)
- The latent layer learns to encode combinations of patterns, which makes the representation efficient (see the sketch below)
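A toy sketch of the efficiency argument (the attribute names are made up for illustration): a handful of binary latent units encodes exponentially many concept combinations, whereas a "local" code would need one dedicated unit per combination.

```python
import itertools

# Distributed representation sketch: each latent unit encodes one binary attribute,
# and combinations of units encode many concepts with only a few units.
attributes = ["red", "round", "large"]  # 3 latent units (hypothetical attributes)

for code in itertools.product([0, 1], repeat=len(attributes)):
    concept = [name for name, on in zip(attributes, code) if on]
    print(code, concept or ["(none)"])

# 3 units -> 2**3 = 8 distinct concepts; a one-unit-per-concept code would need 8 units.
```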
---
## References
1. Stanford CS228 notes on latent variable models https://ermongroup.github.io/cs228-notes/learning/latent/
2. Lecture 9.1, UvA DL course 2020