# Importance Sampling

Importance sampling is a mathematical technique used to estimate properties of a particular probability distribution while only having samples from a different distribution. Suppose you want to estimate the expected value of a function $f(x)$ under a probability distribution $p(x)$, but you only have samples from a different distribution $q(x)$. Importance sampling provides an unbiased estimate of this expected value (provided $q(x) > 0$ wherever $p(x)f(x) \neq 0$).

Mathematically, the expected value of $f(x)$ under $p(x)$ is given by:

$\mathbb{E}_{p}[f(x)] = \int f(x)\,p(x)\,dx$

However, you only have samples from $q(x)$. The importance sampling estimator therefore takes a weighted average of the function values at the sampled points, with weights given by the ratio of the probabilities under the two distributions:

$\hat{\mathbb{E}}_{p}[f(x)] = \frac{1}{N} \sum_{i=1}^{N} \frac{p(x_i)}{q(x_i)} f(x_i)$

where $x_i$ are samples drawn from $q(x)$ and $N$ is the total number of samples. The ratio $\frac{p(x_i)}{q(x_i)}$ is known as the importance weight; it corrects for the fact that the samples are drawn from $q(x)$ rather than $p(x)$ by adjusting the contribution of each sample to reflect how likely it would be under the distribution of interest $p(x)$.

An important consideration is the choice of $q(x)$. Ideally, $q(x)$ should be close to $p(x)$ (more precisely, roughly proportional to $p(x)|f(x)|$), and its tails should be no lighter than those of $p(x)$ so the weights stay bounded. If $q(x)$ is very different from $p(x)$, the importance weights can vary greatly, leading to high variance in the estimate. This is a common challenge in applying importance sampling effectively. (A minimal numerical sketch of the estimator appears at the end of this note.)

## Application Examples

**Off-policy RL:** Evaluate a target policy $\pi$ using data collected from a behavior policy $\mu$:

$V^{\pi}(s) = \mathbb{E}_{\mu}\left[\frac{\pi(A_t|S_t)}{\mu(A_t|S_t)} G_t \mid S_t = s\right]$

**Policy optimization methods:** [[TRPO - Trust-Region Policy Optimization]] optimizes a surrogate objective subject to a KL constraint:

$L(\theta) = \mathbb{E}\left[\frac{\pi_\theta(a|s)}{\pi_{\theta_{old}}(a|s)} A(s,a)\right]$

$\text{s.t. } \mathbb{E}[D_{KL}(\pi_{\theta_{old}}(\cdot|s) || \pi_\theta(\cdot|s))] \leq \delta$

[[PPO - Proximal Policy Optimization]] uses a clipped importance sampling ratio:

$L(\theta) = \mathbb{E}\left[\min\left(\frac{\pi_\theta(a|s)}{\pi_{\theta_{old}}(a|s)} A(s,a), \text{clip}\left(\frac{\pi_\theta(a|s)}{\pi_{\theta_{old}}(a|s)}, 1-\epsilon, 1+\epsilon\right) A(s,a)\right)\right]$

where

- **π_θ**: the *current* policy being updated (parameterized by θ)
- **π_θ_old**: the *previous* policy from the last iteration
- **π*** (the optimal policy): what we are ultimately trying to converge to

The process works iteratively:

1. Collect data using π_θ_old.
2. Use importance sampling to estimate how π_θ would perform on that data.
3. Update θ to improve the policy.
4. Repeat: π_θ becomes the new π_θ_old.

The importance sampling ratio π_θ(a|s)/π_θ_old(a|s) lets us reuse old data to evaluate the new policy, avoiding the need to collect fresh data at every step (which would be expensive). So π_θ is our current best guess at the optimal policy, which we keep improving until, hopefully, π_θ ≈ π*.
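As a concrete sanity check of the basic estimator above, here is a minimal sketch (an illustrative example of mine, using only NumPy; none of the names come from a specific codebase). It estimates $\mathbb{E}_p[x^2]$ for $p = \mathcal{N}(0,1)$ using samples drawn only from a wider proposal $q = \mathcal{N}(0,2)$; the true value is 1.

```python
import numpy as np

# Illustrative sketch (assumed setup, not from any particular library or paper):
# estimate E_p[f(X)] with f(x) = x^2, target p = N(0, 1), proposal q = N(0, 2).
def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(0)
N = 100_000

x = rng.normal(loc=0.0, scale=2.0, size=N)             # x_i ~ q
w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)  # importance weights p(x_i)/q(x_i)
f = x ** 2

estimate = np.mean(w * f)                               # (1/N) * sum_i w_i * f(x_i)
print(f"IS estimate of E_p[x^2]: {estimate:.4f} (true value: 1.0)")
```

Because $q$ here has heavier tails than $p$, the weights stay bounded; rerunning the sketch with a narrower proposal (e.g. scale 0.5) makes the weights, and hence the variance of the estimate, blow up, which is exactly the failure mode described above.

The PPO clipped surrogate can be expressed in the same style. The helper below is a sketch with hypothetical array names (`logp_new`, `logp_old`, `advantages`) standing in for per-sample log-probabilities and advantage estimates computed elsewhere; it is not the API of any specific RL library.

```python
import numpy as np

def ppo_clipped_objective(logp_new, logp_old, advantages, epsilon=0.2):
    """Clipped surrogate L(theta) from the PPO formula above (sketch, hypothetical names)."""
    ratio = np.exp(logp_new - logp_old)                   # pi_theta(a|s) / pi_theta_old(a|s)
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon)
    # Elementwise min of the unclipped and clipped terms, averaged over samples.
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))
```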