# Control variates

If stochastic gradients have too high variance, they are unusable, so we apply variance-reduction techniques. The most popular is the method of control variates.

We want to reduce the variance of a Monte Carlo estimate of $\mathbb{E}_{p_{\varphi}(x)}[f(x)]$. Assume we have a related function $h(x)$:

- for which we know its expectation $\bar{h}=\mathbb{E}_{p_{\varphi}(x)}[h(x)]$ analytically
- for instance, $h(x)$ can be the second-order [[Taylor Expansion]] of $f(x)$

Instead of the naive estimator

$$
\mathbb{E}_{p_{\varphi}(x)}[f(x)] \approx \hat{f}=\frac{1}{n} \sum_{i} f\left(x^{(i)}\right), \quad x^{(i)} \sim p_{\varphi}(x),
$$

subtract the baseline $\beta h(x)$ from each sample and add back its analytical expectation:

$$
\tilde{f}=\frac{1}{n} \sum_{i}\left[f\left(x^{(i)}\right)-\beta h\left(x^{(i)}\right)\right]+\beta \bar{h}.
$$

The new estimator remains unbiased, $\mathbb{E}[\tilde{f}]=\mathbb{E}[\hat{f}]$, but its variance is lower: per sample,

$$
\operatorname{Var}(f-\beta h)=\operatorname{Var}(f)-2 \beta \operatorname{cov}(f, h)+\beta^{2} \operatorname{Var}(h),
$$

which is minimized at $\beta^{*}=\frac{\operatorname{cov}(f, h)}{\operatorname{Var}(h)}$.
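As a minimal sketch of the estimator, the following Python example compares $\hat{f}$ with $\tilde{f}$ using an estimated $\beta^{*}$. The specific choices $f(x)=e^{x}$ with $x \sim \mathcal{N}(0,1)$ (true mean $e^{1/2}$) and $h(x)=1+x+x^{2}/2$, the second-order Taylor expansion of $f$ with $\bar{h}=1.5$, are illustrative assumptions, not from the note:

```python
import math
import random

def control_variate_estimate(n=100_000, seed=0):
    """Estimate E[f(x)] for x ~ N(0, 1), naively and with a control variate.

    Illustrative setup (assumed, not from the note):
      f(x) = exp(x),            true mean e^{1/2}
      h(x) = 1 + x + x^2 / 2,   second-order Taylor expansion of f,
                                with analytical mean h_bar = 1 + 0 + 1/2.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    f = [math.exp(x) for x in xs]
    h = [1.0 + x + 0.5 * x * x for x in xs]
    h_bar = 1.5  # analytical expectation of h under N(0, 1)

    f_mean = sum(f) / n
    h_mean = sum(h) / n
    # Plug-in estimate of the optimal coefficient beta* = cov(f, h) / Var(h).
    cov_fh = sum((a - f_mean) * (b - h_mean) for a, b in zip(f, h)) / n
    var_h = sum((b - h_mean) ** 2 for b in h) / n
    beta = cov_fh / var_h

    naive = f_mean  # \hat{f}: plain Monte Carlo average
    # \tilde{f}: subtract the baseline per sample, add back its expectation.
    cv = sum(a - beta * b for a, b in zip(f, h)) / n + beta * h_bar
    return naive, cv
```

Because $h$ tracks $f$ closely near the bulk of the Gaussian, the per-sample variance of $f - \beta h$ is much smaller than that of $f$, so $\tilde{f}$ concentrates faster around $e^{1/2}$ than $\hat{f}$.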