# Neural Autoregressive Density Estimation

Inspired by [[Boltzmann Machines#Restricted Boltzmann Machines]], but with tractable density estimation.

Each conditional is modelled with a sigmoidal neural net, like in RBMs.
- Parameter matrix $W$ maps past inputs $v_{<i}$ to hidden feature $\boldsymbol{h}_{i}$
- Parameter matrix $V$ generates pixel $v_{i}$ given the hidden feature $\boldsymbol{h}_{i}$

Map past inputs to a hidden state, taking into account only the past inputs

$$
\boldsymbol{h}_{i}=\sigma\left(\boldsymbol{c}+W_{:,<i} \boldsymbol{v}_{<i}\right)
$$

Predict the next output given the hidden state

$$
p\left(v_{i}=1 \mid \boldsymbol{v}_{<i}\right)=\sigma\left(b_{i}+\left(V^{T}\right)_{i,:} \boldsymbol{h}_{i}\right)
$$

Uses teacher forcing for training
- During training use ground truth past inputs $v_{<i}$
- During testing (sampling) use predicted past inputs $\hat{v}_{<i}$
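
The two equations and the teacher forcing rule can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names (`init_params`, `conditional`, `log_likelihood`, `sample`) are hypothetical, indices are zero-based, and the code assumes binary inputs with $W$ and $V$ both of shape $H \times D$, which is one reading of the $(V^{T})_{i,:}$ notation.

```python
import math
import random


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def init_params(D, H, seed=0):
    """Small random parameters. Assumed shapes: W and V are H x D, b has D entries, c has H."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 0.1) for _ in range(D)] for _ in range(H)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(D)] for _ in range(H)]
    b = [0.0] * D
    c = [0.0] * H
    return W, V, b, c


def conditional(i, v_past, params):
    """p(v_i = 1 | v_<i). Only the first i entries of v_past are read."""
    W, V, b, c = params
    H = len(c)
    # h_i = sigma(c + W[:, <i] v_<i)
    h = [sigmoid(c[k] + sum(W[k][j] * v_past[j] for j in range(i)))
         for k in range(H)]
    # p(v_i = 1 | v_<i) = sigma(b_i + (V^T)[i, :] h_i)
    return sigmoid(b[i] + sum(V[k][i] * h[k] for k in range(H)))


def log_likelihood(v, params):
    """Exact log p(v). Teacher forcing: each conditional sees the ground-truth prefix of v."""
    total = 0.0
    for i in range(len(v)):
        p = conditional(i, v, params)
        total += math.log(p if v[i] == 1 else 1.0 - p)
    return total


def sample(D, params, rng):
    """Ancestral sampling: each conditional sees the previously sampled values."""
    v_hat = []
    for i in range(D):
        p = conditional(i, v_hat, params)
        v_hat.append(1 if rng.random() < p else 0)
    return v_hat


params = init_params(D=4, H=3, seed=1)
v_hat = sample(4, params, random.Random(0))
print(v_hat, log_likelihood(v_hat, params))
```

Because the density is a product of normalised conditionals, summing `math.exp(log_likelihood(v, params))` over all $2^{D}$ binary vectors gives 1, which is the sense in which the density is tractable. The sketch recomputes $\boldsymbol{h}_{i}$ from scratch at every step for readability; the pre-activation $\boldsymbol{c}+W_{:,<i}\boldsymbol{v}_{<i}$ can instead be updated incrementally from step $i$ to step $i+1$.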