# Masked Autoencoder for Distribution Estimation

- Make an autoencoder autoregressive by making each output $\hat{x}_{i}$ depend only on the preceding inputs $x_{<i}$.
- In an ordinary autoencoder, each output dimension also depends on the same and 'future' input dimensions.
- Implement the constraint with binary masks $M^{W}$ and $M^{V}$ that multiply the weights element-wise:

$$
\begin{array}{l}
h(x)=g\left(b+\left(W \odot M^{W}\right) \cdot x\right) \\
\hat{x}=\sigma\left(c+\left(V \odot M^{V}\right) \cdot h(x)\right)
\end{array}
$$

For the $k$-th hidden neuron and the $d$-th input, the entry of the input mask is

$$
M^{W}_{k, d}=\left\{\begin{array}{ll}1 & m(k) \geq d \\ 0 & \text { otherwise }\end{array}\right.
$$

where $m(k)$ is an integer between 1 and $D-1$, with $D$ the number of input dimensions. The output mask uses the strict inequality, $M^{V}_{d, k}=1$ if $d>m(k)$ and 0 otherwise, so that output $d$ is connected only to inputs with index less than $d$.

![[MADE.jpg]]
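The mask construction can be sketched in NumPy as follows. This is a minimal illustration for one hidden layer, not a reference implementation. It assumes the natural input ordering $1, \dots, D$, samples $m(k)$ uniformly at random, takes $g$ to be ReLU, and uses the output mask rule $M^{V}_{d,k}=1$ if $d>m(k)$. The names `made_masks` and `made_forward` are hypothetical.

```python
import numpy as np

def made_masks(D, K, seed=0):
    """Build the two masks of a one-hidden-layer MADE (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # m(k): integer in [1, D-1] for each hidden unit (upper bound is exclusive)
    m = rng.integers(1, D, size=K)
    d = np.arange(1, D + 1)  # input/output indices 1..D
    # M^W[k, d] = 1 if m(k) >= d, shape (K, D)
    M_W = (m[:, None] >= d[None, :]).astype(np.float64)
    # M^V[d, k] = 1 if d > m(k), shape (D, K)
    M_V = (d[:, None] > m[None, :]).astype(np.float64)
    return M_W, M_V, m

def made_forward(x, W, b, V, c, M_W, M_V):
    """Masked forward pass: ReLU hidden layer (an assumption), sigmoid output."""
    h = np.maximum(0.0, b + (W * M_W) @ x)
    return 1.0 / (1.0 + np.exp(-(c + (V * M_V) @ h)))

D, K = 5, 16
M_W, M_V, m = made_masks(D, K)

# (M_V @ M_W)[i, j] counts the paths from input j to output i.
# The autoregressive property requires it to be strictly lower triangular.
paths = M_V @ M_W
assert np.all(np.triu(paths) == 0)
```

The check on `paths` holds for any choice of $m(k)$ in $[1, D-1]$, because a path from input $j$ to output $i$ needs $i > m(k) \geq j$, which forces $i > j$. The first output has no incoming connections and reduces to $\sigma(c_1)$, an unconditional estimate.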