# Masked Autoencoder for Distribution Estimation
- Turn an autoencoder into an autoregressive model by making each output $\hat{x}_{i}$ depend only on the preceding inputs $x_{<i}$
- In a standard autoencoder every output dimension depends on all input dimensions, including the 'future' ones $x_{\geq i}$
- Enforce the constraint by multiplying the weight matrices elementwise with binary masks $M^{W}$ and $M^{V}$
$$
\begin{array}{l}
h(x)=g\left(b+\left(W \odot M^{W}\right) \cdot x\right) \\
\hat{x}=\sigma\left(c+\left(V \odot M^{V}\right) \cdot h(x)\right)
\end{array}
$$
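For the $k$-th hidden unit, the entry of the input mask $M^{W}$ for input $d$ is $M^{W}_{k, d}=\left\{\begin{array}{ll}1 & m(k) \geq d \\ 0 & \text { otherwise }\end{array}\right.$
Here $m(k)$ is an integer between 1 and $D-1$, where $D$ is the input dimension; it is the largest input index that hidden unit $k$ is allowed to see.
The output mask uses a strict inequality so that output $d$ never sees input $d$: $M^{V}_{d, k}=\left\{\begin{array}{ll}1 & d>m(k) \\ 0 & \text { otherwise }\end{array}\right.$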
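
The mask rule can be checked numerically. Below is a minimal NumPy sketch for a single hidden layer, under several assumptions that are not in this note: $D$ denotes the input dimension, $m(k)$ is sampled uniformly from $1..D-1$, $g$ is a ReLU, and the output mask follows the rule $M^{V}_{d,k}=1$ when $d > m(k)$ from the MADE paper. The names `build_masks` and `made_forward` are hypothetical helpers chosen for illustration.

```python
import numpy as np

def build_masks(D, H, rng):
    """Build input and output masks for one hidden layer of H units."""
    # m(k): integer in 1..D-1 per hidden unit (uniform sampling is an assumption)
    m = rng.integers(1, D, size=H)           # upper bound is exclusive
    d = np.arange(1, D + 1)                  # 1-based input/output indices
    M_W = (m[:, None] >= d[None, :]).astype(float)  # (H, D): 1 if m(k) >= d
    M_V = (d[:, None] > m[None, :]).astype(float)   # (D, H): 1 if d > m(k)
    return M_W, M_V, m

def made_forward(x, W, b, V, c, M_W, M_V):
    """Masked forward pass: h = g(b + (W*M_W) x), x_hat = sigmoid(c + (V*M_V) h)."""
    h = np.maximum(0.0, b + (W * M_W) @ x)   # g = ReLU (assumption)
    return 1.0 / (1.0 + np.exp(-(c + (V * M_V) @ h)))

rng = np.random.default_rng(0)
D, H = 5, 16
M_W, M_V, m = build_masks(D, H, rng)
W, b = rng.normal(size=(H, D)), rng.normal(size=H)
V, c = rng.normal(size=(D, H)), rng.normal(size=D)

# Entry (i, j) of M_V @ M_W counts the paths from input j to output i.
# The autoregressive property holds when no path exists for j >= i,
# i.e. the matrix is strictly lower triangular.
connectivity = M_V @ M_W
is_autoregressive = bool(np.all(np.triu(connectivity) == 0))
```

The product `M_V @ M_W` is a quick structural check: a nonzero entry requires some hidden unit with $j \leq m(k) < i$, which is only possible when $j < i$. The first output has no incoming paths at all, so it reduces to $\sigma(c_1)$, an unconditional estimate.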
![[MADE.jpg]]
---
## References