# ResNet
- The first truly deep network, with variants trained at more than 1,000 layers
- The first deep architecture to gracefully go deeper than a few dozen layers
- Scalable by design, not merely because of more GPUs, more training time, added classifiers, etc.
- Smashed ImageNet with a ~3.6% top-5 error (with ensembles)
- Won the ILSVRC 2015 and COCO 2015 challenges across object classification, detection, localization and segmentation
## Hypothesis
Hypothesis: Can a very deep network be at least as accurate as a moderately deep one?
Thought experiment: Let's assume two almost identical convnets, A and B
- B is the same as A, just with extra "identity" layers
- Identity layers pass information unchanged -> the training errors of A and B should be the same
- Thus, there is at least one convnet B as good as A w.r.t. training error
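
The thought experiment can be checked on a toy example. The sketch below is only an illustration: the two-layer network, its weights and the helper names (`relu_layer`, `identity_layer`, `forward`) are hypothetical, and plain Python lists are used to avoid any dependency.

```python
def relu_layer(weights, bias):
    """Return a function computing ReLU(W x + b) on a plain list of floats."""
    def layer(x):
        return [
            max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)
        ]
    return layer


def identity_layer(x):
    """Pass the input through unchanged."""
    return list(x)


def forward(layers, x):
    """Apply the layers one after another."""
    for layer in layers:
        x = layer(x)
    return x


# Hypothetical shallow network A with two small layers (made-up weights).
net_a = [
    relu_layer([[0.5, -1.0], [1.5, 0.25]], [0.1, -0.2]),
    relu_layer([[1.0, 2.0]], [0.0]),
]

# Deeper network B: the same layers as A plus extra identity layers.
net_b = [net_a[0], identity_layer, identity_layer, net_a[1], identity_layer]

x = [1.0, 2.0]
out_a = forward(net_a, x)
out_b = forward(net_b, x)
print(out_a == out_b)  # B reproduces A exactly, so it has the same error
```

So a deeper network that is at least as good as A exists by construction; whether an optimizer actually finds it is a separate question.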
### Testing the hypothesis
- Train a shallow and a deeper architecture
- The deeper model does worse in training error!
- The performance degradation is not caused by overfitting -> the optimization is just harder!
- Even assuming optimizers do their job well, not all networks are equally easy to optimize
## Residual connections
Add the input $x$ to your module output $F(x)$, so that $H(x)=F(x)+x$. If the dimensions don't match, use zero padding or a projection layer on the shortcut. As simple as that!
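
A minimal sketch of such a block in plain Python, under stated assumptions: $F(x)$ is taken to be two bias-free linear maps with a ReLU in between, batch normalization and any nonlinearity after the addition are left out, and the names `residual_block` and `w_proj` are illustrative rather than taken from the lecture.

```python
def linear(weights, x):
    """Matrix-vector product on plain lists: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, w1, w2, w_proj=None):
    """Compute H(x) = F(x) + shortcut(x), with F(x) = W2 ReLU(W1 x).

    The shortcut is x itself, or w_proj applied to x when dimensions differ.
    """
    f = linear(w2, relu(linear(w1, x)))
    shortcut = x if w_proj is None else linear(w_proj, x)
    if len(shortcut) != len(f):
        raise ValueError("shortcut and F(x) must have the same dimension")
    return [fi + si for fi, si in zip(f, shortcut)]


x = [1.0, -2.0]

# With F(x) = 0 (all-zero weights) the block is exactly the identity.
zeros = [[0.0, 0.0], [0.0, 0.0]]
same = residual_block(x, zeros, zeros)

# Dimension change 2 -> 3: this projection matrix just zero-pads x.
w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pad = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
wider = residual_block(x, w1, w2, w_proj=pad)
```

Driving the weights of $F$ to zero turns the block into an identity layer, which is the kind of layer the thought experiment needs.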
With this residual connection, deeper networks no longer degrade. This enabled:
- Ridiculously low error in ImageNet
- ResNets with up to ~1,000 layers trained
- Previous deepest networks: ~30-40 layers on simple datasets
### Insights
- [[Normalization#Batch normalization]] is absolutely necessary because of vanishing gradients
- Identity shortcuts are cheaper than projection shortcuts and almost as accurate
- Networks with skip connections converge faster compared to the same network without skip connections
- Generally, skip/residual connections are an asset for deeper architectures
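
One common way to see why skip connections help optimization, sketched here under the assumption of identity shortcuts with no nonlinearity after the addition, is to differentiate the block $H(x)=F(x)+x$:

$$
\frac{\partial H}{\partial x}=\frac{\partial F}{\partial x}+I
$$

Stacking such blocks from layer $l$ to layer $L$ gives $x_{L}=x_{l}+\sum_{i=l}^{L-1} F\left(x_{i}\right)$, so for a loss $\mathcal{L}$:

$$
\frac{\partial \mathcal{L}}{\partial x_{l}}=\frac{\partial \mathcal{L}}{\partial x_{L}}\left(I+\frac{\partial}{\partial x_{l}} \sum_{i=l}^{L-1} F\left(x_{i}\right)\right)
$$

The identity term gives the gradient a direct path to earlier layers that does not pass through any weight layer.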
## HighwayNets
Similar to ResNets, but with a learnable gate per skip connection
$$
\bar{y}=H\left(x, W_{H}\right) \cdot T\left(x, W_{T}\right)+x \cdot\left(1-T\left(x, W_{T}\right)\right)
$$
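
The gating formula can be sketched in plain Python. The concrete choices here are assumptions made for illustration: a `tanh` transform for $H$, a sigmoid gate for $T$, a gate bias `b_t`, and the function name `highway_layer`.

```python
import math


def linear(weights, x):
    """Matrix-vector product on plain lists: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def highway_layer(x, w_h, w_t, b_t):
    """Compute y = H(x) * T(x) + x * (1 - T(x)) elementwise.

    Assumed forms: H(x) = tanh(W_H x) and T(x) = sigmoid(W_T x + b_T).
    """
    h = [math.tanh(v) for v in linear(w_h, x)]
    t = [sigmoid(v + b) for v, b in zip(linear(w_t, x), b_t)]
    return [hi * ti + xi * (1.0 - ti) for hi, ti, xi in zip(h, t, x)]


x = [0.5, -1.0]
w_h = [[1.0, 0.5], [-0.5, 1.0]]  # made-up transform weights
w_t = [[0.0, 0.0], [0.0, 0.0]]   # gate depends only on its bias here

# Very negative gate bias: T is close to 0, the layer carries x through.
carried = highway_layer(x, w_h, w_t, b_t=[-50.0, -50.0])

# Very positive gate bias: T is close to 1, the layer outputs H(x).
transformed = highway_layer(x, w_h, w_t, b_t=[50.0, 50.0])
```

With the gate closed the layer behaves like an identity shortcut, and with the gate open it behaves like a plain layer. Unlike the ResNet block, the two paths are mixed rather than added.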
---
## References
1. Lecture 5.4, UvA DL course 2020