# ResNet

- The first truly deep network, going deeper than 1,000 layers
- The first deep architecture to gracefully go deeper than a few dozen layers
- Scalable for reasons other than simply getting more GPUs, more training time, adding classifiers, etc.
- Smashed ImageNet, with a ~3.6% top-5 error (with ensembles)
- Won all the object classification, detection, segmentation, etc. challenges

## Hypothesis

Hypothesis: Can a very deep network be at least as accurate as a moderately deep one?

Thought experiment: Let's assume two almost identical convnets A and B.

- B is the same as A, just with extra "identity" layers
- Identity layers pass information unchanged -> the errors of A and B should be similar
- Thus, there is at least one convnet B that is as good as A w.r.t. training error

### Testing the hypothesis

- Train a shallow and a deeper architecture
- The deeper model does worse in training error!
- The performance degradation is not caused by overfitting -> the optimization is just harder!
- Assuming the optimizers are doing their job, not all networks are equally easy to optimize

## Residual connections

Add the input $x$ to the module output $F(x)$, so that $H(x)=F(x)+x$. If the dimensions don't match, use zero padding or a projection layer. As simple as that! With this residual connection, deeper networks no longer suffer from degradation.
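The rule $H(x)=F(x)+x$ and the zero-padding shortcut can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: $F$ is taken to be two small fully connected layers with a ReLU in between (real ResNet blocks use convolutions and batch normalization), and the helper names `linear`, `relu`, `zero_pad`, and `residual_block` are hypothetical, not from the lecture.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Compute W x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v: Vector) -> Vector:
    """Element-wise max(0, v)."""
    return [max(0.0, a) for a in v]


def zero_pad(x: Vector, size: int) -> Vector:
    """Parameter-free shortcut: append zeros until x has `size` entries."""
    return x + [0.0] * (size - len(x))


def residual_block(x: Vector, w1: Matrix, b1: Vector,
                   w2: Matrix, b2: Vector) -> Vector:
    """Return H(x) = F(x) + x, with F(x) = W2 relu(W1 x + b1) + b2 (assumed form)."""
    f = linear(w2, b2, relu(linear(w1, b1, x)))
    if len(f) < len(x):
        # Shrinking the dimension needs a projection shortcut, not shown here.
        raise ValueError("output narrower than input: use a projection shortcut")
    shortcut = zero_pad(x, len(f))  # no-op when the dimensions already match
    return [fi + si for fi, si in zip(f, shortcut)]


x = [1.0, -2.0, 3.0]
zeros_3x3 = [[0.0] * 3 for _ in range(3)]
zeros_4x3 = [[0.0] * 3 for _ in range(4)]

# With F(x) = 0 the block is an identity layer, as in the thought experiment.
print(residual_block(x, zeros_3x3, [0.0] * 3, zeros_3x3, [0.0] * 3))
# [1.0, -2.0, 3.0]

# A wider output: the shortcut is zero-padded from 3 to 4 entries.
print(residual_block(x, zeros_3x3, [0.0] * 3, zeros_4x3, [0.0] * 4))
# [1.0, -2.0, 3.0, 0.0]
```

When all weights of $F$ are zero, the block reduces to the identity, so the deeper network can represent the shallower one exactly; the optimizer only has to learn the residual on top of that.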
Residual connections enabled:

- Ridiculously low error on ImageNet
- ResNets with up to 1,000 layers to be trained
- For comparison, the previous deepest networks had ~30-40 layers, on simple datasets

### Insights

- [[Normalization#Batch normalization]] is absolutely necessary because of vanishing gradients
- Identity shortcuts are cheaper than projection shortcuts and perform almost as well
- Networks with skip connections converge faster than the same networks without skip connections
- Generally, skip/residual connections are an asset for deeper architectures

## HighwayNets

Similar to ResNets, but with a learnable gate $T$ per skip connection:

$$
\bar{y}=H\left(x, W_{H}\right) \cdot T\left(x, W_{T}\right)+x \cdot\left(1-T\left(x, W_{T}\right)\right)
$$

When $T\left(x, W_{T}\right)=0$ the layer passes $x$ through unchanged, and when $T\left(x, W_{T}\right)=1$ it outputs only the transformation $H\left(x, W_{H}\right)$.

---

## References

1. Lecture 5.4, UvA DL course 2020