# Hopfield Networks

One of the earliest conceptualizations of biological neural networks with a feedback architecture.

- Feedforward networks have connections that form an acyclic graph
- Feedback networks are networks that are not feedforward

Hopfield networks are:

- Fully connected feedback networks
- Symmetric weights, no self-connections
- Associative (Hebbian) learning

No separation of hidden vs visible units:

- Neurons (nodes) update themselves
- Based on all other neurons

## Hebbian Learning

Inspired by biological neurons. Key idea: positively correlated neurons reinforce each other's weights

$$
\frac{d w_{i j}}{d t} \propto \operatorname{correlation}\left(x_{i}, x_{j}\right)
$$

Associative memories <-> No supervision <-> Pattern completion

![[hebbian.jpg]]

## Hopfield network

The binary Hopfield network defines neuron states given the neuron activation $a$

$$
x_{i}=h\left(a_{i}\right)=\left\{\begin{array}{cl}
1 & a_{i} \geq 0 \\
-1 & a_{i}<0
\end{array}\right.
$$

The continuous Hopfield network defines neuron states given the neuron activation $a$

$$
x_{i}=\tanh \left(a_{i}\right)
$$

Note the feedback connection!

- Neuron $x_{1}$ influences $x_{3}$, but $x_{3}$ influences $x_{1}$ back

Who influences whom first?

- Synchronous updates: all activations $a_{i}=\sum_{j} w_{i j} x_{j}$ computed at once, then all neurons updated together

Or

- Asynchronous updates: one neuron at a time (fixed or random order)

## Hopfield memory

The network updates $x_{i} \in\{-1,1\}$ until convergence to a stable state:

- Recurrent inference cycles
- Not a 'single propagation' like feedforward networks

Stable means $x_{i}$ does not flip states anymore. (A minimal code sketch of storage and retrieval is given at the end of these notes.)

## Energy function

Hopfield networks minimize the quadratic energy function

$$
f_{\theta}(x)=-\frac{1}{2} \sum_{i, j} w_{i j} x_{i} x_{j}-\sum_{i} b_{i} x_{i}
$$

This energy is a Lyapunov function, meaning:

- It decreases under the dynamical evolution of the system
- It is bounded below

Systems with a Lyapunov function converge to fixed points. The Hopfield energy is a Lyapunov function:

- Provided asynchronous updates
- Provided symmetric weights

## Learning algorithm

![[learning-hopfield.jpg]]

## Continuous-time continuous Hopfield network

We can replace the state variables with continuous-time variables.

At time $t$ we compute instantaneous activations

$$
a_{i}(t)=\sum_{j} w_{i j} x_{j}(t)
$$

The neuron response is governed by a differential equation

$$
\frac{d}{d t} x_{i}(t)=-\frac{1}{\tau}\left(x_{i}(t)-h\left(a_{i}(t)\right)\right)
$$

For a steady $a_{i}$, the neuron response relaxes to the stable state $x_{i}=h\left(a_{i}\right)$ (see the integration sketch at the end of these notes).

## Hopfield networks for optimization problems

- Optimize a function under constraints
- The stable states will be the optimal solutions
- In that case the weights must ensure _valid_ and _optimal_ solutions

![[hopfield-tsp.jpg]]

## Hopfield networks is all you need

Ramsauer et al., 2020, https://arxiv.org/abs/2008.02217

- Retrieve stored memory "patterns" by providing a partial pattern
- The retrieval update rule is _equivalent_ to the [[Attention Mechanism#Scaled Dot Product Attention]] mechanism in [[Transformers]] (see the sketch at the end of these notes)

---

## References

1. Lecture 8.3, UvA DL course 2020
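
## Code sketches

A minimal sketch of a binary Hopfield memory in NumPy, assuming bipolar ($\pm 1$) patterns stored with the Hebbian outer-product rule and retrieved with asynchronous updates; the function names (`store`, `energy`, `recall`) are illustrative, not from the lecture.

```python
import numpy as np

def store(patterns):
    """Hebbian (outer-product) storage: w_ij ∝ sum_n x_i^n x_j^n, symmetric, no self-connections."""
    P = np.asarray(patterns, dtype=float)        # shape (num_patterns, num_neurons)
    W = P.T @ P / P.shape[1]                     # correlation of the stored patterns
    np.fill_diagonal(W, 0.0)                     # no self-connections
    return W

def energy(W, x, b=None):
    """Quadratic Hopfield energy -1/2 x^T W x - b^T x (a Lyapunov function for async updates)."""
    b = np.zeros_like(x) if b is None else b
    return -0.5 * x @ W @ x - b @ x

def recall(W, x0, max_sweeps=50, rng=None):
    """Asynchronous updates in random order until no neuron flips any more."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(x)):
            a_i = W[i] @ x                       # activation a_i = sum_j w_ij x_j
            new = 1.0 if a_i >= 0 else -1.0      # threshold non-linearity h(a_i)
            if new != x[i]:
                x[i], changed = new, True
        if not changed:                          # stable: no state flipped in a full sweep
            break
    return x

# Pattern completion: corrupt a stored pattern and let the network settle back towards it.
rng = np.random.default_rng(0)
patterns = np.sign(rng.standard_normal((3, 100)))     # three random ±1 patterns
W = store(patterns)
probe = patterns[0].copy()
probe[:30] *= -1                                      # flip 30% of the bits
restored = recall(W, probe, rng=rng)
print("overlap with stored pattern:", np.mean(restored == patterns[0]))
print("energy decreased:", energy(W, restored) <= energy(W, probe))
```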
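
The continuous-time continuous network can be simulated by Euler-integrating the differential equation above; a small sketch, where the step size `dt` and time constant `tau` are arbitrary choices, not values from the lecture.

```python
import numpy as np

def simulate(W, x0, tau=1.0, dt=0.1, steps=200):
    """Euler integration of dx_i/dt = -(1/tau) * (x_i - tanh(a_i)) with a_i = sum_j w_ij x_j."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        a = W @ x                                # instantaneous activations a_i(t)
        x += dt * (-(x - np.tanh(a)) / tau)      # each x_i relaxes towards h(a_i) = tanh(a_i)
    return x
```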
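
For the Ramsauer et al. (2020) connection: the continuous "modern Hopfield" update retrieves a state as a softmax-weighted combination of the stored patterns, which has the same functional form as scaled dot-product attention. A sketch under that reading, with an assumed inverse temperature `beta`.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def modern_hopfield_update(X, xi, beta):
    """One retrieval step: xi_new = X^T softmax(beta * X xi), stored patterns as rows of X.

    With beta = 1/sqrt(d), this matches the softmax(q K^T / sqrt(d)) V form of attention,
    where the stored patterns act as keys and values and the query is the partial pattern xi.
    """
    return X.T @ softmax(beta * (X @ xi))

d = 64
rng = np.random.default_rng(0)
X = rng.standard_normal((10, d))            # 10 stored patterns
xi = X[3] + 0.1 * rng.standard_normal(d)    # noisy partial query
retrieved = modern_hopfield_update(X, xi, beta=1.0 / np.sqrt(d))
```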