# Graph Convolutional Networks
## Spatial Graph Convolution
Key idea: Unlike [[Spectral graph convolutions]], spatial approaches define graph convolutions directly in the spatial domain, so a convolution becomes a matrix multiplication over each node's local neighborhood.
## Graph Convolutional Networks (GCN)
Collect the feature vectors of a node's neighbors, then apply an aggregation function (e.g., averaging). Pass the aggregated vector through a neural network to obtain the new node state. Concretely:
1. Left-multiplying by the adjacency matrix collects the features of each node's neighborhood
2. Right-multiplying by a weight matrix "convolves" over that neighborhood
3. Stacking multiple convolution layers spreads information further through the graph
$
H^{(l+1)}=\sigma\left(\hat{D}^{-1 / 2} \hat{A} \hat{D}^{-1 / 2} H^{(l)} W^{(l)}\right)
$
where,
$H^{(l)}$ node features from the previous layer
$W^{(l)}$ weight matrix that transforms the aggregated features into the new node state
$\hat{A}=A+I$ adjacency matrix with added self-connections, so each node also sends a message to itself
$\hat{D}$ diagonal degree matrix of $\hat{A}$, with $\hat{D}_{ii}$ equal to the number of neighbors of node $i$ (including itself)
PyTorch implementation of the forward function:
```python
def forward(self, node_feats, adj_matrix):
    """
    node_feats: [batch_size, num_nodes, c_in]
    adj_matrix: [batch_size, num_nodes, num_nodes]
        Adjacency matrix, assumed to include self-connections (A + I).
    """
    # Number of neighbors per node (self-connection included)
    num_neighbours = adj_matrix.sum(dim=-1, keepdim=True)
    node_feats = self.linear(node_feats)            # apply W^{(l)}
    node_feats = torch.bmm(adj_matrix, node_feats)  # batch matrix product: sum over neighbors
    node_feats = node_feats / num_neighbours        # average over the neighborhood
    return node_feats
```
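A minimal, self-contained sketch of how this forward pass might sit inside a layer module. The class name `GCNLayer`, the `nn.Linear` projection, and the toy graph are assumptions for illustration, not part of the original note:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    # Hypothetical wrapper around the forward pass above.
    def __init__(self, c_in, c_out):
        super().__init__()
        self.linear = nn.Linear(c_in, c_out)

    def forward(self, node_feats, adj_matrix):
        num_neighbours = adj_matrix.sum(dim=-1, keepdim=True)
        node_feats = self.linear(node_feats)
        node_feats = torch.bmm(adj_matrix, node_feats)
        node_feats = node_feats / num_neighbours
        return node_feats

# Toy graph: 4 nodes in a chain (edges 0-1, 1-2, 2-3), self-connections added (A + I)
adj = torch.tensor([[[1., 1., 0., 0.],
                     [1., 1., 1., 0.],
                     [0., 1., 1., 1.],
                     [0., 0., 1., 1.]]])
x = torch.arange(8, dtype=torch.float32).view(1, 4, 2)  # [batch, nodes, c_in]
layer = GCNLayer(c_in=2, c_out=2)
out = layer(x, adj)
print(out.shape)  # torch.Size([1, 4, 2])
```

Note that the division by `num_neighbours` assumes the adjacency matrix already contains self-connections; otherwise isolated nodes would divide by zero.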
![[gcn_network.png]]
[Image Credit](https://tkipf.github.io/graph-convolutional-networks/)
The number of GCN layers puts an upper limit on how far a signal can travel within the graph.
In traditional neural networks, linear layers apply a linear transformation to the incoming data. This transformation converts input features $x$ into hidden vectors $h$ through the use of a weight matrix $\mathbf{W}$. Ignoring biases for the time being, this can be expressed as:
$
h=\mathbf{W} x
$
With graph data, an additional layer of complexity is added through the connections between nodes. These connections matter because, typically, in networks, it's assumed that similar nodes are more likely to be linked to each other than dissimilar ones, a phenomenon known as network homophily.
We can enrich our node representation by merging its features with those of its neighbors. This operation is called convolution, or neighborhood aggregation. Let's denote the neighborhood of node $i$, including $i$ itself, as $\tilde{\mathcal{N}}_i$.
$
h_i=\sum_{j \in \tilde{\mathcal{N}}_i} \mathbf{W} x_j
$
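In matrix form, this per-node sum is exactly a left-multiplication by the self-connected adjacency matrix $\hat{A}=A+I$ from the GCN formula above. A small sketch (the graph and matrix shapes are illustrative assumptions):

```python
import torch

num_nodes, c_in, c_out = 4, 3, 2
X = torch.randn(num_nodes, c_in)      # node features x_j
W = torch.randn(c_out, c_in)          # shared weight matrix
A = torch.tensor([[0., 1., 0., 0.],   # chain graph: 0-1, 1-2, 2-3
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [0., 0., 1., 0.]])
A_hat = A + torch.eye(num_nodes)      # include each node in its own neighborhood

# h_i = sum_{j in N~(i)} W x_j, computed for all i at once:
H = A_hat @ (X @ W.T)                 # [num_nodes, c_out]
print(H.shape)  # torch.Size([4, 2])
```

Row $i$ of `A_hat` selects exactly the rows of `X @ W.T` belonging to $\tilde{\mathcal{N}}_i$, so node 0's output is $\mathbf{W}x_0 + \mathbf{W}x_1$.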
Unlike filters in Convolutional Neural Networks (CNNs), our weight matrix $\mathbf{W}$ is unique and shared among every node. But there is another issue: nodes do not have a fixed number of neighbors the way pixels do.
How do we address cases where one node has only one neighbor and another has 500? If we simply sum the feature vectors, the resulting embedding $h$ would be much larger for the node with 500 neighbors. To ensure a similar range of values for all nodes, and comparability between them, we can normalize the result by the degree of each node, where degree refers to the number of connections a node has.
$
h_i=\frac{1}{\operatorname{deg}(i)} \sum_{j \in \tilde{\mathcal{N}}_i} \mathbf{W} x_j
$
The graph convolutional layer has one final improvement. The authors observed that features from nodes with numerous neighbors propagate much more easily than those from more isolated nodes. To offset this effect, they suggested assigning bigger weights to features from nodes with fewer neighbors, thus balancing the influence across all nodes. This operation is written as:
$
h_i=\sum_{j \in \tilde{\mathcal{N}}_i} \frac{1}{\sqrt{\operatorname{deg}(i)} \sqrt{\operatorname{deg}(j)}} \mathbf{W} x_j
$
Note that when $i$ and $j$ have the same number of neighbors, this reduces to the simple degree-normalized layer above.
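A sketch of this symmetrically normalized propagation in plain PyTorch, using dense matrices for clarity (an illustration of the formula, not PyTorch Geometric's optimized implementation; the graph and helper name are assumptions):

```python
import torch

def gcn_propagate(X, A, W):
    """Symmetrically normalized GCN propagation:
    H = D^{-1/2} (A + I) D^{-1/2} X W^T, with D the degree matrix of A + I."""
    A_hat = A + torch.eye(A.size(0))           # add self-connections
    deg = A_hat.sum(dim=-1)                    # deg(i), self-connection included
    d_inv_sqrt = deg.pow(-0.5)
    # Scale row i by 1/sqrt(deg(i)) and column j by 1/sqrt(deg(j))
    norm_A = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return norm_A @ (X @ W.T)

# Star graph: node 0 connected to nodes 1 and 2
A = torch.tensor([[0., 1., 1.],
                  [1., 0., 0.],
                  [1., 0., 0.]])
X = torch.eye(3)   # one-hot features, so the output exposes the weights directly
W = torch.eye(3)   # identity weights for inspection
H = gcn_propagate(X, A, W)
print(H.shape)  # torch.Size([3, 3])
```

With identity features and weights, entry $H_{1,0}$ is $1/(\sqrt{\deg(1)}\sqrt{\deg(0)}) = 1/(\sqrt{2}\sqrt{3})$: the message from the well-connected hub node 0 into node 1 is down-weighted, as described above.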
---
## References
1. Tutorial 7: Graph Neural Networks, UvA Deep Learning Notebooks, 2020: https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial7/GNN_overview.html
2. Graph Attention Networks by Karim Khayat: https://www.youtube.com/watch?v=zMIs20GUK_w
3. Graph Convolutional Networks: Introduction to GNNs, Towards Data Science: https://towardsdatascience.com/graph-convolutional-networks-introduction-to-gnns-24b3f60d6c95