Publications

2024

A Diagrammatic Approach to Improve Computational Efficiency in Group Equivariant Neural Networks

Edward Pearce-Crump and William J. Knottenbelt

Group equivariant neural networks are growing in importance owing to their ability to generalise well in applications where the data has known underlying symmetries. Recent characterisations of a class of these networks that use high-order tensor power spaces as their layers suggest that they have significant potential; however, their implementation remains challenging owing to the prohibitively expensive nature of the computations that are involved. In this work, we present a fast matrix multiplication algorithm for any equivariant weight matrix that maps between tensor power layer spaces in these networks for four groups: the symmetric, orthogonal, special orthogonal, and symplectic groups. We obtain this algorithm by developing a diagrammatic framework based on category theory that enables us not only to express each weight matrix as a linear combination of diagrams but also to use these diagrams to factor the original computation into a series of optimal steps. We show that this algorithm improves the Big-O time complexity exponentially in comparison to a naïve matrix multiplication.
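As a rough illustration of the kind of speed-up involved (this is a sketch, not the paper's actual algorithm; the diagram, dimensions, and variable names are chosen only for this example), a single partition-diagram basis matrix acting on order-2 tensors can be applied by summing out one index and broadcasting, instead of building the dense $n^2 \times n^2$ weight matrix:

```python
import numpy as np

# Sketch only: one partition-diagram basis matrix for S_n acting on order-2
# tensors has entries W[(i,j),(k,l)] = delta_{ik}, i.e. W = I_n (x) J_n with
# J_n the all-ones matrix. Applying W naively costs O(n^4); the diagram
# suggests a factored computation costing O(n^2).

n = 6
rng = np.random.default_rng(0)
X = rng.standard_normal((n, n))

# Naive route: build the dense n^2 x n^2 weight matrix and multiply.
W = np.kron(np.eye(n), np.ones((n, n)))      # shape (n^2, n^2)
naive = W @ X.reshape(n * n)                 # O(n^4) work

# Factored route: sum out the paired index, then broadcast over the free one.
row_sums = X.sum(axis=1)                     # O(n^2) work
factored = np.repeat(row_sums, n)            # entry (i, j) equals sum_l X[i, l]

assert np.allclose(naive, factored)
```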

2023

Compact Matrix Quantum Group Equivariant Neural Networks

Edward Pearce-Crump

We derive the existence of a new type of neural network, called a compact matrix quantum group equivariant neural network, that learns from data that has an underlying quantum symmetry. We apply the Woronowicz formulation of Tannaka-Krein duality to characterise the weight matrices that appear in these neural networks for any easy compact matrix quantum group. We show that compact matrix quantum group equivariant neural networks contain, as a subclass, all compact matrix group equivariant neural networks. Moreover, we obtain characterisations of the weight matrices for many compact matrix group equivariant neural networks that have not previously appeared in the machine learning literature.

Graph Automorphism Group Equivariant Neural Networks

Edward Pearce-Crump and William J. Knottenbelt

Accepted at ICML 2024, Vienna, Austria.

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:40051-40077, 2024.

Permutation equivariant neural networks are typically used to learn from data that lives on a graph. However, for any graph $G$ that has $n$ vertices, using the symmetric group $S_n$ as its group of symmetries does not take into account the relations that exist between the vertices. Given that the actual group of symmetries is the automorphism group $\mathrm{Aut}(G)$, we show how to construct neural networks that are equivariant to $\mathrm{Aut}(G)$ by obtaining a full characterisation of the learnable, linear, $\mathrm{Aut}(G)$-equivariant functions between layers that are some tensor power of $\mathbb{R}^{n}$. In particular, we find a spanning set of matrices for these layer functions in the standard basis of $\mathbb{R}^{n}$. This result has important consequences for learning from data whose group of symmetries is a finite group because a theorem by Frucht (1938) showed that any finite group is isomorphic to the automorphism group of a graph.
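The flavour of the characterisation can be seen in the first-order case (layers $\mathbb{R}^{n} \to \mathbb{R}^{n}$): indicator matrices of the orbits of $\mathrm{Aut}(G)$ on vertex pairs give a spanning set of equivariant matrices. The sketch below checks this by brute force for a small path graph; the graph, helper names, and search are illustrative choices, not the paper's construction for general tensor power layers:

```python
import itertools
import numpy as np

# Illustrative first-order case only (maps R^n -> R^n); the paper treats
# arbitrary tensor power layers.

edges = {(0, 1), (1, 2)}          # path graph on 3 vertices; Aut(G) = {id, 0<->2}
n = 3

def is_automorphism(p):
    mapped = {tuple(sorted((p[u], p[v]))) for (u, v) in edges}
    return mapped == {tuple(sorted(e)) for e in edges}

autos = [p for p in itertools.permutations(range(n)) if is_automorphism(p)]

# Orbits of Aut(G) on index pairs (i, j) give a spanning set of equivariant matrices.
orbits = {}
for i in range(n):
    for j in range(n):
        rep = min((p[i], p[j]) for p in autos)   # canonical orbit representative
        orbits.setdefault(rep, []).append((i, j))

basis = []
for pairs in orbits.values():
    M = np.zeros((n, n))
    for (i, j) in pairs:
        M[i, j] = 1.0
    basis.append(M)

# Equivariance check: each orbit matrix commutes with every automorphism's
# permutation matrix (the check is symmetric over the whole group).
for M in basis:
    for p in autos:
        P = np.eye(n)[list(p)]                   # P[i, j] = 1 iff j == p[i]
        assert np.allclose(M @ P, P @ M)
```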

An Algorithm for Computing with Brauer's Group Equivariant Neural Network Layers

Edward Pearce-Crump

The learnable, linear neural network layers between tensor power spaces of $\mathbb{R}^{n}$ that are equivariant to the orthogonal group, $O(n)$, the special orthogonal group, $SO(n)$, and the symplectic group, $Sp(n)$, were characterised in arXiv:2212.08630. We present an algorithm for multiplying a vector by any weight matrix for each of these groups, using category theoretic constructions to implement the procedure. We achieve a significant reduction in computational cost compared with a naïve implementation by making use of Kronecker product matrices to perform the multiplication. We show that our approach extends to the symmetric group, $S_n$, recovering the algorithm of arXiv:2303.06208 in the process.
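The underlying Kronecker trick is standard and can be stated independently of the diagrams: a Kronecker-structured factor can be applied to a vector by two ordinary matrix products, without ever materialising the large matrix. A minimal sketch (not the paper's full procedure):

```python
import numpy as np

# Standard identity used to apply a Kronecker-structured factor without forming
# the full n^2 x n^2 matrix:
#   (A (x) B) vec(X) = vec(A X B^T)   with row-major vec,
# reducing an O(n^4) matrix-vector product to two O(n^3) matrix products.

n = 8
rng = np.random.default_rng(1)
A, B, X = (rng.standard_normal((n, n)) for _ in range(3))

naive = np.kron(A, B) @ X.reshape(n * n)       # materialises an n^2 x n^2 matrix
fast = (A @ X @ B.T).reshape(n * n)            # never forms the Kronecker product

assert np.allclose(naive, fast)
```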

Categorification of Group Equivariant Neural Networks

Edward Pearce-Crump

We present a novel application of category theory for deep learning. We show how category theory can be used to understand and work with the linear layer functions of group equivariant neural networks whose layers are some tensor power space of $\mathbb{R}^{n}$ for the groups $S_n$, $O(n)$, $Sp(n)$, and $SO(n)$. By using category theoretic constructions, we build a richer structure that is not seen in the original formulation of these neural networks, leading to new insights. In particular, we outline the development of an algorithm for quickly computing the result of passing a vector through an equivariant, linear layer for each group in question. The success of our approach suggests that category theory could be beneficial for other areas of deep learning.

How Jellyfish Characterise Alternating Group Equivariant Neural Networks

Edward Pearce-Crump

Accepted at ICML 2023, Hawaii, USA.

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:27483-27495, 2023.

arXiv:2301.10152

We provide a full characterisation of all of the possible alternating group ($A_n$) equivariant neural networks whose layers are some tensor power of $\mathbb{R}^{n}$. In particular, we find a basis of matrices for the learnable, linear, $A_n$-equivariant layer functions between such tensor power spaces in the standard basis of $\mathbb{R}^{n}$.

2022

Brauer's Group Equivariant Neural Networks

Edward Pearce-Crump

Accepted for a live presentation at ICML 2023, Hawaii, USA.

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:27461-27482, 2023.

arXiv:2212.08630

We provide a full characterisation of all of the possible group equivariant neural networks whose layers are some tensor power of $\mathbb{R}^{n}$ for three symmetry groups that are missing from the machine learning literature: $O(n)$, the orthogonal group; $SO(n)$, the special orthogonal group; and $Sp(n)$, the symplectic group. In particular, we find a spanning set of matrices for the learnable, linear, equivariant layer functions between such tensor power spaces in the standard basis of $\mathbb{R}^{n}$ when the group is $O(n)$ or $SO(n)$, and in the symplectic basis of $\mathbb{R}^{n}$ when the group is $Sp(n)$.
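For orientation, the smallest nontrivial instance of this characterisation (stated here in my own notation, not quoted from the paper) is the space of $O(n)$-equivariant maps between order-2 tensor power spaces, which is spanned by the matrices of the three Brauer diagrams on $2+2$ vertices: two through strands, two crossed through strands, and a cap paired with a cup:

$$
\mathrm{End}_{O(n)}\big((\mathbb{R}^{n})^{\otimes 2}\big)
= \mathrm{span}\{\, I, P, E \,\},
\qquad
I_{(ij),(kl)} = \delta_{ik}\delta_{jl}, \quad
P_{(ij),(kl)} = \delta_{il}\delta_{jk}, \quad
E_{(ij),(kl)} = \delta_{ij}\delta_{kl}.
$$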

Connecting Permutation Equivariant Neural Networks and Partition Diagrams

Edward Pearce-Crump

Accepted for a live presentation at ECAI 2024, Santiago de Compostela, Spain.

Proceedings of the 27th European Conference on Artificial Intelligence, IOS Press Ebooks, 392:1511-1518, 2024.

arXiv:2212.08648

Permutation equivariant neural networks are often constructed using tensor powers of $\mathbb{R}^{n}$ as their layer spaces. We show that all of the weight matrices that appear in these neural networks can be obtained from Schur-Weyl duality between the symmetric group and the partition algebra. In particular, we adapt Schur-Weyl duality to derive a simple, diagrammatic method for calculating the weight matrices themselves.
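The simplest case makes the correspondence concrete (a standard example, not quoted from the paper): linear $S_n$-equivariant maps $\mathbb{R}^{n} \to \mathbb{R}^{n}$ come from the two set partitions of $\{1, 1'\}$, with the block $\{1, 1'\}$ giving the identity matrix and the two singleton blocks giving the all-ones matrix, so

$$
\mathrm{Hom}_{S_n}(\mathbb{R}^{n}, \mathbb{R}^{n}) = \mathrm{span}\{\, I_n,\; \mathbf{1}\mathbf{1}^{\top} \,\},
$$

recovering the familiar two-parameter permutation equivariant layer $w_1 I_n + w_2 \mathbf{1}\mathbf{1}^{\top}$.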

A Multigraph Approach for Performing the Quantum Schur Transform

Edward Pearce-Crump

Accepted for a poster presentation at QCTiP 2023, Cambridge, UK.

arXiv:2204.10694

We take inspiration from the Okounkov-Vershik approach to the representation theory of the symmetric groups to develop a new way of understanding how the Schur-Weyl duality can be used to perform the Quantum Schur Transform. The Quantum Schur Transform is a unitary change of basis transformation between the computational basis of $(\mathbb{C}^d)^{\otimes n}$ and the Schur-Weyl basis of $(\mathbb{C}^d)^{\otimes n}$. We describe a new multigraph, which we call the Schur-Weyl-Young graph, that represents both standard Weyl tableaux and standard Young tableaux in the same diagram. We suggest a major improvement on Louck's formula for calculating the transition amplitudes between two standard Weyl tableaux appearing in adjacent levels of the Schur-Weyl-Young graph for the case $d=2$, merely by looking at the entries in the two tableaux. The key theoretical component that underpins our results is the discovery of a branching rule for the Schur-Weyl states, which we call the Schur-Weyl branching rule. This branching rule allows us to perform the change of basis transformation described above in a straightforward manner for any $n$ and $d$.
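The smallest case gives a concrete picture of the transform (a minimal sketch for $n = 2$, $d = 2$ only; the variable names are mine, and this is not the general construction via the Schur-Weyl-Young graph): the Schur-Weyl basis of $(\mathbb{C}^2)^{\otimes 2}$ is the triplet/singlet basis, and the Schur transform is the unitary mapping the computational basis onto it.

```python
import numpy as np

# Smallest-case sketch (two qubits, d = 2): the Schur-Weyl basis of (C^2)^{(x)2}
# is the symmetric (triplet) subspace plus the antisymmetric (singlet) state.

s = 1 / np.sqrt(2)
U = np.array([
    [1, 0, 0, 0],      # |00>
    [0, s, s, 0],      # (|01> + |10>)/sqrt(2)
    [0, 0, 0, 1],      # |11>
    [0, s, -s, 0],     # (|01> - |10>)/sqrt(2)
])

# U is unitary, and it block-diagonalises the action of S_2 (the SWAP of the two
# tensor factors) into a 3x3 identity block and a single -1 block.
SWAP = np.eye(4)[[0, 2, 1, 3]]
assert np.allclose(U @ U.T, np.eye(4))
assert np.allclose(U @ SWAP @ U.T, np.diag([1, 1, 1, -1]))
```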