Publications
Conference Papers

Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings
Advances in Neural Information Processing Systems (NeurIPS), 2025 (Spotlight)
Many high-dimensional optimisation problems exhibit rich geometric structures in their set of minimisers, often forming smooth manifolds due to over-parametrisation or symmetries. When this structure is known, at least locally, it can be exploited through reduction mappings that reparametrise part of the parameter space to lie on the solution manifold. These reductions naturally arise from inner optimisation problems and effectively remove redundant directions, yielding a lower-dimensional objective. In this work, we introduce a general framework to understand how such reductions influence the optimisation landscape. We show that well-designed reduction mappings improve curvature properties of the objective, leading to better-conditioned problems and theoretically faster convergence for gradient-based methods. Our analysis unifies a range of scenarios where structural information at optimality is leveraged to accelerate convergence, offering a principled explanation for the empirical gains observed in such optimisation algorithms.
@inproceedings{markou2025reductionmappings,
author = {Markou, Evan and Ajanthan, Thalaiyasingam and Gould, Stephen},
title = {Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025}}
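A minimal sketch of the reduction idea on a toy quadratic (my illustration under assumed notation, not code from the paper): for f(x, y) = ||Ax + By - c||^2, minimising over y in closed form yields a reduced objective whose Hessian is the Schur complement of the full Hessian, and for a positive-definite full Hessian the Schur complement's condition number never exceeds the original one, matching the improved-conditioning claim above.

import numpy as np

rng = np.random.default_rng(0)
m, nx, ny = 60, 8, 12
A = rng.standard_normal((m, nx))
B = rng.standard_normal((m, ny))

# Full Hessian of f(x, y) = ||Ax + By - c||^2 over (x, y): 2 J^T J with J = [A B]
# (the factor of 2 does not affect the condition number).
J = np.hstack([A, B])
H = J.T @ J

# Inner minimisation over y gives the reduced objective g(x) = min_y f(x, y);
# its Hessian is the Schur complement of the y-block of H.
S = A.T @ A - A.T @ B @ np.linalg.solve(B.T @ B, B.T @ A)

print("cond(full Hessian)   :", np.linalg.cond(H))
print("cond(reduced Hessian):", np.linalg.cond(S))  # never larger than cond(H)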
Towards Scalable Backpropagation-Free Gradient Estimation
Australasian Joint Conference on Artificial Intelligence (AJCAI), 2025 (Best Paper Award)
While backpropagation—reverse-mode automatic differentiation—has been extraordinarily successful in deep learning, it requires two passes (forward and backward) through the neural network and the storage of intermediate activations. Existing gradient estimation methods that instead use forward-mode automatic differentiation struggle to scale beyond small networks due to the high variance of the estimates. Efforts to mitigate this have so far introduced significant bias to the estimates, reducing their utility. We introduce a gradient estimation approach that reduces both bias and variance by manipulating upstream Jacobian matrices when computing guess directions. It shows promising results and has the potential to scale to larger networks, indeed performing better as the network width is increased. Our understanding of this method is facilitated by analyses of bias and variance, and their connection to the low-dimensional structure of neural network gradients.
@inproceedings{wang2025backpropfree,
author = {Wang, Daniel and Markou, Evan and Campbell, Dylan},
title = {Towards Scalable Backpropagation-Free Gradient Estimation},
booktitle = {Australasian Joint Conference on Artificial Intelligence},
year = {2025}}
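For context, a minimal sketch of the baseline forward-gradient estimator this line of work starts from (my illustration, not the paper's improved estimator): sample a guess direction v, obtain the directional derivative in a single forward pass, and use (grad f . v) v as the gradient estimate. It is unbiased but its variance grows with dimension, which is the scaling bottleneck discussed above. Here f(w) = 0.5||w||^2, so the directional derivative w . v is available in closed form; in practice it would come from one forward-mode automatic-differentiation pass, with no backward pass or stored activations.

import numpy as np

rng = np.random.default_rng(0)
dim, n_samples = 1000, 5000
w = rng.standard_normal(dim)                # true gradient of 0.5||w||^2 is w

V = rng.standard_normal((n_samples, dim))   # random guess directions
est = (V @ w)[:, None] * V                  # forward gradient: (grad f . v) v

print("relative bias of averaged estimate:",
      np.linalg.norm(est.mean(axis=0) - w) / np.linalg.norm(w))
print("mean per-coordinate variance:", est.var(axis=0).mean())  # scales with dim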
Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame
Advances in Neural Information Processing Systems (NeurIPS), 2024
Neural Collapse (NC) is a recently observed phenomenon in neural networks that characterises the solution space of the final classifier layer when trained until zero training loss. Specifically, NC suggests that the final classifier layer converges to a Simplex Equiangular Tight Frame (ETF), which maximally separates the weights corresponding to each class. By duality, the penultimate layer feature means also converge to the same simplex ETF. Since this simple symmetric structure is optimal, our idea is to utilise this property to improve convergence speed. Specifically, we introduce the notion of nearest simplex ETF geometry for the penultimate layer features at any given training iteration, by formulating it as a Riemannian optimisation problem. Then, at each iteration, the classifier weights are implicitly set to the nearest simplex ETF by solving this inner optimisation problem, which is encapsulated within a declarative node to allow backpropagation. Our experiments on synthetic and real-world architectures for classification tasks demonstrate that our approach accelerates convergence and enhances training stability.
@inproceedings{markou2024guidingnc,
author = {Markou, Evan and Ajanthan, Thalaiyasingam and Gould, Stephen},
booktitle = {Advances in Neural Information Processing Systems},
editor = {A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang},
pages = {35544--35573},
publisher = {Curran Associates, Inc.},
title = {Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame},
url = {https://proceedings.neurips.cc/paper_files/paper/2024/file/3eb660055cdcdc9a545a0b16c1eff80d-Paper-Conference.pdf},
volume = {37},
year = {2024}}
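The target geometry is easy to write down. A minimal sketch (my construction, not the paper's code) of a canonical K-class simplex ETF in d dimensions: unit-norm columns with pairwise cosine similarity exactly -1/(K-1), the maximal separation referred to above. The paper goes further by optimising over the rotation on a Riemannian manifold to find the nearest such frame to the current features; here the rotation U is just an arbitrary orthonormal basis.

import numpy as np

K, d = 4, 8
rng = np.random.default_rng(0)

# Scaled centring matrix: its columns have unit norm and pairwise dot -1/(K-1).
E = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)

# Orthonormal d x K basis, a fixed stand-in for the optimised rotation.
U, _ = np.linalg.qr(rng.standard_normal((d, K)))

M = U @ E                               # simplex ETF classifier weights (d x K)
G = M.T @ M                             # Gram matrix
print("column norms:    ", np.diag(G))  # all 1
print("pairwise cosines:", G[0, 1:])    # all -1/(K-1) = -1/3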
ArtGAN: Artwork Restoration using Generative Adversarial Networks
13th International Conference on Advanced Computational Intelligence (ICACI), 2021
We propose a method to recover and restore artwork that has been damaged over time by a variety of factors. Our method removes the damage entirely in most images and accurately estimates the damaged region. We attribute these results to (i) a custom data augmentation technique that produces realistic damage rather than simple blobs, (ii) novel CResNetBlocks that successively upsample and downsample features to restore the image with efficient backpropagation, and (iii) the choice of patch discriminators to achieve sharpness and colorfulness. Our network architecture is a conditional Generative Adversarial Network, where the generator is optimized with a combination of adversarial and L1 losses and the discriminator with a binary cross-entropy loss. As existing methods offer limited grounds for comparison, we report results on several metrics for future comparison and showcase visuals of recovered artwork.
@inproceedings{adhikary2021artgan,
author = {Adhikary, Abhijit and Bhandari, Namas and Markou, Evan and Sachan, Siddharth},
title = {ArtGAN: Artwork Restoration using Generative Adversarial Networks},
booktitle = {2021 13th International Conference on Advanced Computational Intelligence (ICACI)},
pages = {199--206},
doi = {10.1109/ICACI52617.2021.9435888},
year = {2021}}
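A minimal sketch of the loss structure described in the abstract, with assumed shapes and an assumed L1 weight (the paper's CResNetBlocks, conditioning input, and full architecture are not reproduced): a patch discriminator emits a grid of logits, one per image patch, trained with binary cross-entropy, while the generator combines an adversarial term with an L1 reconstruction term.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy patch discriminator: maps a 3x64x64 image to a 16x16 grid of logits,
# each scoring the realism of one local patch.
patch_d = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(32, 1, 4, stride=2, padding=1),
)

real = torch.rand(2, 3, 64, 64)  # undamaged artwork (dummy data)
fake = torch.rand(2, 3, 64, 64)  # restored output (stand-in for the generator)
ones, zeros = torch.ones(2, 1, 16, 16), torch.zeros(2, 1, 16, 16)
lam = 100.0                      # L1 weight: an assumed value, not from the paper

# Discriminator: binary cross-entropy on every patch logit.
d_loss = (F.binary_cross_entropy_with_logits(patch_d(real), ones)
          + F.binary_cross_entropy_with_logits(patch_d(fake.detach()), zeros))

# Generator: fool the patch discriminator while staying close to the target.
g_loss = (F.binary_cross_entropy_with_logits(patch_d(fake), ones)
          + lam * F.l1_loss(fake, real))
print(d_loss.item(), g_loss.item())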