Exploring optimization algorithms for recurrent neural networks (available)

Starting Date: June 2024
Prerequisites: Good knowledge of Python programming is essential. Familiarity with deep learning and a deep learning framework (such as PyTorch/JAX/TensorFlow) is important.
Will results be assigned to University: No

Recurrent neural networks (RNNs) are a key class of architectures in modern deep learning, particularly for processing sequential data such as text, speech, video, and time series. Unlike feedforward networks, RNNs have loops that allow information to persist and be passed from one step to the next. This enables them to model patterns and dependencies in sequences effectively, making RNNs crucial for tasks like language modeling, machine translation, speech recognition, and time series forecasting. While the transformer architecture dominates most large-scale sequence tasks, RNNs are catching up in performance [1] in natural language processing, audio processing, and other sequence learning applications, while offering a more efficient alternative to transformers.
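
To make the recurrence concrete, here is a minimal sketch of a vanilla RNN in PyTorch. The class name, dimensions, and update rule are illustrative only; modern RNN variants such as [1] use more elaborate state updates, but the idea of a persistent hidden state is the same.

```python
import torch
import torch.nn as nn

class VanillaRNN(nn.Module):
    """A minimal RNN: the hidden state h carries information across steps."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.w_in = nn.Linear(input_size, hidden_size)
        self.w_rec = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):
        # x: (batch, time, input_size)
        batch, time, _ = x.shape
        h = x.new_zeros(batch, self.w_rec.out_features)
        outputs = []
        for t in range(time):
            # The recurrence: h at step t depends on h at step t-1,
            # which is what lets the network model sequence dependencies.
            h = torch.tanh(self.w_in(x[:, t]) + self.w_rec(h))
            outputs.append(h)
        return torch.stack(outputs, dim=1)  # (batch, time, hidden_size)
```

Training such a network with standard gradient descent requires unrolling this loop and backpropagating through every step, which is exactly what backpropagation through time (BPTT) does.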

In this project, the goal will be to study optimization algorithms for recurrent neural networks beyond the ubiquitous backpropagation through time (BPTT). Specifically, various optimization algorithms, such as evolutionary algorithms [2], second-order gradient methods [3], and biologically plausible training methods [4], will be applied to modern RNNs.
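
To give a sense of what training without BPTT can look like, below is a minimal sketch of a basic evolution-strategies update in the spirit of [2]: the loss is treated as a black box, and a search gradient is estimated from random perturbations of the parameters, so no backpropagation through the unrolled sequence is needed. The function name, hyperparameters, and structure here are illustrative assumptions, not part of the project specification.

```python
import torch

def es_step(params, loss_fn, sigma=0.1, lr=0.01, population=50):
    """One evolution-strategies update on a list of parameter tensors.

    `loss_fn` takes no arguments and evaluates the current parameters on
    a batch; it is treated as a black box, so no gradients (and hence no
    BPTT) are involved. Usage (hypothetical model and data):
        es_step(list(model.parameters()), lambda: loss(model(x), y))
    """
    flat = torch.nn.utils.parameters_to_vector(params)
    noise = torch.randn(population, flat.numel(), device=flat.device)
    losses = torch.empty(population, device=flat.device)
    with torch.no_grad():
        for i in range(population):
            # Evaluate the loss at a perturbed copy of the parameters.
            torch.nn.utils.vector_to_parameters(flat + sigma * noise[i], params)
            losses[i] = loss_fn()
    # Standardize the losses and step against the score-weighted noise
    # (lower loss is better, hence the minus sign in the final update).
    scores = (losses - losses.mean()) / (losses.std() + 1e-8)
    grad_estimate = (scores.unsqueeze(1) * noise).mean(dim=0) / sigma
    torch.nn.utils.vector_to_parameters(flat - lr * grad_estimate, params)
```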

Students are welcome to email anand.subramoney@rhul.ac.uk for informal discussions.

Reading:

  1. Gu, A., Dao, T., 2023. Mamba: Linear-Time Sequence Modeling with Selective State Spaces. https://doi.org/10.48550/arXiv.2312.00752
  2. Beyer, H.-G., Schwefel, H.-P., 2002. Evolution strategies – A comprehensive introduction. Natural Computing 1, 3–52. https://doi.org/10.1023/A:1015059928466
  3. Anil, R., Gupta, V., Koren, T., Regan, K., Singer, Y., 2021. Scalable Second Order Optimization for Deep Learning. https://doi.org/10.48550/arXiv.2002.09018
  4. Bellec, G., Scherr, F., Subramoney, A., Hajek, E., Salaj, D., Legenstein, R., Maass, W., 2020. A solution to the learning dilemma for recurrent networks of spiking neurons. Nature Communications 11, 3625. https://doi.org/10.1038/s41467-020-17236-y