October 24, 2024
12:15 pm – 1:15 pm
Speaker:
Qiang Ye, Department of Mathematics, University of Kentucky
Where:
327 McVey Hall
(Zoom link: https://uky.zoom.us/j/82467171189)
Title:
Recurrent Neural Networks and Transformer for Sequential Data
Abstract:
Many machine learning problems involve sequential data. Recurrent neural networks (RNNs) and Transformers are neural network architectures designed to efficiently model temporal dependencies within a sequence and handle variable sequence lengths in a dataset. However, RNNs suffer from the so-called vanishing or exploding gradient problems, which also limit their ability to pass information across long sequences. Transformers avoid this problem through a self-attention mechanism but face challenges in efficiently scaling to long sequences, because the self-attention computation is quadratic in the sequence length. We will present several orthogonal RNN models that we have developed to address the vanishing/exploding gradient problems. We will also present a new efficient Transformer model, called the Compact Recurrent Transformer (CRT), which combines a shallow Transformer that processes short local segments with an orthogonal RNN that compresses and manages long-range global information. We will demonstrate the effectiveness of our models through two applications: language modeling and effective molecular representations for molecular inference tasks.
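To give a rough sense of the architecture described in the abstract, the following is a minimal illustrative sketch, not the speaker's implementation: a shallow Transformer attends within short local segments, while a recurrent cell with an orthogonal state-transition matrix carries a compressed global summary across segments. All module names, layer sizes, and the use of PyTorch are assumptions made for illustration only.

```python
# Illustrative sketch of a "local Transformer + orthogonal RNN memory" design.
# Hypothetical names and sizes; not the actual CRT code from the talk.
import torch
import torch.nn as nn


class OrthogonalRNNCell(nn.Module):
    """Recurrent cell whose hidden-to-hidden matrix is kept orthogonal, so
    repeated multiplication neither shrinks nor inflates gradient norms
    (the vanishing/exploding-gradient issue)."""

    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.in_proj = nn.Linear(input_dim, hidden_dim)
        self.rec = nn.Linear(hidden_dim, hidden_dim, bias=False)
        # Constrain the recurrent weight to remain orthogonal during training.
        nn.utils.parametrizations.orthogonal(self.rec, "weight")

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.in_proj(x) + self.rec(h))


class CompactRecurrentTransformerSketch(nn.Module):
    """Process a long sequence segment by segment: a shallow Transformer
    attends within each segment (quadratic only in the segment length),
    while the orthogonal RNN compresses each segment into a global state
    that is prepended to the next segment as a memory token."""

    def __init__(self, d_model: int = 64, seg_len: int = 32, n_layers: int = 2):
        super().__init__()
        self.seg_len = seg_len
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, dim_feedforward=128, batch_first=True
        )
        self.local_transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.memory = OrthogonalRNNCell(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); seq_len assumed divisible by seg_len.
        batch, seq_len, d_model = x.shape
        h = x.new_zeros(batch, d_model)          # compressed global state
        outputs = []
        for start in range(0, seq_len, self.seg_len):
            seg = x[:, start:start + self.seg_len, :]
            # Prepend the global state as an extra "memory" token.
            seg_with_mem = torch.cat([h.unsqueeze(1), seg], dim=1)
            enc = self.local_transformer(seg_with_mem)
            outputs.append(enc[:, 1:, :])        # drop the memory token
            # Update the global state from a summary of the encoded segment.
            h = self.memory(enc[:, 1:, :].mean(dim=1), h)
        return torch.cat(outputs, dim=1)


if __name__ == "__main__":
    model = CompactRecurrentTransformerSketch()
    dummy = torch.randn(2, 128, 64)              # batch of 2, length 128
    print(model(dummy).shape)                    # torch.Size([2, 128, 64])
```

In this sketch the self-attention cost grows with the segment length rather than the full sequence length, while the orthogonal recurrence is what allows the global state to be carried over many segments without the gradient decay or blow-up that plain RNNs exhibit.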