MATH Seminar
Title: Structure-conforming Operator Learning via Transformers
Seminar: Numerical Analysis and Scientific Computing
Speaker: Shuhao Cao of University of Missouri-Kansas City
Contact: Yuanzhe Xi, yuanzhe.xi@emory.edu
Date: 2024-03-21 at 10:00 AM
Venue: MSC W201
Abstract: GPT, Stable Diffusion, AlphaFold 2, and other state-of-the-art deep learning models all use a neural architecture called the "Transformer". Since the publication of the "Attention Is All You Need" paper by Google, the Transformer has become the ubiquitous architecture in deep learning. At the Transformer's heart and soul is the "attention mechanism". In this talk, we shall dissect the attention mechanism through the lens of traditional numerical methods, such as Galerkin methods and hierarchical matrix decomposition. We will report numerical results on designing attention-based neural networks that conform to the structure of a problem from traditional scientific computing, such as the inverse problem for the Neumann-to-Dirichlet operator (electrical impedance tomography, EIT) or multiscale elliptic problems. Progress within different communities will be surveyed to address some open problems on the mathematical properties of the attention mechanism in Transformers.
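Since the abstract centers on viewing attention through a Galerkin lens, here is a minimal sketch (not from the talk itself) contrasting standard softmax attention with a softmax-free, Galerkin-type regrouping of the same matrix products, in the spirit of the speaker's earlier Galerkin Transformer work. The layer-normalization placement and the 1/n scaling below are illustrative assumptions, not the talk's exact formulation.

```python
# Minimal sketch: softmax attention vs. a softmax-free "Galerkin-type"
# variant in which the product is regrouped as Q (K^T V) / n, giving
# linear (rather than quadratic) cost in the sequence length n.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # shift for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def vanilla_attention(Q, K, V):
    """Softmax attention: each row of softmax(Q K^T / sqrt(d)) acts as a
    set of learned weights over the n rows of V.  Cost: O(n^2 d)."""
    d = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def galerkin_attention(Q, K, V):
    """Softmax-free variant: regrouping as Q (K^T V) / n resembles a
    Galerkin-style projection in the feature (column) space.
    Cost: O(n d^2)."""
    n = Q.shape[0]
    return Q @ (layer_norm(K).T @ layer_norm(V)) / n

rng = np.random.default_rng(0)
n, d = 256, 64                      # sequence length, channel width
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(vanilla_attention(Q, K, V).shape)   # (256, 64)
print(galerkin_attention(Q, K, V).shape)  # (256, 64)
```

Both maps send n tokens with d channels to an n-by-d output; the regrouped form avoids ever materializing the n-by-n attention matrix, which is what makes attention-based operator learning tractable on fine discretizations.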