Yury Polyanskiy
Professor of Electrical Engineering and Computer Science, MIT
Fri, Mar 8, 2024
5:00 PM UTC
In-person
4 Thomas More St
London E1W 1YW, UK
The Roux Institute
100 Fore Street
Portland, ME 04101
Network Science Institute
2nd floor
Network Science Institute
11th floor
177 Huntington Ave
Boston, MA 02115
58 St Katharine's Way
London E1W 1LP, UK
Talk recording
Almost all of the recent advances in AI (large language models, image/video diffusion models, recommendation systems, and visuomotor policies in robotics) are built on a neural architecture known as the transformer. We identify the propagation of representations through the layers of a transformer with a particular kind of interacting particle system, whose dynamics exhibit several interesting properties. In the initial (short) phase, the particles coalesce to form medium-sized clusters. In the second (long) phase, these clusters slowly spin around and occasionally merge until they form a single lump. We hypothesize that the (meta-stable) clustering of the first phase may explain transformers' ability to perform robust long-horizon logical deductions. The second phase corresponds to the synchronization behavior discovered in certain dynamics, such as the Kuramoto model, which turns out to be a special case of the transformer.
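The abstract does not state the equations of the particle system, so the following is only a minimal sketch under stated assumptions, not the speaker's model as presented. It assumes particles confined to the unit circle, each described by an angle, with a softmax-weighted attraction toward the other particles. The names `step`, `simulate`, and `order_parameter`, the inverse-temperature parameter `beta`, and the step size `dt` are all hypothetical. With `beta = 0` the update reduces to the Kuramoto model with identical natural frequencies and unit coupling, which illustrates the "special case" remark above.

```python
import math
import random


def step(thetas, beta, dt):
    """One explicit Euler step of the assumed attention-like dynamics on the circle.

    Each particle i moves along the circle by a softmax-weighted average of
    sin(theta_j - theta_i), with weights exp(beta * cos(theta_j - theta_i)).
    For beta = 0 the weights are uniform and this is a Kuramoto update.
    """
    new = []
    for ti in thetas:
        weights = [math.exp(beta * math.cos(tj - ti)) for tj in thetas]
        z = sum(weights)  # softmax normalization for particle i
        drift = sum(w * math.sin(tj - ti) for w, tj in zip(weights, thetas)) / z
        new.append(ti + dt * drift)
    return new


def order_parameter(thetas):
    """Kuramoto order parameter r in [0, 1]; r = 1 means a single cluster."""
    n = len(thetas)
    c = sum(math.cos(t) for t in thetas) / n
    s = sum(math.sin(t) for t in thetas) / n
    return math.hypot(c, s)


def simulate(thetas, beta=1.0, dt=0.1, steps=2000):
    """Iterate the Euler step and return the final list of angles."""
    for _ in range(steps):
        thetas = step(thetas, beta, dt)
    return thetas


if __name__ == "__main__":
    rng = random.Random(0)
    start = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(32)]
    end = simulate(start, beta=4.0)
    # Print the order parameter before and after the simulation.
    print(order_parameter(start), order_parameter(end))
```

Tracking `order_parameter` over the iterations is one simple way to watch for the two phases described above: a value that plateaus below 1 for a long time would be consistent with several clusters persisting, while a value near 1 indicates a single lump. Whether a plateau appears in this toy setting depends on `beta`, the number of particles, and the initial angles.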
About the speaker
Yury Polyanskiy is a Professor of Electrical Engineering and Computer Science, a member of IDSS and LIDS at MIT, and an IEEE Fellow. He received his M.S. degree in applied mathematics and physics from the Moscow Institute of Physics and Technology, Moscow, Russia, in 2005, and his Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ, in 2010. His research interests span information theory, statistical learning, error-correcting codes, wireless communication, and fault tolerance. Dr. Polyanskiy won the 2020 IEEE Information Theory Society James Massey Award, the 2013 NSF CAREER Award, and the 2011 IEEE Information Theory Society Paper Award.