Abstract: This workshop tutorial will first discuss the motivation and structure of the transformer, arguably the most popular machine learning architecture in current literature and industry practice. Namely, it will explain the attention mechanism, which is the basis of the transformer, and how it satisfies several desirable model properties (e.g., permutation equivariance, independence from input size, and global all-to-all attention). Next, it will address the main categories of the transformer (encoder-decoder, encoder-only, and decoder-only) and how they serve different applications. Finally, it will demonstrate a toy example of molecular generation using a transformer on text representations of molecules (e.g., SMILES/SELFIES). This tutorial is intended for beginners and chemists who wish to apply transformers to molecular data.
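To give a flavor of the attention mechanism mentioned above, the following is a minimal sketch (not taken from the tutorial materials) of single-head scaled dot-product self-attention, assuming PyTorch; the function name self_attention and the tensor shapes are illustrative choices. It also checks the permutation-equivariance property: permuting the input tokens permutes the outputs in the same way, and the same code runs for any number of tokens.

import torch

def self_attention(x, w_q, w_k, w_v):
    # x: (n_tokens, d_model); projection weights: (d_model, d_model)
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / (k.shape[-1] ** 0.5)   # all-to-all token interactions
    return torch.softmax(scores, dim=-1) @ v   # (n_tokens, d_model)

d = 8
torch.manual_seed(0)
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
x = torch.randn(5, d)                          # 5 tokens; any length works
perm = torch.randperm(5)
out = self_attention(x, w_q, w_k, w_v)
out_perm = self_attention(x[perm], w_q, w_k, w_v)
# Permuting the inputs permutes the outputs identically (permutation equivariance).
print(torch.allclose(out[perm], out_perm, atol=1e-5))  # True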
Speaker Biography: Tim Hsu is a staff data scientist at Lawrence Livermore National Laboratory (LLNL) working on generative models for molecules and materials. He joined LLNL in 2020 as a postdoctoral researcher and studied graph neural networks for atomic structures. Prior to that, he worked on synthesizing 3D microstructure microscopy images with a convolutional neural network generative model. Tim received a PhD in Materials Science and Engineering from Carnegie Mellon University in 2019.