BNL Physics Colloquia

Why is AI hard and Physics simple? (Dan Roberts, MIT & Salesforce)

US/Eastern
Online only.


Description
Join ZoomGov Meeting
https://bnl.zoomgov.com/j/1605020278?pwd=cHJ1bDRuK1FDNnZLSnpxVkZhcDQ3QT09

Meeting ID: 160 502 0278
Passcode: E=mc2


Abstract:

Deep learning is an exciting approach to modern artificial intelligence based on artificial neural networks. The goal of this talk is to put forth a set of principles that enable us to theoretically analyze deep neural networks of actual relevance.  In doing so, we will explain why such a goal is even attainable in theory and how we are able to get there in practice. 

To begin, we will discuss how physical intuition and the approach of theoretical physics can be brought to bear on this problem, borrowing from the "effective theory" framework of physics. For context, we will recount how similar ideas were used to connect the thermodynamic effective description of artificial machines from the industrial age to the first-principles theory of microscopic components provided by statistical mechanics. In order to make progress on deep learning, we will need to understand the statistics of initialized deep networks and determine the dynamics of such an ensemble when learning from data. To make this tractable, we will have to take the structure of neural networks into account. Developing a perturbative 1/n expansion around the limit of infinite hidden-layer width, we will find a principle of sparsity that will let us describe effectively-deep networks at practical, large-but-finite widths. We will thus see that useful neural networks should be sparse -- hence the preference for larger and larger models -- but not too sparse -- so that they are also deep.
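
For orientation, here is a schematic sketch of the kind of expansion meant here, written in generic notation that is not necessarily the conventions used in the talk or the book: at infinite width the distribution of a single preactivation z at initialization is Gaussian, and finite-width effects enter as non-Gaussian couplings suppressed by powers of 1/n,

p(z) \;\propto\; \exp\!\left[ -\frac{z^{2}}{2K} \;-\; \frac{g_{4}}{n}\, z^{4} \;+\; O\!\left(\frac{1}{n^{2}}\right) \right],

where K is the kernel of the infinite-width Gaussian-process description, n is the hidden-layer width, and the leading quartic coupling g_4 accumulates with the depth L, so that the effective expansion parameter is the aspect ratio L/n.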

This talk is based on the book "The Principles of Deep Learning Theory," co-authored with Sho Yaida and based on research carried out also in collaboration with Boris Hanin. It will be published next year by Cambridge University Press.


Organised by

George Redlinger