28 November 2023 to 1 December 2023
Catholic University of America, Washington D.C.
US/Eastern timezone
Artificial Intelligence for the Electron Ion Collider

Scalable AI/ML Workflow Management Across Distributed Heterogeneous Resources With PanDA

30 Nov 2023, 14:40
25m

Speaker

Wen Guan (BNL)

Description

The Production and Distributed Analysis (PanDA) system, which originated in the ATLAS experiment at the LHC, has been steadily evolving on a technical foundation of proven scalability and extensibility. In recent years it has been extended to new experiments (Rubin Observatory, sPHENIX) and to new capabilities for managing large-scale, complex workflows across diverse, geographically distributed resources. AI/ML has been a particular focus, with support for processing-intensive workflows, such as hyperparameter optimization (HPO), that benefit from extensive automation and access to large-scale resources. This presentation will introduce PanDA, its complex workflow management capabilities, and their practical application in HPO and other AI/ML workflows. Finally, it will describe PanDA's role in a new R&D program providing a scalable, distributed workflow engine for AI-assisted EIC detector design.

Primary authors

Christian Weber (Brookhaven National Laboratory)
Dr Rui Zhang (University of Wisconsin-Madison)
Tadashi Maeno (BNL)
Wen Guan (BNL)
Torre Wenaus (BNL)
