EIC Software Tutorial: Uproot and Awkward Array

Description
Jim Pivarski, known for innovative software in DIANA/HEP, IRIS-HEP, and other projects, will give tutorials on how to process and analyze ROOT files with pure Python libraries: Uproot and Awkward Array.
 
The tutorials are aimed at anyone who would like to develop EIC science and detectors further using modern data science tools in Python. Output from the ESCalate framework will be used as an example. However, the tutorial is general and should also be useful for studies in other frameworks.
 
The tutorials will go through the full spectrum of what might be needed for the analysis, such as:
  • Getting data
    • Exploring a TFile and TTrees
    • Iterating over chunks of large datasets and over many files
    • Reading histograms and other objects
    • Writing objects and TTrees back to ROOT files
  • Manipulating data
    • Iteration in Python vs array-at-a-time operations
    • Filtering events and particles (cuts) with advanced selections
    • Flattening for plots and regularizing (rpad, clip) to NumPy for machine learning
    • Broadcasting flat arrays and jagged arrays
    • Combinatorics and reducing from combinations
    • Imperative, but still fast, programming in Numba
    • Grafting jagged data onto Pandas
    • NumExpr, Autograd, and other third-party libraries
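
To give a flavor of the "Iteration in Python vs array-at-a-time operations" and filtering topics above, here is a minimal sketch. It uses plain NumPy only, not the Uproot or Awkward Array API, and the transverse-momentum values, the 5.0 threshold, and all variable names are made up for illustration. The jagged data is represented as one flat content array plus an offsets array marking event boundaries; counting the particles that pass a cut is done once with Python loops and once with whole-array operations.

```python
import numpy as np

# Hypothetical toy data: particle pT values for 3 events, stored jagged-style
# as a flat content array plus offsets marking where each event starts/ends.
pt = np.array([12.0, 3.5, 40.2, 7.1, 25.0, 1.2])
offsets = np.array([0, 2, 2, 6])  # event 0: 2 particles, event 1: none, event 2: 4

# Iteration in Python: visit every event and every particle one at a time.
n_pass_loop = []
for i in range(len(offsets) - 1):
    count = 0
    for j in range(offsets[i], offsets[i + 1]):
        if pt[j] > 5.0:  # illustrative cut value
            count += 1
    n_pass_loop.append(count)

# Array-at-a-time: apply the cut to all particles at once, then count per
# event by differencing a running total at the event boundaries.
mask = pt > 5.0
running = np.concatenate(([0], np.cumsum(mask)))
n_pass_array = running[offsets[1:]] - running[offsets[:-1]]

print(n_pass_loop)            # per-event counts from the loop
print(n_pass_array.tolist())  # per-event counts from the array version
```

Both versions give the same per-event counts, including for the empty event. The array version avoids the Python-level loop, which is the style of computation the tutorial's jagged-array tools are built around.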
The materials for the tutorial are located here: https://github.com/jpivarski/2020-04-08-eic-jlab

A recording is available on YouTube: https://www.youtube.com/watch?v=FoxNS6nlbD0
