Reconstruction WG Meeting Minutes
October 28th, 2024
11:04 AM - 12:04 PM EDT
==================================


Introduction
Derek Anderson, Shujie Li
=========================
- DOE AI/ML survey: please fill it out at
    https://forms.gle/VH949Pmh1YbTABSf9


pTDR Priorities + UGM Action Items
Derek Anderson, Shujie Li
==================================
- Tracking pTDR priorities:
  - Resolve low momentum seed finder inefficiencies
  - Understand why CKF seeder doesn't use last tracker hits
    --> NEEDS A VOLUNTEER!
  - Add random pixel noise in SVT
    --> No full solution yet; will re-discuss at Thursday's
        meeting
    --> Plan is to generate random cell ID at a random time at a
        given position, but curved surfaces present an issue
  - Secondary vertexing

----------

Derek
  - Secondary vertexing is a pTDR priority? What's the current status?
Shujie
  - It is: using ACTS will require so much additional wiring that it will
    effectively be like coding up KFParticle
  - Will explore KFParticle later, and will instead use a snippet of
    track-to-track DCA calculation from Barak for the pTDR
  - Will have a dedicated discussion to work out clear plan-of-action
Wouter
  - Curved surfaces are an issue for pixel noise? What are the surfaces?
    Just a tube?
Shujie
  - Yes, just a tube
Wouter
  - That should be easy to throw uniformly
Shujie
  - But the issue is how to handle inactive material
  - Might not be a good idea to put it all in digitization
Wouter
  - But would it be good to put dead areas in simulation? It's the
    same material budget...
Shujie
  - It could be complicated
Wouter
  - Once you have a hit chip, you'll have a cellID which gives you the
    chip center and its orientation
  - You don't need to identify the dead area in every chip
Dima
  - Isn't that what we do in the TOF?
Wouter
  - Maybe, we should check
  - And is it a reconstruction priority? It is reco in the sense
    that it happens in EICrecon...
Shujie
  - It is, let's discuss more in tracking reco meeting
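
The "throw uniformly on a tube" idea above can be sketched as follows. This
is a minimal illustration only, not the EICrecon or SVT implementation: the
function names, the radius, half-length, pixel pitch, and time window are all
placeholder assumptions, and the sketch deliberately ignores the inactive
material / dead-area question raised in the discussion.

```python
import math
import random


def throw_noise_hits(n_hits, radius_mm, half_length_mm, time_window_ns, rng):
    """Throw noise hits uniformly over a cylinder surface and a time window.

    The area element on a cylinder is R * dphi * dz, so sampling phi and z
    uniformly gives a uniform density per unit area.
    """
    hits = []
    for _ in range(n_hits):
        phi = rng.uniform(-math.pi, math.pi)              # azimuth
        z = rng.uniform(-half_length_mm, half_length_mm)  # along the axis
        t = rng.uniform(0.0, time_window_ns)              # random hit time
        x = radius_mm * math.cos(phi)
        y = radius_mm * math.sin(phi)
        hits.append((x, y, z, t))
    return hits


def to_pixel_index(x, y, z, radius_mm, half_length_mm, pitch_mm):
    """Map a point on the tube to a hypothetical (i_phi, i_z) pixel index.

    This is an assumed regular grid on the unrolled tube, not a real
    cellID encoding.
    """
    n_phi = max(1, int(2.0 * math.pi * radius_mm / pitch_mm))
    n_z = max(1, int(2.0 * half_length_mm / pitch_mm))
    phi = math.atan2(y, x) + math.pi                      # in [0, 2*pi]
    i_phi = min(int(phi / (2.0 * math.pi) * n_phi), n_phi - 1)
    i_z = min(int((z + half_length_mm) / (2.0 * half_length_mm) * n_z), n_z - 1)
    return i_phi, i_z
```

For example, throw_noise_hits(100, 36.0, 135.0, 2000.0, random.Random(1))
returns 100 (x, y, z, t) tuples that can then be passed to to_pixel_index.
Passing the generator in explicitly keeps the noise reproducible for a
given seed.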

----------

- Calo pTDR priorities:
  - Improved truth associations
    --> Follow-up: confirm status == 1 is okay for pTDR
        --> NEED VOLUNTEER!!
    --> Follow-up: make sure there's a DIS sample w/ G4hits
        for beampipe splash studies
    --> To-Do: confirm if contributions are actually in sim campaign
        --> UPDATE [10.30.2024]: THEY ARE NOT
  - Cluster splitting/merging
    --> Follow-up: update to use new track-cluster association (not pTDR critical)
  - Clustering in all systems
    --> To-Do: send reminder to DSCs to check their sampling fractions! They can
        do it with either a benchmark or their analysis code
  - ML integration
  - Noise masking channel-by-channel
  - Make janadot more accessible

----------

Dima
  - Not just status == 1, but also need to check the threshold applied
    to all created particles in tracking region
  - Default is to apply < 1 TeV, but need to check default vs. npsim (it
    applies something much lower)
  - 1 TeV threshold does something funny: it might create extra particles
    at the tracking region boundary
Wouter
  - Could we check by throwing a status == 1 Kshort or J/psi and see if we
    associate back to the primary or the decays?
Derek
  - Yes to both
Wouter
  - Make sure you know the size differential of adding the sim hits + sim hit
    contributions to the output
    - We're maxing out our storage space with recent campaigns

----------

- PWG action items
  - Propagate x-sections, luminosities, etc.
  - Combine PID likelihood ratios
  - Integrate calos into PID hypotheses
  - Muon ID
  - Track-cluster associations
  - Redouble eID effort and add e/h separation via calos
  - Track-to-track DCA for decay topologies
  - Secondary vertexing
- UGM action items
  - There are LOTS, so check the slides
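
For the track-to-track DCA item, the geometric core can be illustrated with
the closest approach of two straight lines. This is only a sketch under a
strong assumption: real tracks in the solenoid field are helices, so a
straight line is at best a local approximation near the candidate decay
point. It is not Barak's snippet, and the function name is hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def line_line_dca(p1, d1, p2, d2):
    """Closest approach of the lines p1 + s*d1 and p2 + t*d2.

    Returns (dca, point_on_line_1, point_on_line_2).
    Directions must be non-zero; they do not need to be normalised.
    """
    w = tuple(x - y for x, y in zip(p1, p2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    if a == 0.0 or c == 0.0:
        raise ValueError("direction vectors must be non-zero")
    denom = a * c - b * b
    if denom <= 1e-12 * a * c:
        # (Nearly) parallel lines: fix s = 0 and project onto line 2.
        s, t = 0.0, e / c
    else:
        s = (b * e - c * d) / denom
        t = (a * e - b * d) / denom
    q1 = tuple(p + s * u for p, u in zip(p1, d1))
    q2 = tuple(p + t * u for p, u in zip(p2, d2))
    return math.dist(q1, q2), q1, q2
```

The midpoint of the two returned points is a common first guess for a
two-track vertex position.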


ML in EICrecon
Dmitry Kalinkin
===============
- We now have ONNX Runtime (ORT) available and plan on supporting TMVA::SOFIE
- But PODIO structures don't map onto ORT-compatible tensors directly
  AND we need to save tensors for training/retraining
  - So we need to factorize input to inference (i.e. transforming from
    PODIO to tensors) and output from inference (i.e. transforming from
    tensors to PODIO)
  - Elegantly done by introducing a new datatype, edm4eic::Tensor
- Check it out at
    https://github.com/eic/EDM4eic/pull/96
    https://github.com/eic/EICrecon/pull/1618
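
The factorization described above (objects to tensor, inference, tensor back
to objects) can be sketched as below. Everything here is a hypothetical
stand-in for illustration: the Cluster and Tensor classes do not reproduce
the PODIO types or the edm4eic::Tensor layout from the PRs, and
fake_inference is a placeholder for a real model call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    """Hypothetical stand-in for a PODIO object."""
    energy: float
    x: float
    y: float
    z: float


@dataclass
class Tensor:
    """Hypothetical stand-in: a shape plus flat row-major float data."""
    shape: List[int]
    data: List[float]


FEATURES = ("energy", "x", "y", "z")


def clusters_to_tensor(clusters):
    """Input side: one row per object, one column per feature."""
    data = [float(getattr(c, name)) for c in clusters for name in FEATURES]
    return Tensor([len(clusters), len(FEATURES)], data)


def fake_inference(tensor):
    """Placeholder model: scales the first feature of each row by 1.1."""
    n_rows, n_cols = tensor.shape
    out = [1.1 * tensor.data[i * n_cols] for i in range(n_rows)]
    return Tensor([n_rows, 1], out)


def tensor_to_energies(tensor):
    """Output side: unpack an (N, 1) tensor into one value per object."""
    n_rows, n_cols = tensor.shape
    if n_cols != 1 or len(tensor.data) != n_rows:
        raise ValueError("expected a tensor of shape (N, 1)")
    return list(tensor.data)
```

Because the intermediate tensors are plain data, they can be written out and
reused for training outside the reconstruction job. In this sketch row i of
the output corresponds to the i-th input object, so the row index is the only
link between input and output.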

Discussion
----------

Shujie
  - Thanks for setting this up!
  - My understanding is that we train the weights, and then apply the weights
    to the analysis we want to do later
    - So how does training work in EICrecon? What happens if you don't
      provide weights?
Dima
  - The convert-to-tensor factory grabs truth info and does conversion;
    then the inference factory tries to grab weights, fails, and gets disabled
    --> Note that weights will get uploaded to epic-data
Shujie
  - So is training exposed to the user?
Dima
  - Training is done by the user elsewhere (outside EICrecon)
Derek
  - Good stuff, make sure people comment on PR #96!
Dima
  - Yes, let's get this merged before the release so we can test it on the
    EICrecon PRs
Simon
  - So what about the associations between the reconstruction input and the
    reconstruction output?
Dima
  - It's not coded in yet, but it's trivial; I'll do that
Simon
  - Does the conversion depend on the ordering of the collection?
Dima
  - Yes, but that can be changed to have 1 tensor per object (then ordering
    doesn't matter)
  - But it's set up so that the user can play with batching


AOB
===

-- none --