SDCC Liaison Meeting

US/Eastern
3-192 (Bldg 510)

Kevin Casella (SDCC), Saroj Kandasamy (BNL), Tony Wong (Brookhaven National Lab (Physics Department))
Description

Join via BlueJeans (https://bluejeans.com/819381923). Passcode not required. You can also join via phone (Meeting ID: 819 381 923), by calling one of the numbers below:

+1.408.740.7256 (United States)
+1.888.240.2560 (US Toll Free)
+1.408.317.9253 (Alternate number)

Thursday April 9, 2020 Liaison Meeting Minutes

 

Facility News

  • Cybersecurity is in discussion with SDCC, working toward a password policy proposal following the krb5 www intrusions

Network & Facility Operations

  • On-site work has been resuming at a steady pace since MINSAFE ended
  • Progress on purchases and deployments of the CSI-dedicated tape library, mover, and JBOD; Lustre MDSes; and GS JBODs

Storage

  • JBODs have arrived; deployment is targeted for next week

Fabric

  • New Supermicro cluster in production for STAR, PHENIX, and ATLAS T1
  • Singularity 3.6.1 has been released as a bug fix for 3.6.0. Once it is in EPEL, we plan to test it and then push it to the farm
  • HTCondor T3: group mappings to institutions have been reorganized, and institutions will now be marked with a ClassAd
  • Discussion over pulling hardware from the pool for a dedicated sPHENIX Jenkins HTCondor backend

General Services

  • Look out for announcements on RHEV maintenance and on rebooting web servers, ssh gateways, and AFS servers
  • User accounts: SDCC would like to remove the legacy 8-character limit on usernames when accounts are requested
  • CYBER update on the incident with the PHENIX web server; moving forward with a 3-phase plan:
  1. block external access and restore campus access (read/write)
  2. extract some external-facing programs and then restore external access (read-only and no executable scripts)
  3. improve/update the full stack of the remaining external-facing programs

Tools & Services

  • sPHENIX is migrating Invenio into production, planning to back up and version control with Gitea

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`

Topical Discussion: Data Lifecycle Management

Home directories (NFS)

  • accessible at SDCC from compute and data transfer nodes; intended as a personal work area
  • designed and optimized for small to medium files; not intended for high-bandwidth or high-IOPS access to data
  • daily backups are kept for 90 days and daily snapshots are kept for 7 days
  • small quota/user (~5GB/user), varies by experiment
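
As a rough illustration of the retention windows above, the sketch below checks whether a copy of a file from a given date would still fall inside the 7-day snapshot window or the 90-day backup window. The function name and the simple day-count comparison are assumptions made for illustration; how copies actually expire at SDCC may differ.

```python
from datetime import date

# Retention windows quoted in the minutes for NFS home directories.
SNAPSHOT_RETENTION_DAYS = 7
BACKUP_RETENTION_DAYS = 90


def recovery_options(version_date: date, today: date) -> list[str]:
    """Return which mechanisms could still hold a copy made on version_date.

    Hypothetical helper: assumes a copy expires once its age in days
    exceeds the retention window.
    """
    age_days = (today - version_date).days
    if age_days < 0:
        raise ValueError("version_date is in the future")
    options = []
    if age_days <= SNAPSHOT_RETENTION_DAYS:
        options.append("snapshot")
    if age_days <= BACKUP_RETENTION_DAYS:
        options.append("backup")
    return options
```

An empty result means the copy is older than both windows and, under these assumptions, could no longer be restored.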

Project filesystem (GPFS and Lustre)

  • accessible at SDCC from compute and data transfer nodes
  • targeted primarily at collaborations for high bandwidth, large block access to files
  • no snapshots and no backups
  • some project filesystems, particularly but not limited to “scratch”, may have policy driven auto-deletion
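
Because project filesystems have no snapshots or backups and some areas may be purged by policy, it can help to see which files an age-based purge would catch. The sketch below is a dry run that only lists files whose modification time is older than a cutoff. The minutes do not state the actual deletion criteria, so the age-based rule, the `max_age_days` parameter, and the function name are all assumptions.

```python
import os
import time
from typing import List, Optional


def purge_candidates(root: str, max_age_days: float,
                     now: Optional[float] = None) -> List[str]:
    """List files under root not modified within max_age_days.

    Dry run only: nothing is deleted. The age-based rule is an
    assumption, not the documented SDCC policy.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime < cutoff:
                    stale.append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(stale)
```

Anything the function reports that still matters would need to be copied elsewhere, since these filesystems keep no backups.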

Software distribution filesystem (OpenAFS and CVMFS)

  • read-only, world accessible filesystems
  • designed for worldwide, read-only distribution of applications and libraries

Object storage (BNLBox)

  • Default quota is 50GB/user
  • Write access is limited to authorized users; read access is controlled by the data owner
  • Built-in file versioning and trash bin (30-day retention, up to the quota limit)
  • Data in BNLBox Archive folder doesn’t count against user quota and is transparently migrated to HPSS by policy

Cold storage (HPSS tape)

  • This service is tailored for large files (files > 10GB)
  • Access to cold storage must be negotiated with SDCC
  • HPSS service performance can vary dramatically depending on access and quality of service requirements
  • Planning to preserve Archive class of storage for existing experiments
  • Exploring a transition to a Lustre backend to HPSS, so that tape backups are transparent to the user
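
Since the tape service is tailored for files larger than 10GB, one simple check before requesting archival is to separate files that meet that size from those that do not. The sketch below performs only that size comparison. It assumes 10GB means 10 × 10^9 bytes, the function names are hypothetical, and it does not call any HPSS tool.

```python
import os
from typing import Dict, Iterable, List, Tuple

# Threshold from the minutes ("files > 10GB"); decimal gigabytes assumed.
MIN_TAPE_FILE_BYTES = 10 * 10**9


def file_sizes(paths: Iterable[str]) -> Dict[str, int]:
    """Map each existing path to its size in bytes."""
    return {p: os.path.getsize(p) for p in paths}


def split_by_tape_suitability(sizes: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Split paths into (large enough for tape, too small), each sorted by name."""
    large = sorted(p for p, s in sizes.items() if s > MIN_TAPE_FILE_BYTES)
    small = sorted(p for p, s in sizes.items() if s <= MIN_TAPE_FILE_BYTES)
    return large, small
```

Files in the second list could be candidates for bundling into a larger archive file before going to tape, though how best to do that is something to work out with SDCC.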
    • 1
      Facility News & Announcements
      Speaker: Tony Wong (Brookhaven National Lab (Physics Department))
    • 2
      Facility Operations
      • Network
      • Storage
      • Fabric
      • General Services
      • Tools & Services
      Speakers: Mr Alexandr Zaytsev (Brookhaven National Laboratory (BNL)), Christopher Hollowell (RHIC/ATLAS Computing Facility), Hironori Ito (Brookhaven National Laboratory), Jason Smith (RACF/SDCC), Tony Wong (Brookhaven National Lab (Physics Department))
    • 3
      Topical Discussion
      • Data lifecycle management
      Speaker: Jason Smith (RACF/SDCC)