US ATLAS / HTCondor meeting

US/Eastern

USATLAS HTCondor meeting - minutes

(see google doc: https://docs.google.com/document/d/1zTl-HIB07SEWgwB8hLH5O9deX4O-z_ezq1YQI5__VKI/edit?tab=t.0 )

Useful Links:

https://htcondor.org/htcondor/release-plan/



11 DEC 2024:

This is what we are doing at MWT2:

# SIGTERM to kill jobs, SIGKILL after 5 minutes

GRACEFULLY_REMOVE_JOBS = true

MachineMaxVacateTime = 5 * 60

 

# cgroup additions for limiting memory to 1.1x the job request for ATLAS and 3x for non-ATLAS

CGROUP_MEMORY_LIMIT_POLICY = custom

CGROUP_HARD_MEMORY_LIMIT_EXPR = ifThenElse(regexp("usatlas[1-4]", Owner), 1.1 * RequestMemory, 3 * RequestMemory)

CGROUP_SOFT_MEMORY_LIMIT_EXPR = ifThenElse(regexp("usatlas[1-4]", Owner), 1 * RequestMemory, 1.1 * RequestMemory)
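A quick sanity check of the two limit expressions above, sketched in Python rather than ClassAds (owner names other than usatlas1 and the exact request value are just illustrative; 1024 MB matches the test job below):

import re

def cgroup_limits_mb(owner, request_memory_mb):
    # Mimics the ifThenElse(regexp(...)) logic from the config above.
    is_atlas = re.search(r"usatlas[1-4]", owner) is not None
    hard = 1.1 * request_memory_mb if is_atlas else 3 * request_memory_mb
    soft = 1.0 * request_memory_mb if is_atlas else 1.1 * request_memory_mb
    return hard, soft

print(cgroup_limits_mb("usatlas1", 1024))  # hard ~ 1126.4 MB, soft = 1024.0 MB (ATLAS)
print(cgroup_limits_mb("osguser", 1024))   # hard = 3072 MB, soft ~ 1126.4 MB (non-ATLAS)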

 

MWT2 testing of cgroups:

 

  1. sudo su usatlas1

  2. condor_submit memory_allocator.submit


memory_allocator.submit contains:

universe = vanilla

executable = memory_allocator

arguments = 1500 30

request_memory = 1024M

log = memory_job.$(Cluster).$(Process).log

output = memory_job.$(Cluster).$(Process).out

error = memory_job.$(Cluster).$(Process).err

should_transfer_files = yes

when_to_transfer_output = ON_EXIT_OR_EVICT

transfer_executable = True

JobPrio = 100000

 

requirements = regexp("cit2", Machine)

 

queue

The executable arguments are:

  1. 1500 is the number of MiB to allocate.

  2. 30 is the number of seconds to wait after the memory is allocated before exiting.

    1. There is a signal handler that will log a message on SIGINT or SIGTERM and wait the same number of seconds before exiting.

 

The memory request is 1024 MiB.

The requirement to run on a machine with a name containing "cit2" ensures that the job runs on a server that has been updated to HTCondor 24.0.2.
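The actual memory_allocator binary is MWT2's own; as a rough idea of what such a test program can look like, here is a hedged Python sketch with the same command-line interface (MiB to allocate, seconds to wait) and the SIGINT/SIGTERM handler described above:

#!/usr/bin/env python3
# Hypothetical sketch of a memory_allocator-style test job; the real MWT2 executable may differ.
# Usage: memory_allocator <MiB to allocate> <seconds to wait>
import signal
import sys
import time

alloc_mib = int(sys.argv[1])
wait_s = int(sys.argv[2])

def on_signal(signum, frame):
    # Log the signal, wait the same number of seconds, then exit, so the graceful-removal
    # window (SIGTERM, then SIGKILL after MachineMaxVacateTime) can be observed in the job log.
    print(f"received signal {signum}, waiting {wait_s}s before exiting", flush=True)
    time.sleep(wait_s)
    sys.exit(0)

signal.signal(signal.SIGTERM, on_signal)
signal.signal(signal.SIGINT, on_signal)

block = bytearray(alloc_mib * 1024 * 1024)
block[::4096] = b"\x01" * len(block[::4096])  # touch one byte per page so the memory is resident
print(f"allocated {alloc_mib} MiB, waiting {wait_s}s", flush=True)
time.sleep(wait_s)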

 

Judith put the cgroups config related to memory above. The entire contents of /etc/condor/config.d/02-cnode.conf are:

 

use role:execute

use feature:partitionableslot

 

MWT2_CpuUsed  = int((CondorLoadAvg / TotalLoadAvg) * (ifthenelse((TotalLoadAvg < TotalCpus), TotalLoadAvg, TotalCpus)) * 100) / 100.0

MWT2_CpuUsage = ifthenelse(((TotalLoadAvg > 0.0) && (Activity != "Idle")), MWT2_CpuUsed, 0)

MWT2_CpuExceeded  = (MWT2_CpuUsage > (Cpus + 0.8))

MWT2_CpuMemory = int(TotalMemory / TotalCpus)

 

START = TRUE

HAS_CVMFS = TRUE

TRUST_UID_DOMAIN = TRUE

 

STARTD_ATTRS = $(STARTD_ATTRS) HAS_CVMFS MWT2_CpuUsed MWT2_CpuUsage MWT2_CpuExceeded MWT2_CpuMemory

 

# SIGTERM to kill jobs, SIGKILL after 5 minutes

GRACEFULLY_REMOVE_JOBS = true

MachineMaxVacateTime = 5 * 60

 

# cgroup additions for limiting memory to 1.1x the job request for ATLAS and 3x for non-ATLAS

CGROUP_MEMORY_LIMIT_POLICY = custom

CGROUP_HARD_MEMORY_LIMIT_EXPR = ifThenElse(regexp("usatlas[1-4]", Owner), 1.1 * RequestMemory, 3 * RequestMemory)

CGROUP_SOFT_MEMORY_LIMIT_EXPR = ifThenElse(regexp("usatlas[1-4]", Owner), 1 * RequestMemory, 1.1 * RequestMemory)

 

DISABLE_SWAP_FOR_JOB = true

 

IGNORE_LEAF_OOM = false
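For reference, the MWT2_Cpu* startd attributes near the top of the file are plain ClassAd arithmetic. A small Python re-implementation with invented sample numbers (the 64-core / 256,000 MB node and the 8-core slot are assumptions, not MWT2 values) shows what gets published:

def mwt2_cpu_attrs(condor_load, total_load, total_cpus, total_memory_mb, slot_cpus, activity):
    # Same arithmetic as the MWT2_Cpu* expressions in 02-cnode.conf above.
    cpu_used = int((condor_load / total_load) * min(total_load, total_cpus) * 100) / 100.0
    cpu_usage = cpu_used if (total_load > 0.0 and activity != "Idle") else 0
    cpu_exceeded = cpu_usage > (slot_cpus + 0.8)
    cpu_memory = int(total_memory_mb / total_cpus)
    return cpu_used, cpu_usage, cpu_exceeded, cpu_memory

# Invented example: 64-core node with 256,000 MB of memory, 8-core slot, Busy
print(mwt2_cpu_attrs(10.5, 12.0, 64, 256000, 8, "Busy"))  # (10.5, 10.5, True, 4000)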



Opensearch / Adstash

HTCondor 24.0.3 / 23.0.19 include some improvements to the OpenSearch 2.0 support.

  • It is possible to run an HTCondor 24.0.3 VM with adstash to talk to a 23.x cluster.

New for late 23.x and 24+: in addition to the per-slot machine ads, each EP now publishes a single startd daemon ad with an aggregate view of the whole machine.
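As a reminder of what adstash ultimately ships to OpenSearch, here is a short sketch using the HTCondor Python bindings to pull the per-slot machine ads from the collector (the collector hostname is a placeholder; the new per-EP startd daemon ad mentioned above is published in addition to these slot ads):

import htcondor

# Placeholder collector / central manager hostname.
coll = htcondor.Collector("cm.example.org")

# One ad per partitionable or dynamic slot, as before.
slot_ads = coll.query(htcondor.AdTypes.Startd,
                      projection=["Name", "Machine", "Cpus", "Memory", "State", "Activity"])

for ad in slot_ads:
    print(ad.get("Name"), ad.get("State"), ad.get("Activity"))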

Other/ Misc.

Backfill slots can be used for lower-priority jobs, which are evicted when higher-priority jobs come through. Progress on evicted jobs is lost.

 

HTCONDOR release schedule

HTCondor Release Plans


Contact us page for HTCondor - Link to Page

    • 15:00 - 15:10
      Intro 10m

      Introductions / brief status of each site's condor deployment

    • 15:10 - 16:00
      Discussion 50m

      Some topics of discussion:

      - Condor 24 readiness / addressing cgroups issues
      - Mechanism behind job eviction / how it works
      - Condor monitoring / adstash

      Anything else of interest / open discussion