Are you ready to work on the grid? Take a look at the following prerequisites.
You need to go through Computing GettingStarted. In short, you need:
A system with SL6 (or SLC6).
This requirement will change with a future gbasf2 update.
Don't have one? If Singularity is available at the site where you are working, it is easy to work with SL6:
$ singularity shell --cleanenv --bind /cvmfs:/cvmfs docker://sl:6
A valid grid certificate issued within a year, installed in ~/.globus and in your web browser.
The format required to work with DIRAC is a PEM format.
See Computing GettingStarted for details.
openssl x509 -in ~/.globus/usercert.pem -noout -subject -dates
chmod 400 userkey.pem
ls -l ~/.globus
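If your certificate was exported from your browser as a PKCS#12 file, you can convert it to PEM with openssl. This is only a sketch; the file name usercert.p12 is an assumption, adjust it to your case:
cd ~/.globus
openssl pkcs12 -in usercert.p12 -clcerts -nokeys -out usercert.pem   # extract the certificate
openssl pkcs12 -in usercert.p12 -nocerts -out userkey.pem            # extract the (encrypted) private key
chmod 400 userkey.pem
chmod 444 usercert.pem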
Registration in DIRAC.
Do you have everything? Good, let's start.
Running gBasf2 requires some initial configuration:
$ cd <path where you want your installation>
$ wget http://belle2.kek.jp/~dirac/dirac-install.py
$ python dirac-install.py -V Belle-KEK
$ source bashrc
$ dirac-proxy-init -x # <-- You will need to enter the passphrase of your certificate
$ dirac-configure defaults-Belle-KEK.cfg
$ source ~/gbasf2/BelleDIRAC/gbasf2/tools/setup
$ gb2_proxy_init -g belle
A more complete description of the procedure is here.
You need to install Jupyter and the bash kernel in the gbasf2 environment:
$ pip install jupyter bash_kernel --trusted-host pypi.python.org --trusted-host pypi.org --trusted-host files.pythonhosted.org
$ python -m bash_kernel.install
(this will be included in future gBasf2 releases).
Once this is done, you can run the Jupyter notebook server.
$ jupyter notebook
If you want to run in a remote site, you need to follow the instructions for running Jupyter notebooks at remote sites here.
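For reference, a minimal sketch of accessing a notebook running on a remote machine through an SSH tunnel (the port 8888 and the host name are assumptions; follow the linked instructions for site-specific details):
# On the remote machine, start the server without opening a browser
$ jupyter notebook --no-browser --port=8888
# On your local machine, forward the port over SSH
$ ssh -N -f -L 8888:localhost:8888 <user>@<remote-host>
# Then open http://localhost:8888 in your local browser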
To achieve the physics goals of the experiment, data have to be distributed, reprocessed, and analyzed.
Belle II is expected to produce tens of petabytes of data per year.
Given this, no single site can be expected to provide all the computing resources for the collaboration.
This is what the Belle II grid looks like:
You can also take a look at the Belle II grid in real time at https://belle2.jp/computing/.
The usual workflow is:
A command-line client, gbasf2, is used for submitting grid-based Basf2 jobs.
The gb2_ tools control and monitor the jobs and the files on the grid.
We will review briefly their usage in this tutorial.
Once again: before submitting jobs to the grid, be sure that your script works well on a local computer!
For this tutorial, we will use an example from the tutorials under /cvmfs called B2A101-Y4SEventGeneration.py, which generates e+e- -> Y(4S) -> BBbar events.
For convenience, we will store the location in a bash variable:
basf2TutorialsDir='/cvmfs/belle.cern.ch/sl6/releases/release-03-02-03/analysis/examples/tutorials'
Take a look inside B2A101-Y4SEventGeneration.py. It is a normal Basf2 steering file.
head -n 50 $basf2TutorialsDir/B2A101-Y4SEventGeneration.py && echo etc...
You must run it in a Basf2 environment to confirm that it works properly.
On the grid, only the most recent libraries installed under /cvmfs/belle.cern.ch are available. Let's use release-03-02-03.
# Setting Basf2 environment in the Notebook
export BELLE2_NO_TOOLS_CHECK=1
source /cvmfs/belle.cern.ch/sl6/tools/b2setup release-03-02-03
basf2 -n 10 -l WARNING $basf2TutorialsDir/B2A101-Y4SEventGeneration.py
Note that the Basf2 environment is incompatible with the gBasf2 environment. After running the Basf2 steering file above, we need to return to the gBasf2 environment:
# This will work only if you have generated your proxy on the terminal before (gb2_proxy_init -g belle)
source ~/gbasf2/BelleDIRAC/gbasf2/tools/setup
We cannot run gb2_proxy_init in the notebook, since it asks for your passphrase and the notebook unfortunately doesn't support interactive input.
To confirm everything is properly set, we can simply run gb2_proxy_info.
gb2_proxy_info
It shows information about your proxy, such as the identity, the time left, the DIRAC group (belle), etc.
By default, your proxy will be valid for 24 hours.
We will use gbasf2 to submit jobs to the grid. The basic usage is
$ gbasf2 your_script.py -p <project_name> -s <available_basf2_release>
where project_name is a name assigned by you, and available_basf2_release is the available Basf2 software version to use.
You can always use the flags -h and --usage to see the full list of available options and a list of examples.
gbasf2 --usage
Remember that the syntax requires a project name and the release.
We are using release-03-02-03.
If you want to test your syntax before actually submitting the job, you can use the flag --dryrun:
gbasf2 -p gb2Tutorial_bbbarGeneration -s release-03-02-03 $basf2TutorialsDir/B2A101-Y4SEventGeneration.py --dryrun
Notebooks with the bash kernel do not support interactive input, so I will use --force to skip confirmation.
This is fine for a tutorial; however, in your daily work (on the terminal) I strongly suggest not skipping the confirmation.
(Or at least, test your syntax with --dryrun first, as we did here.)
Everything looks fine. No typos. Let's submit the job.
gbasf2 -p gb2Tutorial_bbarGeneration -s release-03-02-03 $basf2TutorialsDir/B2A101-Y4SEventGeneration.py --force
You have submitted your first job to the grid. Congratulations!
How do you check the status of your jobs? There are two ways: the command line and the web portal.
On the command line, you can use gb2_project_summary and gb2_job_status (the flag -p specifies the project name).
gb2_project_summary -p gb2Tutorial_bbarGeneration
gb2_job_status -p gb2Tutorial_bbarGeneration
The second way is looking at the job monitor in the DIRAC web portal.
Open the portal, click on the logo at the bottom-left and go to Applications/Job Monitor.
You will have to click on 'Submit' to display the information.
You should see something like this:
Once the job has finished, you can list the output using gb2_ds_list. The output file will be located below your user space (/belle/user/<username>/<project_name>).
gb2_ds_list|grep gb2Tutorial_bbarGeneration
gb2_ds_list /belle/user/michmx/gb2Tutorial_bbarGeneration
To download the files, use gb2_ds_get.
# Let's create a directory to get the files of the tutorial under home
mkdir -p ~/gbasf2Tutorial && cd ~/gbasf2Tutorial
# Now, let's download the file
gb2_ds_get /belle/user/michmx/gb2Tutorial_bbarGeneration --force
(Again, with the flag to skip the confirmation since we are in a notebook)
You can now confirm that the file is located in your local home, inside a directory named after the project.
ls -l ~/gbasf2Tutorial/gb2Tutorial_bbarGeneration
Keep in mind: as long as you have a gBasf2 installation, you can submit jobs or download files from any local machine.
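For instance, a minimal session on another machine with gbasf2 installed could look like this (a sketch; the installation path and project name are assumptions):
source ~/gbasf2/BelleDIRAC/gbasf2/tools/setup   # set up the gBasf2 environment
gb2_proxy_init -g belle                         # enter your certificate passphrase
gb2_ds_get /belle/user/<username>/<project_name> --force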
The most common task as a user of the grid is the submission of jobs with input files, usually from the official Belle II MC campaigns or from the official data reprocessing and skims.
Let's take a look at how sets of data are handled on the grid. Data on the grid are organized in datasets, located under /belle. Examples of datasets are
/belle/MC/release-02-00-01/DB00000411/MC11/prod00005678/s00/e0000/4S/r00000/mixed/mdst
/belle/MC/release-03-00-00/DB00000487/SKIM10x1/prod00006915/e0000/4S/r00000/taupair/18570600/udst
/belle/Data/release-03-02-02/DB00000654/proc9/prod00008522/e0007/4S/r04119/mdst
Each dataset is subdivided into datablocks, subXX, with an incremental number for each one. For example:
gb2_ds_list /belle/MC/release-02-00-01/DB00000411/MC11/prod00005678/s00/e0000/4S/r00000/mixed/mdst
In gBasf2, the data handling unit is the datablock.
The information about produced MC, reprocessed data and skims is located at Confluence, under the Data Production WebHome.
For example, for the MC12 campaign, the table of early phase 3 geometry samples contains:
If we want to use, for example, the mixed BGx0 sample of production prod00007393, the LPN for the desired datablock is
/belle/MC/release-03-01-00/DB00000547/MC12b/prod00007393/s00/e1003/4S/r00000/mixed/mdst/sub00
Do you need additional info? We can use gb2_ds_query_dataset to retrieve the info stored in the metadata catalog.
gb2_ds_query_dataset -l /belle/MC/release-03-01-00/DB00000547/MC12b/prod00007393/s00/e1003/4S/r00000/mixed/mdst
We will use another example stored in the tutorials at analysis/examples/tutorials, called B2A602-BestCandidateSelection.py. It takes as input BBbar "mixed" samples.
cat $basf2TutorialsDir/B2A602-BestCandidateSelection.py
We will use the BGx0 mixed sample mentioned before.
Let's take a look to confirm that the files are there:
gb2_ds_list /belle/MC/release-03-01-00/DB00000547/MC12b/prod00007393/s00/e1003/4S/r00000/mixed/mdst/sub00 |head -n 5
Time to submit the job. The input datablock should be specified with the flag -i.
Again, I will use the flag --force to skip confirmation.
And again, I strongly recommend not skipping the confirmation unless there is a good reason.
(A Jupyter notebook in a tutorial which cannot handle interactive input is a good reason, right?).
Let's use --dryrun to test the syntax before actually submitting:
gbasf2 -p gb2Tutorial_BestCandidate -s release-03-02-03 \
-i /belle/MC/release-03-01-00/DB00000547/MC12b/prod00007393/s00/e1003/4S/r00000/mixed/mdst/sub00 \
$basf2TutorialsDir/B2A602-BestCandidateSelection.py --dryrun
Everything seems good. Let's submit the jobs.
gbasf2 -p gb2Tutorial_BestCandidate -s release-03-02-03 \
-i /belle/MC/release-03-01-00/DB00000547/MC12b/prod00007393/s00/e1003/4S/r00000/mixed/mdst/sub00 \
$basf2TutorialsDir/B2A602-BestCandidateSelection.py --force
Congratulations! You are now running jobs on the grid with official MC samples as input.
Let's take a look at the jobs in the project.
Remember, one job will run per file contained in the datablock.
You can monitor your jobs either in the DIRAC web portal or with the gb2 tools:
gb2_job_status -p gb2Tutorial_BestCandidate
Once the jobs finish, you can download the output using gb2_ds_get, as we did in the first project.
Sometimes things do not go well. A few jobs can fail for a variety of reasons.
If you find that some of your jobs failed, you need to reschedule these jobs by yourself.
You can use gb2_job_reschedule -p <project name>:
gb2_job_reschedule --usage | tail -n 13
Or you can use the job monitor in the DIRAC web portal, selecting the failed jobs and clicking the 'Reschedule' button:
If ALL your jobs failed, most probably something is wrong with the steering file or the gBasf2 arguments.
(Did you test your steering file locally before submitting the jobs?)
A useful way to track down the problem is (if possible) downloading the output sandbox. It contains the logs related to your job.
It is also possible to retrieve the log files directly from the command line using gb2_job_output:
gb2_job_output -p gb2Tutorial_bbarGeneration
ls -l /home/michmx/gb2_tutorial/log/gb2Tutorial_bbarGeneration/117280956/
cat /home/michmx/gb2_tutorial/log/gb2Tutorial_bbarGeneration/117280956/Script1_basf2helper.py.log
Some pages at Confluence are prepared with additional information:
Take a look at the previous gBasf2 tutorials (they contain some advanced topics not covered here).
You are strongly recommended to join the comp users forum, where you can ask for help and receive announcements on releases and system issues.
You can also ask at questions.belle2.org. You can even answer questions from other users!
If you have some experience as a data production shifter, please become an Expert Shifter.
The Expert Shifter training course is open.
gBasf2 is still under development. There must be many points that you are not satisfied with.
To improve and make gbasf2 more user-friendly, we need your feedback and help. At the same time, the number of gbasf2 developers is currently limited.
It is written in Python. If you are interested in coding, please contact comp-dirac-devel@belle2.org and consider contributing to the improvement of gbasf2 :)
On the grid, only the most recent libraries installed under /cvmfs/belle.cern.ch are available. You can check which releases are available using gb2_check_release:
gb2_check_release
(It will also confirm that your gBasf2 installation is up-to-date, otherwise an update will be suggested).
Comments or suggestions for improving this tutorial? Contact @michmx at chat.belle2.org, or send an email to michmx at phy.olemiss.edu.