Supported Executors
Jacamar supports several different job execution models. You can view these much as you would the GitLab Runner's own executors, as they operate in approximately the same way.
Selecting the correct target executor for your deployment is crucial to ensuring that all runner-generated scripts are run in the expected manner.
Executors
Declaring the target executor via Jacamar’s configuration is required.
System | Submission command
---|---
Cobalt | qsub
Flux | flux
LSF | bsub
PBS | qsub
Shell | bash
Slurm | sbatch
[general]
executor = "flux"
[batch]
arguments_variable = ["SITE_PARAMETERS"]
Additional configuration options exist specifically to manage batch executors. See the batch table documentation.
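For illustration, here is a hedged sketch of how such a variable might look in practice. The values are purely hypothetical, and it assumes (per the configuration above) that the contents of SITE_PARAMETERS are treated like SCHEDULER_PARAMETERS and appended to the generated submission command:

# Hypothetical scheduler arguments defined by the CI job:
SITE_PARAMETERS="-N 1 -t 30"
# Jacamar would integrate these into the generated submission, e.g. for
# the Flux executor: flux alloc -N 1 -t 30 ./generated-script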
Cobalt (qsub)
Note
Legacy support only, no new features will be added for this executor type.
Jobs are submitted using qsub, with both the output and error logs being monitored.
1. The runner-generated build script is submitted to the scheduler using qsub. Stdout and stderr are managed via the --output and --error arguments respectively. Finally, all SCHEDULER_PARAMETERS are integrated into the request.
2. Job state is monitored using qstat on a set interval, identifying whether the job is currently running.
3. Throughout the duration of the job the runner obtains the stdout/stderr by tailing both files.
4. Upon completion of the job (when it is no longer found in the queue), the final exit status is queried using the generated <jobid>.cobaltlog to determine if the CI job should pass or fail.
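As a hedged illustration of the flow described above (the exact runner invocation may differ, and the <jobid> names are placeholders):

# Submit the runner-generated script, capturing stdout/stderr separately:
qsub --output ci-<jobid>.out --error ci-<jobid>.err $SCHEDULER_PARAMETERS generated-script
# Poll the queue on an interval until the job is no longer found:
qstat <jobid>
# Finally, inspect the generated <jobid>.cobaltlog for the exit status.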
Flux (flux)
The Flux integration leverages flux alloc to submit an interactive job.
1. The runner-generated script is submitted to the scheduler for execution using flux alloc. All user-defined SCHEDULER_PARAMETERS are integrated into the allocation request.
2. The interactive session's stdout/stderr is monitored by the runner and streamed back to the server.
3. Because the session is interactive, the exit status of the flux alloc command is used to determine if a job passed or failed.
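A minimal sketch of the submission, assuming SCHEDULER_PARAMETERS holds example values such as "-N 1 -t 30" (the exact runner invocation may differ):

# Request an interactive allocation that runs the generated script:
flux alloc $SCHEDULER_PARAMETERS ./generated-script
# The allocation's exit status becomes the CI job's pass/fail result:
echo "Job exited with status $?"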
LSF (bsub)
Note
Legacy support only, no new features will be added for this executor type.
LSF leverages bsub to submit an interactive job.
1. The runner-generated script is submitted to the scheduler for execution using bsub -I. All user-defined SCHEDULER_PARAMETERS are integrated into the request.
2. The interactive session's stdout/stderr is monitored by the runner and reported back to the server.
3. Because the session is interactive, the exit status of the bsub command is used to determine if a job passed or failed.
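A minimal sketch of the interactive submission (the exact runner invocation may differ):

# Submit the generated script as an interactive job:
bsub -I $SCHEDULER_PARAMETERS ./generated-script
# The interactive session's exit status becomes the CI job's result:
echo "Job exited with status $?"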
PBS (qsub)
With PBS we support job submission via qsub.
1. The runner-generated script is submitted to the scheduler using qsub. The runner controls the scheduler's -o (output), -j eo, -Wblock=true, and -N (job name) arguments, while all user-defined SCHEDULER_PARAMETERS are also integrated.
2. Throughout the duration of the job the runner obtains the stdout/stderr by tailing the output file (pbs-ci-<jobID>.out). All output to this file is reported back to the CI job log.
3. Once the job has completed, the final state is determined by the exit status of qsub -Wblock=true ...
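A hedged sketch combining the runner-controlled arguments listed above (the job name is hypothetical and the exact runner invocation may differ):

# Submit the generated script; -Wblock=true makes qsub wait for completion:
qsub -N ci-job -j eo -o pbs-ci-<jobID>.out -Wblock=true $SCHEDULER_PARAMETERS generated-script
# While the job runs, the runner tails pbs-ci-<jobID>.out for live output;
# qsub's eventual exit status determines whether the CI job passed.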
Shell (bash)
In many aspects this simply mirrors the GitLab shell executor, with the key exception that great strides have been taken to dramatically improve the security of running jobs, even in a multi-tenant environment.
All job scripts are ultimately executed in a shell spawned locally on the running Jacamar instance:
cat generated-script | env -i /bin/bash --login
Though this may add complexity for users with complicated Bash profiles, it ensures that they will always get an understandable and, most importantly, functional shell environment.
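To see what such a sanitized environment looks like, you can reproduce it directly; env -i clears all inherited variables before Bash sources the login profiles:

# Print the environment a job script would actually see:
env -i /bin/bash --login -c 'env | sort'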
Slurm (sbatch)
The Slurm integration revolves around submitting the job script using sbatch and then tailing the subsequently generated --output log file.
1. The runner-generated script is submitted to the scheduler using sbatch. The runner controls the scheduler's --output, --wait, and --job-name arguments, while all user-defined SCHEDULER_PARAMETERS are also integrated into the request.
2. Throughout the duration of the job the runner obtains the stdout/stderr by tailing the output file (slurm-%j.out). All output to this file is reported back to the CI job log.
3. Once the job has completed, the final state is determined by the exit status of sbatch --wait ...
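A hedged sketch combining the runner-controlled arguments listed above (the job name is hypothetical and the exact runner invocation may differ):

# Submit the generated script; --wait blocks until the job completes:
sbatch --job-name ci-job --output slurm-%j.out --wait $SCHEDULER_PARAMETERS generated-script
# While the job runs, the runner tails the generated slurm-<jobid>.out;
# sbatch's eventual exit status determines whether the CI job passed.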
It is important to note that the entire build script is submitted via sbatch. As such, it will run entirely on the target compute resources.
CI Job Build Stages
When closely examining a GitLab CI job, you may notice a number of distinct shells being generated and scripts launched over the course of said job. This behavior falls in line with the upstream GitLab Runner design of breaking down a single CI job into several stages (e.g. git sources, execute build script, etc.), each accomplishing a specific target within the job. In a more traditional shell executor, every stage is launched in a similar shell spawned on the host environment. In the case of executors that seek to interface with an underlying scheduler, the stages are divided as follows (see the sketch after this list):
1. To begin the job, necessary preparations are made, sources are obtained (git), and artifacts/caches are made available. Each of these stages within the CI job occurs on the host environment of the Jacamar instance.
2. If all previous stages complete successfully, the step script (a combination of the before_script and script) is submitted to the scheduler.
3. Finally, all remaining stages, including the after_script, again occur on the node where Jacamar is located.
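As a rough conceptual sketch (Slurm shown, with simplified and hypothetical commands), the split looks like:

# On the Jacamar host: preparation, sources, caches/artifacts.
git clone <project-url> project && cd project
# Submitted to the scheduler: the step script (before_script + script).
sbatch --wait ... step-script
# Back on the Jacamar host: after_script and remaining stages.
bash after-script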
Simply put, only the user's before_script and script are ever submitted
as a job script to the underlying scheduler. This provides a number of benefits
to the user, chiefly that compute cycles are never wasted on potentially
minimal data management actions (e.g. relocating the runner cache). However,
you will note that the user-defined after_script section is also run on the
host system. This is by design and allows potential users to execute
actions that may otherwise be impossible in a traditional compute environment.