Python API Reference
- class themis.Themis
The themis.Themis class can be used to launch ensembles with a single batch allocation. It is central to everything that can be done with the themis package.
Creating a New Ensemble
To create a new Themis ensemble, there are two necessary steps:
1. Create an iterable of themis.CompositeRun or themis.Run objects to describe the runs of the ensemble.
2. Call one of the Themis.create* methods listed below, passing it the iterable of CompositeRun/Run objects that you just created, and whatever optional arguments you choose. (If Run objects are passed, the application argument must be supplied as well.)
A directory named “.themis_setup” should now exist (or, if you set the
optional setup_dir argument in step 2, a directory with the name and location
you chose). This directory will be needed in the future to interact with the
ensemble. See the Themis’s Setup Directory section for more.
- class themis.Step(args, tasks=1, cores_per_task=1, gpus_per_task=0, timeout=0, batch_script=True, cwd='.')
Instances of this class represent the execution of one application.
Instances do not do anything themselves; they are meant to be passed into other methods and functions.
Many attributes map directly to lrun, srun, and flux mini run arguments.
- Parameters:
args (str or Sequence of str) – the application and its arguments, either as a list of strings or as a single string, e.g. "/bin/bash -c 'exit 1;'" or ["/bin/bash", "-c", "exit 1;"]. If a single string is passed, it will be converted to a list via shlex.split(args). The first element of the Sequence should be the absolute path to the application to execute.
tasks (int, optional) – total MPI tasks. Corresponds to the srun, lrun, and flux “-n” options.
cores_per_task (int, optional) – cores per task. Corresponds to the srun, lrun, and flux “-c” options.
gpus_per_task (int, optional) – GPUs per task. Corresponds to the lrun and flux “-g” options.
timeout (int, optional) – the maximum running time, in minutes, for this run; if the limit is exceeded, the run will be killed.
batch_script (bool) – a boolean indicating whether the application given by args[0] is a batch script. If it is, it will be parsed for tokens and copied into cwd before execution.
cwd (str) – the working directory for the application given by args[0]. If the path is relative, it is treated as relative to the run directory of the owning run. By default, the CWD is the run directory.
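The string-to-list conversion described for args can be tried out with the standard library alone. The sketch below is only an illustration: normalize_args is a hypothetical helper name, not part of Themis, and it mirrors nothing beyond the documented rule that a single string is split via shlex.split.

```python
import shlex


def normalize_args(args):
    """Return args as a list of strings, splitting a single string shell-style.

    Hypothetical helper for illustration only; it mirrors the documented
    rule (a str is converted via shlex.split) but is not Themis code.
    """
    if isinstance(args, str):
        return shlex.split(args)
    return list(args)


# Both documented spellings of the same command yield the same list.
as_string = normalize_args("/bin/bash -c 'exit 1;'")
as_list = normalize_args(["/bin/bash", "-c", "exit 1;"])
print(as_string)  # ['/bin/bash', '-c', 'exit 1;']
```

Passing the list form directly avoids any surprises from shell-style quoting rules.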
- class themis.CompositeRun(sample, steps)
Instances of this class represent one run of a Themis ensemble.
CompositeRun instances are the most general kind of run in an ensemble, and consist of two parts: a mapping defining arbitrary key-value pairs, and a sequence of Step objects.
Instances do not do anything themselves; they are meant to be passed into other methods and functions.
Many attributes map directly to lrun, srun, and flux mini run arguments.
- Parameters:
sample (Mapping) – a mapping from sample labels to values, such as one produced by iterating through a Samples object. Generally, sample labels (the keys of the mapping) should be constant across all runs. The sample will be used for parsing text files.
steps (Sequence) – nonempty sequence of themis.Step objects defining the execution instructions for the run. Each Step object owned by the run will be executed in order: first steps[0], then steps[1], then steps[2], and so on.
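One way to produce sample mappings whose labels stay constant across runs is to expand a grid of values with the standard library. This is a sketch under stated assumptions: make_samples is a hypothetical helper, and the variable names and values are made up for illustration. The Themis calls appear only in a comment, following the signatures documented above.

```python
import itertools


def make_samples(grid):
    """Build one sample mapping per combination of the values in grid.

    Every returned mapping has the same keys, matching the recommendation
    that sample labels stay constant across all runs.
    """
    labels = sorted(grid)
    combos = itertools.product(*(grid[label] for label in labels))
    return [dict(zip(labels, combo)) for combo in combos]


# Hypothetical variables and values, chosen only for illustration.
samples = make_samples({"hydrostatics": [17.6, 20.0], "viscocity": [35, 40]})
for sample in samples:
    print(sample)

# With Themis installed, each sample could then be paired with its steps,
# following the signatures documented above, for example:
#   themis.CompositeRun(sample, [themis.Step(["/bin/echo", "hi"], batch_script=False)])
```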
- class themis.Run(sample=None, args=None, tasks=1, cores_per_task=1, gpus_per_task=0, timeout=0)
Represents one simple run of a Themis ensemble.
This class would more aptly be named SimpleRun; the name Run is kept for backwards compatibility.
The Run class is a restrictive, simplifying derivation of the CompositeRun class. Run instances support only a single Step, i.e. only a single application makes up the run. Other limitations include:
  - All Run instances in an ensemble are assumed to share the same application. As a consequence, whether or not the application is a batch script is set on an ensemble-wide basis.
  - The working directory for each Run in an ensemble is assumed to be specified by the run_dir_names ensemble-wide constant.
Instances do not do anything themselves; they are meant to be passed into other methods and functions.
Many attributes map directly to lrun, srun, and flux mini run arguments.
- Parameters:
sample – a mapping from sample labels to values, such as one produced by iterating through a Samples object. Generally, sample labels (the keys of the mapping) should be constant across all runs. The sample will be used for parsing text files.
args (str, optional) – the command-line arguments to pass to the application.
tasks (int, optional) – total MPI tasks. Corresponds to the srun, lrun, and flux “-n” options.
cores_per_task (int, optional) – cores per task. Corresponds to the srun, lrun, and flux “-c” options.
gpus_per_task (int, optional) – GPUs per task. Corresponds to the lrun and flux “-g” options.
timeout (int, optional) – the maximum running time, in minutes, for this run; if the limit is exceeded, the run will be killed.
- classmethod Themis.create(application=None, runs=None, run_parse=None, run_copy=None, run_symlink=None, run_dir_names=None, app_interface=None, setup_dir='./.themis_setup', max_restarts=0, max_failed_runs=None, app_is_batch_script=True, use_flux=False, abort_on=None)
Create a new Themis ensemble.
Returns a themis.Themis instance representing the new ensemble.
The ensemble does not actually begin until one of the execute_* methods is called.
- Parameters:
application (optional, str) – Path to the application to be executed. For instance, "/usr/bin/ls", "ls", or "../../bin/ls" might all be valid. This option is only relevant if themis.Run (rather than CompositeRun) objects are passed to this ensemble. If only CompositeRun objects are passed, the application is taken on a per-step basis from step.args[0].
runs (optional) – An iterable of themis.CompositeRun (or themis.Run) objects, representing the runs of this ensemble. Runs within the ensemble will be executed concurrently. Each run is assigned a unique integer ID (the “run ID”) corresponding to its index in the iterable. If no runs are passed, an empty ensemble is created.
run_parse (str or iterable of str, optional) – A file path or iterable of file paths. Unix-style path patterns are supported as well. The files specified should be text files; they will be hard-copied into the run directories and parsed. See here for more. None, the default, specifies that no files will be parsed.
run_copy (str or iterable of str, optional) – A file path or iterable of file paths. Unix-style path patterns are supported as well. The files/directories specified will be hard-copied into the run directories. None, the default, specifies that no files will be copied.
run_symlink (str or iterable of str, optional) – A file path or iterable of file paths. Unix-style path patterns are supported as well. The files/directories specified will be symlinked into the run directories. None, the default, specifies that no files will be symlinked.
run_dir_names (str, optional) – The file system paths of the run directories. This argument should be a Python format string whose field names correspond to the names of the variables in the samples. For instance, if the variables in the samples are “hydrostatics” and “viscocity”, you might pass in the string "hydro={hydrostatics}/visc={viscocity}". This string will be formatted for each run to yield the run directory, so one directory might be hydro=17.6/visc=35. Note that the POSIX directory separator character “/” in the example string means that the resulting run directory will in fact be a sequence of two nested directories. The default value of None lets the naming scheme be determined internally.
app_interface (str, optional) – A file path to an application interface module.
setup_dir (str, optional) – A file system path indicating the directory in which to put the ensemble’s setup files.
max_restarts (int, optional) – the maximum number of times an individual run should be restarted if it fails, i.e. if the application exits with a nonzero return code. Setting this argument to None allows infinite restarts.
max_failed_runs (int, optional) – the maximum total failures across the ensemble. If this number is exceeded, the ensemble will abort. The default of None allows infinite failures.
app_is_batch_script (bool, optional) – A boolean indicating whether the application given by the application parameter is a script that will launch parallel applications. For instance, if the application is a Slurm batch script that contains sruns, then this argument should be set to True. See the batch script info page for more detail.
use_flux (bool or str, optional) – a boolean indicating whether to use Flux as the resource manager for the ensemble instead of the machine’s native manager (e.g. Slurm). Can also be a string identifying the path to the Flux installation to use.
abort_on – a sequence of positive integers. Each integer identifies an OS process return code. If application exits with one of those return codes, that run will be marked as RUN_ABORTED and will not be restarted.
- Returns:
a Themis object representing a new ensemble
- Raises:
ValueError – if an ensemble exists in setup_dir.
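The run_dir_names format string can be previewed with plain str.format before creating an ensemble. This is a minimal sketch using the example string and values from the parameter description; how Themis applies the string internally is not shown here and is an assumption.

```python
# Format string and sample taken from the run_dir_names description above.
run_dir_names = "hydro={hydrostatics}/visc={viscocity}"
sample = {"hydrostatics": 17.6, "viscocity": 35}

# Field names in the format string are filled from the sample's labels.
run_dir = run_dir_names.format(**sample)
print(run_dir)  # hydro=17.6/visc=35

# The "/" separator means the run directory is nested two levels deep.
parts = run_dir.split("/")
print(parts)  # ['hydro=17.6', 'visc=35']
```

A field name that is missing from the sample raises KeyError in str.format, so previewing the string this way is a cheap check that the labels match.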
- classmethod Themis.create_overwrite(*args, **kwargs)
Create a new ensemble, removing an old one if it exists.
Takes the same arguments as Themis.create().
- Returns:
a Themis object representing a new ensemble
- classmethod Themis.create_resume(*args, **kwargs)
Create a new ensemble, or get a handle to an existing one.
If an ensemble exists, return a Themis handle to it. Otherwise, create a new ensemble with the given arguments and return a Themis handle to it.
Takes the same arguments as Themis.create().
- Returns:
a Themis object representing either a new or an existing ensemble
Interacting with an Existing Ensemble
Once an ensemble has been created,
you can interact with it by creating a new themis.Themis object, or
through a command-line interface. Either way, you may need the path to the
setup directory (the setup_dir), which defaults to “.themis_setup”.
- Themis.__init__(setup_dir='./.themis_setup')
Constructor. Used to get a handle to an existing Themis ensemble.
Status Information
Each run of a Themis ensemble can have one of five statuses: queued (any run that has not yet completed), successful, failed, aborted, or killed.
These statuses are represented by five constants:
themis.Themis.RUN_QUEUED, themis.Themis.RUN_SUCCESS,
themis.Themis.RUN_FAILURE, themis.Themis.RUN_ABORTED,
and themis.Themis.RUN_KILLED.
- Themis.filter_by_status(*statuses)
Return an iterable of run IDs representing runs with a given status.
The return value is not guaranteed to be perfectly accurate: there may be considerable delay in propagating a run’s status change to the caller of this method.
- Parameters:
statuses – zero or more themis.Themis.RUN_* objects.
- Returns:
an iterable of integer run IDs, representing runs with one of the given statuses. If statuses is empty, return all run IDs.
- Themis.count_by_status(*statuses)
Return a count of runs with a given status.
This method is equivalent to calling len() on the results of filter_by_status; however, this method may be considerably faster for large ensembles.
- Parameters:
statuses – same meaning as in filter_by_status.
- Returns:
the number of runs with one of the given statuses. If statuses is empty, return the total number of runs.
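The relationship between the two status methods can be illustrated without Themis installed. In the sketch below, FakeEnsemble is a hypothetical in-memory stand-in that imitates only the documented semantics (an empty statuses list selects every run), and its status values are made-up strings rather than the real themis.Themis.RUN_* objects. The helper unfinished_fraction is likewise hypothetical and relies only on the two documented methods.

```python
class FakeEnsemble:
    """Hypothetical in-memory stand-in for a themis.Themis handle.

    It imitates only the documented filter_by_status/count_by_status
    semantics so the example runs without Themis installed.
    """

    # Made-up values; the real RUN_* constants live on themis.Themis.
    RUN_QUEUED, RUN_SUCCESS, RUN_FAILURE = "queued", "success", "failure"

    def __init__(self, status_by_run_id):
        self._status = dict(status_by_run_id)

    def filter_by_status(self, *statuses):
        # An empty statuses tuple selects every run, as documented.
        return [run_id for run_id, status in sorted(self._status.items())
                if not statuses or status in statuses]

    def count_by_status(self, *statuses):
        return len(self.filter_by_status(*statuses))


def unfinished_fraction(mgr):
    """Fraction of runs still queued, using only the two documented methods."""
    total = mgr.count_by_status()
    return mgr.count_by_status(mgr.RUN_QUEUED) / total if total else 0.0


mgr = FakeEnsemble({0: "success", 1: "queued", 2: "failure", 3: "queued"})
print(list(mgr.filter_by_status(mgr.RUN_QUEUED)))  # [1, 3]
print(unfinished_fraction(mgr))  # 0.5
```

With a real ensemble, the same helper should work on a themis.Themis handle, bearing in mind that reported statuses may lag behind the actual state of the runs.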
Ensemble Summaries
- Themis.write_csv(stream)
Write a CSV containing a full set of data about the ensemble.
- Parameters:
stream – a file-like object that supports writing str
- Raises:
ValueError – if the results cannot be formatted into a CSV
- Themis.write_yaml(stream)
Write a YAML document containing a full set of data about the ensemble.
- Parameters:
stream – a file-like object that supports writing str
- Raises:
ImportError – if yaml cannot be imported
- Themis.write_json(stream)
Write a JSON document containing a full set of data about the ensemble.
- Parameters:
stream – a file-like object that supports writing str
- Themis.as_dataframe(include_none=True)
Return a pandas DataFrame containing information about each run.
The DataFrame will be indexed on run IDs, and have columns for each sample label.
- Parameters:
include_none (bool, optional) – whether to include rows with no result
- Raises:
ImportError – if pandas cannot be imported
Manipulating and Adding Runs
- Themis.dequeue_runs(run_ids)
Dequeue one or more incomplete (i.e. queued) runs.
Dequeued runs have their status set to RUN_KILLED after a delay. If the run has already finished when this method is called, no action is taken. If the run has not yet started, that run will never be started.
- Parameters:
run_ids – an iterable of integer run IDs.
- Themis.requeue_runs(run_ids, hard=False)
Requeue one or more completed runs.
Requeued runs have their status reset to RUN_QUEUED and will be executed at the earliest opportunity. If the run has not completed by the time this method is called, no action is taken.
- Parameters:
run_ids – an iterable of integer run IDs.
hard – if True, reset the given runs’ progress back to step 0. If False, maintain the runs’ progress, but re-attempt execution.
Miscellaneous
- Themis.run_dirs(nonexistent=False)
Return an iterable of (run_id, working_directory) pairs in arbitrary order.
- Parameters:
nonexistent (bool, optional) – if True, include directories that haven’t yet been created.
- Themis.on_completion(args, cwd='.', stdout=None, stderr=None)
Provide an executable for Themis to launch once the ensemble has completed.
The executable will be invoked during a call to Themis.execute_* once every run has finished.
Only the most recent call to this function is remembered.
- Parameters:
args – an iterable of str defining the executable and its arguments, e.g. ["/usr/bin/python", "-vvv", "myscript.py", "--foobar"].
cwd (str) – the working directory for the executable when it is launched
stdout – file to redirect stdout to; default redirects to a Themis log file
stderr (str) – file to redirect stderr to; default redirects to a Themis log file
Executing an Existing Ensemble
Executing a Themis ensemble is as simple as calling either the execute_alloc or
execute_local method. execute_local executes the ensemble on the current machine
or set of nodes without requesting a new allocation. To execute
within a new batch allocation, describe the desired allocation with
themis.Allocation and pass that to mgr.execute_alloc, where mgr is a
themis.Themis handle to the ensemble.
The ensemble should then begin running as soon as the allocation is acquired from
the machine’s resource manager, or immediately if execute_local was used.
The dry_run method can be used to confirm that an ensemble is configured
properly without committing to execution.
- Themis.dry_run(*run_ids, **kwargs)
Perform one or more dry runs. If run_ids is empty, dry-run every run.
Populate each run directory with the “run_symlink” and the “input deck” files, and call the application interface’s prep_run function, if it exists.
This is done in serial, so it is safe to do on the login nodes of HPC clusters.
- Parameters:
run_ids – the run IDs of the runs to dry-run.
verbosity – If > 0, print messages about the progress of the dry runs.
- Themis.execute_local(blocking=False, timeout=None, parallelism=None, allow_multiple=False, max_concurrency=None)
Execute the ensemble locally. Return None.
- Parameters:
blocking (bool) – if True, block until the ensemble exits.
timeout (int, optional) – length in minutes to run Themis for. Default is no time limit.
parallelism (int, optional) – if set to a positive integer N, reserve N cores for Themis itself, and run Themis in parallel on those cores. If you suspect Themis’s performance is bottlenecking your ensemble, increase the parallelism.
allow_multiple (bool, optional) – if True, enable multiple concurrent executions (i.e. multiple simultaneous calls to execute_alloc or execute_local). As this makes certain checks and behavior impossible, it is not the default.
max_concurrency (int, optional) – the maximum number of concurrent runs Themis should allow at any given time. Too low a number may cause your ensemble to progress very slowly; too high a number may eventually cause performance degradation on certain systems. If None, pick a reasonable default value.
- Themis.execute_alloc(nodes=1, partition=None, bank=None, name='themis', timeout=None, repeats=0, parallelism=None, early_stop=None, allow_multiple=False, max_concurrency=None)
Request an allocation and launch the ensemble within it.
Requests an allocation from the current machine’s resource manager and returns the job ID.
- Parameters:
nodes (int) – sets the allocation size in terms of total nodes.
partition (str) – the compute partition to use for the ensemble.
bank (str) – the bank to use for the ensemble, e.g. “wbronze”. The default value of None allows the resource manager to choose the bank.
name (str) – the name to give the allocation.
timeout (int) – the time limit to request for the allocation, in minutes.
repeats – the number of times to replicate the allocation if time expires but the ensemble is not yet complete.
parallelism (int, optional) – if set to a positive integer N, reserve N cores for Themis itself, and run Themis in parallel on those cores. If you suspect Themis’s performance is bottlenecking your ensemble, increase the parallelism.
early_stop (int, optional) – a positive integer indicating the number of minutes “early” that Themis should stop launching new runs. Once Themis has run for (timeout - early_stop) minutes, it will go to sleep until the allocation expires. While Themis is sleeping, no new runs will be launched, and updates to existing runs will be ignored. This option is generally only useful if an application actively checks the remaining time on an allocation.
allow_multiple (bool, optional) – if True, permit multiple concurrent executions (i.e. multiple simultaneous calls to execute_alloc or execute_local). As this makes certain checks and behavior impossible, it is not the default.
max_concurrency (int, optional) – the maximum number of concurrent runs Themis should allow at any given time. Too low a number may cause your ensemble to progress very slowly; too high a number may eventually cause performance degradation on certain systems. If None, pick a reasonable default value.
- Returns:
the job ID of the allocation
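The interaction between timeout and early_stop comes down to simple arithmetic, sketched below. launch_window_minutes is a hypothetical helper, not Themis code; the validation it performs (early_stop must be positive and smaller than timeout) is an assumption made for the sketch rather than documented behavior.

```python
def launch_window_minutes(timeout, early_stop=None):
    """Return how many minutes new runs may be launched in one allocation.

    Models the documented rule that launching stops after
    (timeout - early_stop) minutes. Hypothetical helper, not Themis code.
    """
    if timeout is None:
        return None  # no time limit given, so there is no window to compute
    if early_stop is None:
        return timeout  # without early_stop, the whole allocation is usable
    # Assumption for this sketch: reject values that leave no launch window.
    if early_stop <= 0 or early_stop >= timeout:
        raise ValueError("early_stop must be positive and smaller than timeout")
    return timeout - early_stop


print(launch_window_minutes(120, early_stop=15))  # 105
```

For a 120-minute allocation with early_stop=15, new runs would be launched only during the first 105 minutes.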
Utility Functions
The following miscellaneous functions are designed to make using Themis easier.
Creating/Restarting Ensembles
To check if a directory contains a valid Themis ensemble, there is the
exists function:
- classmethod Themis.exists(setup_dir='./.themis_setup')
Return True if an ensemble’s setup exists in setup_dir.
- Parameters:
setup_dir – has the same meaning as the setup_dir parameter to the themis.Themis.__init__ method.
And for removing a Themis ensemble:
- classmethod Themis.clear(setup_dir='./.themis_setup')
Clear files generated by the ensemble manager for its own use.
This function won’t remove any user-provided files, such as input decks or required files. However, it will remove the ensemble manager’s own log files, and any other files it uses to store its information. This includes stored samples and results, so make sure this is what you actually want before calling it.
Debugging
The most difficult aspect of creating an ensemble is the
application interface. In order to
facilitate development and debugging, Themis exposes
the following methods.
- Themis.call_post_run(run_id)
Call the application interface’s post_run function for a single run.
- Parameters:
run_id (int) – the run ID of the run to execute post_run for.
- Returns:
the return value of the post_run function, or None if the function does not exist.
- Themis.call_post_ensemble()
Call the application interface’s post_ensemble function, if it exists.