Application Interfaces

Note

It is strongly recommended that you use the Themis runtime API instead of defining an application interface. Defining an application interface makes things harder to read and debug.

Themis is a general-purpose tool, and it has no understanding of the application it is executing. It does not know, for instance, what kind of results the application produces or where they are written. To understand the application, Themis relies on the user (you) writing interface code. How simple this is depends a great deal on your application, but we hope it is relatively straightforward.

The interface code should take the form of a Python module (a “.py” text file) which defines functions with specific names. The path to the module should be passed to Themis when the ensemble is created; Themis will import it, and at defined points during the ensemble, those functions will be called. The interface functions are responsible for interfacing with the application, whether that means preparing the application for a new run or collecting results from an existing run.

The functions which may be defined by an application interface (each of them is optional) are described below. If you create an application interface, you may define as many or as few of these as you want. The usefulness of each of the functions depends on your particular application and your intentions. Each of these functions should be callable with no arguments.

Recognized Functions

The `prep_ensemble` function

This function is only ever called once. It is called as soon as the ensemble starts on the batch allocation, and before it launches any runs.

This function might be used to build common files that all of the runs will share, such as a mesh. In most cases, however, you could generate those files yourself on the login node or inside an allocation before launching the ensemble, instead of putting the same code in prep_ensemble.

Runs will not begin to be executed until prep_ensemble returns; therefore, in the interests of not wasting compute resources, prep_ensemble should execute quickly.

No value should be returned from prep_ensemble; any return value will be ignored.

prep_ensemble will be called only once. If an ensemble is restarted, prep_ensemble will not be called a second time.
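As a minimal sketch of the shared-file use case described above, the following prep_ensemble writes a small mesh file once so that every run can read it. The directory SHARED_DIR, the file name shared_mesh.json, and the mesh contents are hypothetical and chosen only for illustration. This page does not say what the working directory is when prep_ensemble runs, so the sketch uses an explicit path rather than a relative one.

```python
import json
import os
import tempfile

# Hypothetical location for ensemble-wide files. Replace it with a path that
# every run can reach, for example a directory on a shared file system.
SHARED_DIR = os.path.join(tempfile.gettempdir(), "ensemble_shared")
MESH_PATH = os.path.join(SHARED_DIR, "shared_mesh.json")


def prep_ensemble():
    """Write a small 1-D mesh once so that all runs can share it."""
    os.makedirs(SHARED_DIR, exist_ok=True)
    # Keep this cheap: no runs start until prep_ensemble returns.
    nodes = [i / 10 for i in range(11)]  # 11 evenly spaced points on [0, 1]
    with open(MESH_PATH, "w") as f:
        json.dump({"nodes": nodes}, f)
    # Nothing is returned, since any return value would be ignored.
```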

The `post_ensemble` function

This function is called after the entire ensemble is finished, and all existing runs have completed.

This function might be used to, for instance, clean up some files. As with prep_ensemble, your compute allocation will be almost entirely inactive while this function runs, so it is in your best interest to keep it fast.

This function is generally called only once. However, should post_ensemble add new runs via the themis.Themis.add_runs method, Themis will complete those new runs, then call post_ensemble again. In this case, be careful you do not get into an infinite loop.

No value should be returned from post_ensemble; any return value will be ignored.

The `prep_run` function

Called before each new application run starts. This might be used to prepare an individual application instance for execution.

When this function is called, the current working directory will be the directory where this run of the application will be executed.
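Because the working directory is already the run's directory, prep_run can stage files using relative paths. The sketch below assumes a hypothetical input template stored at TEMPLATE_PATH and copies it into the run directory as input.txt; neither name comes from Themis, and your application will expect its own file layout.

```python
import os
import shutil
import tempfile

# Hypothetical absolute path to an input template shared by every run.
TEMPLATE_PATH = os.path.join(
    tempfile.gettempdir(), "ensemble_shared", "input_template.txt"
)


def prep_run():
    """Stage input files in the run directory before the application starts."""
    # Relative paths resolve inside this run's directory.
    os.makedirs("output", exist_ok=True)
    shutil.copy(TEMPLATE_PATH, "input.txt")
```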

The `post_run` function

Called after each application run exits with a returncode of 0. A zero returncode does not guarantee useful output, so this function is meant to be used to determine whether that run actually succeeded, and then (if it did succeed) to harvest the output of interest.

If this function returns 1e99, the run will be restarted, or marked as a failure if the restarts have been exhausted. Otherwise, the object returned will be stored internally and made available through a themis.Themis instance. If you do not write a post_run function, Themis will not collect any results.

Any combination of built-in or standard library types (int, float, str, dict, list, tuple, or decimal.Decimal, to name the most common) can be returned and safely stored: the object will be returned to you unchanged. The same guarantee cannot be made for custom types (e.g. instances of a class defined in your code).

When this function is called, the current working directory will be the directory where this run of the application was executed.
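A post_run function along these lines might check the run's output and return either a result or the restart value described above. This is only a sketch: it assumes the application writes a results.json file containing "energy" and "iterations" keys, both of which are hypothetical. The returned dictionary holds only built-in types, so it should be stored and handed back unchanged.

```python
import json
import math

FAILED = 1e99  # the value this page says triggers a restart or a failure


def post_run():
    """Check the run's output and return its results, or signal failure."""
    try:
        # Relative path: the working directory is the run's directory.
        with open("results.json") as f:
            results = json.load(f)
        energy = float(results["energy"])
        iterations = int(results.get("iterations", 0))
    except (OSError, ValueError, KeyError, TypeError, AttributeError):
        # Missing file, invalid JSON, or a missing or non-numeric field.
        return FAILED
    if not math.isfinite(energy):
        # A NaN or infinite value means the run did not really succeed.
        return FAILED
    return {"energy": energy, "iterations": iterations}
```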

The user_utils Module

The themis.user_utils module exports certain functions that are designed to be called from an application interface.

Since the application interface functions (prep_run, etc.) take no arguments, they do not automatically have access to any information about the ensemble. For this reason, certain information is exported, so that, for instance, prep_run can find out what run it is, what the sample point is for that run, and how many total runs there are. This information is exported through the themis.user_utils module; to access it, just add from themis import user_utils to the top of your application interface.

If any of these functions are called outside an application interface, their behavior is undefined.

Many of the exported functions are designed to be used by specific application interface functions, and their behavior may change depending on the caller (e.g. a function only meant to be used by prep_ensemble might raise an error when called by post_run); consult the documentation of each function for more information.

themis.user_utils.run()

Return the themis.CompositeRun object for the current run.

This can be used to obtain the sample, arguments, and resource requirements for the current run.

For use by prep_run or post_run. Returns None if called by prep_ensemble or post_ensemble.

themis.user_utils.run_id()

Return the integer ID of the current run.

For use by prep_run or post_run. Returns None if called by prep_ensemble or post_ensemble.

themis.user_utils.stop_ensemble()

Stop the ensemble from initiating any new runs.

Existing runs will complete, and then the ensemble will exit after calling the user’s post_ensemble (if it exists).

themis.user_utils.themis_handle()

Return a themis.Themis for this ensemble.

Examples

These examples are meant to be simple, to illustrate how these functions work and what can be done with them.

Adding new samples in the post_ensemble function:

import themis
from themis import user_utils

def new_sample(i):
    raise NotImplementedError("This part is up to you")

def post_ensemble():
    """Add points until there are 50 total runs"""
    manager = user_utils.themis_handle()
    total_runs = manager.count_by_status()
    if total_runs < 50:
        manager.add_runs(
            [
                themis.Run(new_sample(i), None, tasks=5)
                for i in range(total_runs, total_runs + 5)
            ]
        )

Template

A template application interface, to be filled in as you see fit.

from themis import user_utils


def prep_ensemble():
  """Prepare ensemble-wide files or other persistent data."""
  pass


def post_ensemble():
  """Shut down and clean up the ensemble."""
  pass


def prep_run():
  """Prepare an application run for execution."""
  pass


def post_run():
  """Finish an application run, determine whether it succeeded, and return results."""
  pass

Common Issues

Global Variables

The behavior of any of the four functions documented here should not depend on global variables set by another of the four functions. For instance, prep_run should NOT set a global variable that post_run checks. If you attempt to do this, it will not work: the Python process that calls prep_run will not be the same as the one that calls post_run, so the global variable initialized by prep_run will be lost. The same is true of any of the other functions. If you wish to pass some kind of state between the various application interface functions, you must do it some other way (perhaps by writing to a file).
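One way to pass state through a file is sketched below. It relies on the working directory being the same run directory for both prep_run and post_run, as described earlier on this page. The file name interface_state.json and the recorded timestamp are hypothetical; store whatever your own functions need.

```python
import json
import time

# Hypothetical file name, relative to the run directory. A module-level
# constant is safe because it is set at import time in every process.
STATE_FILE = "interface_state.json"


def prep_run():
    """Record state for post_run in a file instead of a global variable."""
    state = {"prepared_at": time.time()}
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)


def post_run():
    """Read back the state that prep_run saved, possibly in another process."""
    with open(STATE_FILE) as f:
        state = json.load(f)
    # Wall-clock time between preparation and post-processing.
    return {"elapsed_seconds": time.time() - state["prepared_at"]}
```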

Imports

Because the application interface is imported by Themis, sys.path may not be laid out how you expect; imports that you think should work, or that work when you use your application interface locally, may not work at all when Themis uses it. The most reliable way to avoid this problem is to understand Python’s import system. In particular, knowing how to import a script or package at a specific absolute path is very useful. Consult a UQP developer if the problems are persistent.
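Importing a file at a specific absolute path can be done with the standard library's importlib, independently of how sys.path is laid out. The helper below is a sketch based on the recipe in the importlib documentation; the name import_from_path is hypothetical and is not part of Themis.

```python
import importlib.util
import sys


def import_from_path(module_name, file_path):
    """Import the Python file at file_path and return it as a module."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {module_name!r} from {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    # Register the module so that later imports of module_name reuse it.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module
```

An application interface could then load a helper script with a call such as import_from_path("my_helpers", "/absolute/path/to/my_helpers.py"), where both the module name and the path are placeholders.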