The purpose of the tool registry is to define what tools (i.e. domain specific programs)
the application can run and which compute hosts they will run on.
* OVERALL STRUCTURE of tool-registry.cfg.xml:
...
...
...
Any text of the form ${name} is replaced at build time with the value of "name" from your build.properties file
or default/build.properties.template.
* TOOLRESOURCE
=================================================================================================================
Here's an excerpt from the scigap/example application tool registry for running tools
on the same machine as the web application, i.e. "the local host". Should only be used for
quick jobs since SSHProcessWorker blocks until the job completes. (SSHExecProcessRunner is
a non-blocking alternative).
- The "id" can be anything. It will also be referenced in the elements that run on this resource.
- I'm not sure if "type" matters
- class is always "org.ngbw.sdk.tool.DefaultToolResource"
- filehandler: use LocalFileHandler when the file system where jobs will run is local to the web application or nfs mounted.
- processworker: SSHProcessWorker can run jobs on local or remote host.
The framework defines several different filehandlers and processworkers in scigap/sdk/src/main/java/org/ngbw/sdk/tool.
They conform to the interfaces defined in scigap/sdk/api/tool/FileHandler and ProcessHandler.
- Parameters: are passed to both the filehandler and the processworker. LocalFileHandler and SSHProcessWorker
require the "host" and "workspace" parameters. Workspace is where working dirs will be created to run the jobs.
"rc" key is used to source a script during login (eg .bash_profile) to set the environment (including the PATH) properly so the submit.py and other scripts can be found during execution.
The "${workspace}" and "${rc}" values will be replaced at build time with the value of the corresponding value in build.properties as explained in framework_install.txt.
"accountGroup" - this should be set to "teragrid" for proper function of XSEDE/TeraGrid job statistics
"chargeNumber" - also needed for proper function of XSEDE/TeraGrid job statistics; the value is usually set in build.properties and referenced here (i.e. "" )
Different FileHandlers (e.g. SFTPFileHandler) and ProcessWorkers (e.g. SSHExecProcessWorker) will require
different parameters. For the time being, the definitive way to find out which parameters are required is to
look at the source code for the FileHandler and ProcessWorker classes you're using.
=================================================================================================================
Here's an excerpt from Cipres's tool registry for running jobs on TSCC. TSCC is a cluster with a torque/pbs
scheduler. We use the SSHExecProcessWorker and configure it with runner=SSHExecProcessRunner. We also use
SFTPFileHandler. This combination uses sftp to transfer files from cipres to the execution host, then runs
ssh to log into the remote host and execute the script given by the "submit" parameter. The submit script,
installed on the remote host, creates a job submission script and submits it with qsub. Parameters "login"
and "fileHost" are both set to the name of an entry in the ssl.properties file that tells the filehandler and
processworker how to connect to the remote host.
The other parameters are:
- workspace : directory on tscc where job working dirs will be created. Must have ARCHIVE
FAILED and MANUAL subdirs, should be owned by the user account that will be running the jobs.
- rc : This is optional. If specified, this script will be run on the remote host before
running each of the other scripts (like submit,check, cancel). Can be used to set PATH
and other env vars.
- submit, check, cancel are scripts installed on the remote host. See framework_install.txt,
"Running on Remote Compute Hosts".
- coresPerNode is required unless all jobs will be single threaded, using a single core.
This is used to predict the charge for a job.
Here are the corresponding properties in build.properties. Note that the value of tscc.login and tscc.fileHost
is "tscc". This tells the filehandler and processworker to use the entries prefixed with "tscc." in ssl.properties.
Also note that we don't need to give full paths to the scripts submit_v2.py, etc, because they are installed such
that they are on the path of cluster_user@tscc-login1.sdsc.edu.
tool.resource.cluster=TSCC
tscc.login=tscc
tscc.fileHost=tscc
tscc.submit=submit_v2.py
tscc.check=checkjobs_v2.py
tscc.cancel=delete_v2.py
tscc.rc=/home/cluster_user/ngbw/contrib/scripts/workbench.rc
tscc.project=cluster_user
tscc.workspace=/projects/ps-ngbt/backend/tscc_test_workspace
# These two are used in ssl.properties:
tscc.username=cluster_user
tscc.keyFile=@{env.HOME}/.ssh/id_dsa
And in cipres's sdk/src/main/resources/ssl.properties, we have:
tscc.host=tscc-login1.sdsc.edu
tscc.username=${tscc.username}
tscc.keyFile=${tscc.keyFile}
tscc.minconn=0
tscc.maxconn=${ssl.maxconn}
tscc.end
* TOOLGROUPS
=================================================================================================================
Tool Groups are used to define groups of tools for which you would like to achieve some sort of selective behavior.
For example, perhaps you wish to selectively disable all of a particular group of tools. If XSEDE goes down, for example
you can disable all XSEDE tools if you place them in a tool group. In our case, we find it useful to define
a group of tools that appears only in the REST service, and another that appears in both the REST service and
the web portal. To do this we define:
and within this group we define elements for tools that appear only in the REST service.
We set rest.tools.disable=true in our our build.properties file when building the web portal, and we
set rest.tools.disable=false, when building the REST service.
* TOOLS
=================================================================================================================
Within the one or more tool groups, we place the definition of individual tools that are to be deployed. All tools must
have a corresponding PISE XML document.
Here is an example of a tool definition:
The format works like this:
name="description, which is an arbitrary text string"
configfile="path to the pise tool description, usually relative to this file"
toolresource="the id of a toolresource defined above"
commandrenderer="org.ngbw.pise.commandrenderer.PiseCommandRenderer">
The entityType, dataType, and dataFormat are vestiges of a previous attempt to further control the data types. This turned out to be not so useful.
Typically, people will only use UNKNOWN for the values of these parameters. So far, iomode=FILE in all cases. No other option is defined.