Custom Code

NOTE: The extensions described here can cause problems when your tool is used with certain components of Galaxy (such as the workflow system). It is highly recommended to avoid these constructs unless absolutely necessary. We are continually adding support for more complex configuration in the tool config to eliminate the need for these features.

The purpose of custom code is to provide detailed control over the way tools are executed. This optional code can be deployed in a separate file in the same directory as the tool configuration files (see AddToolTutorial). To enable the code, add the

    <code file="somefile.py"/>

tag to your configuration file. This instruction loads and executes somefile.py when the tool is read. The file must be a Python script and may contain any number of functions or classes. Four function names, if defined, will be called from within Galaxy.

There are four time points where custom code execution can take place:

1. before the tool starts (the corresponding function name is validate)
2. after step 1 but before the tool is placed in the job queue (exec_before_job)
3. after step 2 but before the program associated with the tool is executed (exec_before_process)
4. after the program associated with the tool has finished executing (exec_after_process)

The principal difference between steps 1 and 2 on the one hand and steps 3 and 4 on the other is that the first two block the response: they have to complete before the page is returned to the user. The last two happen in the background in an independent thread.
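
The ordering of the four hooks can be sketched as a small stand-alone program. This is not Galaxy's actual loader: load_hooks, run_tool, HOOK_NAMES, and CUSTOM_CODE are hypothetical names, the arguments passed to the hooks are empty placeholders, and the thread only mimics the blocking and background behaviour described above.

```python
import threading

HOOK_NAMES = ("validate", "exec_before_job",
              "exec_before_process", "exec_after_process")

def load_hooks(source):
    """Execute custom code text and return the hook functions it defines."""
    namespace = {}
    exec(source, namespace)
    return {name: namespace[name] for name in HOOK_NAMES
            if callable(namespace.get(name))}

def run_tool(hooks, incoming):
    """Call the available hooks in order; return (worker thread, call trace)."""
    trace = []
    inp_data, out_data, tool, app = {}, {}, None, None  # placeholders only
    param_dict = dict(incoming)

    def call(name, *args):
        if name in hooks:  # every hook is optional
            hooks[name](*args)
            trace.append(name)

    # Steps 1 and 2 run inline and therefore delay the page response.
    call("validate", incoming)
    call("exec_before_job", inp_data, out_data, param_dict, tool)

    def background():
        # Steps 3 and 4 run in an independent thread.
        call("exec_before_process", inp_data, out_data, param_dict, tool)
        stdout, stderr = "", ""  # the tool's program would run here
        call("exec_after_process", app, inp_data, out_data, param_dict,
             tool, stdout, stderr)

    worker = threading.Thread(target=background)
    worker.start()
    return worker, trace

# A custom code file that defines only two of the four hooks.
CUSTOM_CODE = """
def validate(incoming):
    pass

def exec_after_process(app, inp_data, out_data, param_dict, tool, stdout, stderr):
    pass
"""

worker, trace = run_tool(load_hooks(CUSTOM_CODE), {"bins": "10"})
worker.join()
print(trace)
```

Only the hooks that the file defines are called, and a missing hook is simply skipped.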

Parameter Validation

The validate function is called before the tool is executed. If it raises an exception, the tool execution is aborted and the exception's value is displayed in an error message box. Here is an example:

    def validate(incoming):
        """Validator for the plotting program"""

        bins = incoming.get("bins", "")
        col = incoming.get("col", "")

        # Both parameters are required
        if not bins or not col:
            raise Exception("You need to specify a number for bins and columns")

        # Both parameters must be integers
        try:
            bins = int(bins)
            col = int(col)
        except ValueError:
            raise Exception("Parameters are not integers, columns:%s, bins:%s" % (col, bins))

        if not 1 < bins < 100:
            raise Exception("The number of bins %s must be a number between 1 and 100" % bins)

This code intercepts a number of parameter errors and returns the corresponding error messages. The incoming parameter is a dictionary containing all the parameters that were sent through the web.
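
To see how the validator behaves, it can be called directly with dictionaries that mimic submitted form values. The sketch below repeats the function so that it runs on its own. The check helper is hypothetical and not part of Galaxy; it only turns a raised exception into its message.

```python
def validate(incoming):
    """Validator for the plotting program (same logic as the example above)."""
    bins = incoming.get("bins", "")
    col = incoming.get("col", "")
    if not bins or not col:
        raise Exception("You need to specify a number for bins and columns")
    try:
        bins = int(bins)
        col = int(col)
    except ValueError:
        raise Exception("Parameters are not integers, columns:%s, bins:%s" % (col, bins))
    if not 1 < bins < 100:
        raise Exception("The number of bins %s must be a number between 1 and 100" % bins)

def check(incoming):
    """Hypothetical helper: return the error message, or None if validation passes."""
    try:
        validate(incoming)
    except Exception as exc:
        return str(exc)
    return None

print(check({"bins": "10", "col": "2"}))   # valid input
print(check({"bins": "", "col": "2"}))     # missing value
print(check({"bins": "ten", "col": "2"}))  # not an integer
print(check({"bins": "500", "col": "2"}))  # out of range
```

Note that the range test is strict, so the values 1 and 100 themselves are rejected.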

Pre-job and pre-process code

Both of these functions have the same signature:

    def exec_before_job(inp_data, out_data, param_dict, tool):
    def exec_before_process(inp_data, out_data, param_dict, tool):

The param_dict is a dictionary that contains all the values of the incoming parameter described above, plus a number of keys and values generated internally by Galaxy. The inp_data and out_data arguments are dictionaries, keyed by parameter name, that contain the objects representing the data.

Example:

    def exec_before_process(inp_data, out_data, param_dict, tool):
        for name, data in out_data.items():
            data.name = 'New name'

This custom code changes the name of the data created for this tool to New name. The difference between the two functions is one of timing: exec_before_job executes before the page returns, so the user sees the new name right away, whereas exec_before_process, as used here, sets the new name only once the job starts to execute.
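
A renaming hook of this kind can be tried outside Galaxy with stand-in objects. In this sketch FakeDataset is a hypothetical placeholder for the data classes Galaxy passes in, and the keys input1, out_file1, and col are invented for illustration. The only thing assumed about the real objects is the name attribute used above.

```python
class FakeDataset(object):
    """Hypothetical stand-in for a Galaxy dataset; only `name` is modelled."""
    def __init__(self, name):
        self.name = name

def exec_before_job(inp_data, out_data, param_dict, tool):
    # Build the output name from a submitted parameter and the input's name.
    source = inp_data["input1"].name
    for key, data in out_data.items():
        data.name = "Column %s of %s" % (param_dict.get("col", "?"), source)

inp_data = {"input1": FakeDataset("reads.tab")}
out_data = {"out_file1": FakeDataset("Plot on data 1")}

exec_before_job(inp_data, out_data, {"col": "3"}, tool=None)
print(out_data["out_file1"].name)
```

Because the hook receives param_dict as well as the data dictionaries, the new name can depend on what the user submitted instead of being a fixed string.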

Post-process code

This code executes after the background process running the tool has finished. The example below is a more advanced one that replaces the type of the output data depending on the parameter named extension:

    from galaxy import datatypes

    def exec_after_process(app, inp_data, out_data, param_dict, tool, stdout, stderr):
        # Requested output type; falls back to 'text'
        ext = param_dict.get('extension', 'text')
        # Iterate over a copy because out_data is updated inside the loop
        for name, data in list(out_data.items()):
            # Create an object of the new type and copy all attributes over
            newdata = datatypes.factory(ext)(id=data.id)
            for key, value in data.__dict__.items():
                setattr(newdata, key, value)
            newdata.ext = ext
            out_data[name] = newdata

The stdout and stderr arguments are strings containing the output of the process.
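
As a minimal sketch of how these two strings might be used, the hook below appends the first line of stderr to the name of every output. FakeDataset is a hypothetical stand-in for the real data classes, the key out_file1 and the sample messages are invented, and the placeholder None is passed for app and tool. Whether renaming is an appropriate way to report warnings in a real tool is an open question; the point is only that stdout and stderr arrive as plain strings.

```python
class FakeDataset(object):
    """Hypothetical stand-in for a Galaxy dataset; only `name` is modelled."""
    def __init__(self, name):
        self.name = name

def exec_after_process(app, inp_data, out_data, param_dict, tool, stdout, stderr):
    message = stderr.strip()
    if not message:
        return  # nothing was written to stderr, leave the outputs alone
    first_line = message.splitlines()[0]
    for name, data in out_data.items():
        data.name = "%s (warning: %s)" % (data.name, first_line)

clean = {"out_file1": FakeDataset("Plot")}
noisy = {"out_file1": FakeDataset("Plot")}

exec_after_process(None, {}, clean, {}, None, "done\n", "")
exec_after_process(None, {}, noisy, {}, None, "done\n", "bins truncated\nsee log\n")

print(clean["out_file1"].name)
print(noisy["out_file1"].name)
```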