Admin/Tools/WritingTests

Revision 15 as of 2016-09-22 05:50:20 (editor: DaveClements); previous revision 14 as of 2016-08-11 22:59:14 (editor: DaveClements).
==== re_match ====
''re_match'' is used to compare, line by line, the output of a tool test run against a file containing regular expression patterns. The helper script {{{scripts/tools/re_escape_output.py}}} can be used to turn an 'ordinary' output file into a regular-expression-escaped format. One can then edit the escaped file and replace content with the regular expressions needed to match the variable output. ''lines_diff'' can also optionally be declared when using this matching style; in this case files are matched line by line, so a 'change' in one line is equivalent to a ''lines_diff'' count of 1.

{{{
    <test>
      <output name="output" file="variable_output_file.bed" compare="re_match" lines_diff="1"/>
    </test>
}}}

==== re_match_multiline ====
''re_match_multiline'' is used to compare the output of a tool test run against a file containing a multiline regular expression pattern. The helper script {{{scripts/tools/re_escape_output.py}}} can be used to turn an 'ordinary' output file into a regular-expression-escaped format (when the -m/--multiline option is used). One can then edit the escaped file and replace content with the regular expressions needed to match the variable output. ''lines_diff'' is not applicable when doing multiline regular expression matching.

{{{
    <test>
      <output name="output" file="variable_output_file.bed" compare="re_match_multiline" />
    </test>
}}}

When doing regular expression matching, the Python re module documentation may be of interest: http://docs.python.org/library/re.html

==== sim_size ====
''sim_size'' is used to compare the size of the output file from a tool test run against that of a test file. The ''delta'' attribute specifies the maximum allowed size difference, in bytes; the default ''delta'' is 100.

{{{
    <test>
      <output name="output" file="variable_output_file.bed" compare="sim_size" delta="976245" />
    </test>
}}}

==== contains ====
''contains'' can be used to check whether the test file from your test-data folder is contained in the output of a tool test run.

{{{
    <test>
        <output name="out_bam" file="empty_file.dat" compare="contains" />
    </test>
}}}

Writing Tests

Preparing tests for your Galaxy tool is easy. In short, you include a sample input file and a sample output file, then specify the parameters that, with the given tool and input, should produce the desired output.

Everybody benefits from good testing: the tool author ensures the quality of the tool, admins can easily separate good tools from bad ones, and users get tools that are reliable. The example below explains how to write a test.

Tests can be specified in the tool config file using the <tests> and <test> tags (for more information, see the description of the test configuration tags). For example, the cluster tool specifies the following tests:

  <tests>
    <test>
      <param name="input1" value="5.bed" />
      <param name="distance" value="1" />
      <param name="minregions" value="2" />
      <param name="returntype" value="1" />
      <output name="output" file="gops-cluster-1.bed" />
    </test>
    <test>
      <param name="input1" value="gops_cluster_bigint.bed" />
      <param name="distance" value="1" />
      <param name="minregions" value="2" />
      <param name="returntype" value="1" />
      <output name="output" file="gops-cluster-1.bed" />
    </test>
    <test>
      <param name="input1" value="5.bed" />
      <param name="distance" value="1" />
      <param name="minregions" value="2" />
      <param name="returntype" value="2" />
      <output name="output" file="gops-cluster-2.bed" />
    </test>
    <test>
      <param name="input1" value="5.bed" />
      <param name="distance" value="1" />
      <param name="minregions" value="2" />
      <param name="returntype" value="3" />
      <output name="output" file="gops-cluster-3.bed" />
    </test>
  </tests>

To explain what this means, let's first take a look at the inputs and outputs of the cluster tool. It takes four inputs (input1, distance, minregions, and returntype) and produces a single output:

  <inputs>
    <param format="interval" name="input1" type="data">
      <label>Cluster intervals of</label>
    </param>
    <param name="distance" size="5" type="integer" value="1" help="(bp)">
      <label>max distance between intervals</label>
    </param>
    <param name="minregions" size="5" type="integer" value="2">
      <label>min number of intervals per cluster</label>
    </param>
    <param name="returntype" type="select" label="Return type">
      <option value="1">Merge clusters into single intervals</option>
      <option value="2">Find cluster intervals; preserve comments and order</option>
      <option value="3">Find cluster intervals; output grouped by clusters</option>
      <option value="4">Find the smallest interval in each cluster</option>
      <option value="5">Find the largest interval in each cluster</option>
    </param>
  </inputs>
  <outputs>
    <data format="input" name="output" metadata_source="input1" />
  </outputs>

Now let's take a look at the first test:

    <test>
      <param name="input1" value="5.bed" />
      <param name="distance" value="1" />
      <param name="minregions" value="2" />
      <param name="returntype" value="1" />
      <output name="output" file="gops-cluster-1.bed" />
    </test>

All this does is specify the parameters that the test framework will use to run this test. For most input types, the value should be what a user would enter when running the tool through the web interface; the exceptions are the input and output files. The input (5.bed) and output (gops-cluster-1.bed) files reside in the ~/test-data directory. Once the test is executed, the framework simply compares the generated output with the example file (gops-cluster-1.bed in this case). If there are no differences, the test is declared a success.
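Conceptually, the framework's default check is just a line-by-line file comparison. The sketch below is a plain-Python illustration of that idea, not Galaxy's actual code:

```python
# Illustrative sketch only -- not Galaxy's implementation. The framework runs
# the tool with the declared parameters, then compares the generated output
# with the expected file from test-data/ line by line.

def outputs_match(generated_lines, expected_lines):
    """Return True when every generated line equals the expected line."""
    if len(generated_lines) != len(expected_lines):
        return False
    return all(g == e for g, e in zip(generated_lines, expected_lines))

# A test succeeds only when there are no differences at all:
expected = ["chr1\t100\t200\n", "chr1\t300\t400\n"]
print(outputs_match(expected, expected))              # True
print(outputs_match(["chr1\t100\t201\n"], expected))  # False
```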

To run the Galaxy functional tests see Running Tests.


Advanced Test Settings

Output File Comparison Methods

diff

The default comparison method (diff) simply compares line by line in a file to check if the result of the test run of the tool matches the expected output specified in the <output> tag. A lines_diff attribute can be provided to allow the declared number of lines to differ between outputs. A 'change' in a line is equivalent to a count of 2 line differences: one line removed, one line added.

    <test>
      <output name="output" file="variable_output_file.bed" lines_diff="10"/>     
    </test>
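The lines_diff accounting described above (a changed line counts as one removal plus one addition, i.e. 2 differences) can be sketched with Python's standard difflib. This illustrates the counting rule only; it is not Galaxy's implementation:

```python
import difflib

# Sketch of the default 'diff' comparison with a lines_diff allowance.
# difflib.ndiff marks removed lines with '- ' and added lines with '+ ',
# so one changed line contributes 2 to the difference count.

def diff_passes(generated, expected, lines_diff=0):
    delta = difflib.ndiff(expected, generated)
    # '? ' hint lines and unchanged '  ' lines are not counted.
    differences = sum(1 for line in delta if line.startswith(('- ', '+ ')))
    return differences <= lines_diff

expected = ["a\n", "b\n", "c\n"]
generated = ["a\n", "B\n", "c\n"]  # one changed line -> 2 differences
print(diff_passes(generated, expected, lines_diff=1))  # False
print(diff_passes(generated, expected, lines_diff=2))  # True
```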

Checking extra_files_path contents

Several tools, including those that use Composite Datatypes such as rGenetics, create additional files that are stored in a directory associated with the main history item. If you have a tool that creates these extra files, it is a good idea to write tests that also verify their correctness. This can be done on a per-file basis or by comparing an entire directory; all of the previously mentioned comparison methods are applicable.

The two examples below are from tools/peak_calling/macs_wrapper.xml.

File-by-file comparison

Here two outputs are being tested; the first file has no extra files, but the second file has five extra files (in addition to the primary file) which are being tested.

    <test>
      <output name="output_bed_file" file="peakcalling_macs/macs_test_1_out.bed" />
      <output name="output_html_file" file="peakcalling_macs/macs_test_1_out.html" compare="re_match" >
        <extra_files type="file" name="Galaxy_Test_Run_model.pdf" value="peakcalling_macs/test2/Galaxy_Test_Run_model.pdf" compare="re_match"/>
        <extra_files type="file" name="Galaxy_Test_Run_model.r" value="peakcalling_macs/test2/Galaxy_Test_Run_model.r" compare="re_match"/>
        <extra_files type="file" name="Galaxy_Test_Run_model.r.log" value="peakcalling_macs/test2/Galaxy_Test_Run_model.r.log"/>
        <extra_files type="file" name="Galaxy_Test_Run_negative_peaks.xls" value="peakcalling_macs/test2/Galaxy_Test_Run_negative_peaks.xls" compare="re_match"/>
        <extra_files type="file" name="Galaxy_Test_Run_peaks.xls" value="peakcalling_macs/test2/Galaxy_Test_Run_peaks.xls" compare="re_match"/>
      </output>
    </test>

Directory comparison

Here four outputs are being tested; the first three files have no extra files, but the last file has 5 extra files (in addition to the primary file) which are being tested by the directory method. Each file in the specified directory of output_html_file will be tested against the files of the same name in the history item's extra files path.

    <test>
      <output name="output_bed_file" file="peakcalling_macs/macs_test_1_out.bed" />
      <output name="output_xls_to_interval_peaks_file" file="peakcalling_macs/macs_test_2_peaks_out.interval" lines_diff="4" />
      <output name="output_xls_to_interval_negative_peaks_file" file="peakcalling_macs/macs_test_2_neg_peaks_out.interval" />
      <output name="output_html_file" file="peakcalling_macs/macs_test_1_out.html" compare="re_match" >
        <extra_files type="directory" value="peakcalling_macs/test2/" compare="re_match"/>
      </output>
    </test>
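A directory comparison like the one above can be pictured as walking the reference directory and checking each file against the file of the same name in the dataset's extra files path. The sketch below uses byte equality for simplicity (Galaxy would instead apply the declared compare method, re_match here, to each pair); it is a hypothetical illustration, not Galaxy's code:

```python
import os

# Hypothetical sketch of the type="directory" comparison: every file in the
# reference directory must have a counterpart of the same name in the history
# item's extra_files_path. Byte equality stands in for the declared compare
# method.

def directory_matches(reference_dir, extra_files_path):
    for name in sorted(os.listdir(reference_dir)):
        out_file = os.path.join(extra_files_path, name)
        if not os.path.isfile(out_file):
            return False  # an expected extra file is missing from the output
        with open(os.path.join(reference_dir, name), "rb") as ref, \
             open(out_file, "rb") as out:
            if ref.read() != out.read():
                return False
    return True
```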


Beware of twill bug

See the following e-mail for an explanation of a workaround that deals with "dashed" option values:

Hello Assaf,

This is a known bug in twill 0.9.  The work-around is to use the label rather than the value in your functional test.  So, in your example, the test should be changed to the following.  Let me know if this does not work.

One of the tests looks like this:
-------------
<test>
  <!-- ASCII to NUMERIC -->
  <param name="input" value="fastq_qual_conv1.fastq" />
  <param name="QUAL_FORMAT" value="Numeric quality scores" />
  <output name="output" file="fastq_qual_conv1.out" />
</test>
-------------

Greg Von Kuster
Galaxy Development Team


Assaf Gordon wrote:
Hello,

I wrote a functional test for my tool, and encountered a strange behavior.

One of the tool's parameters looks like this:
-------------
<param name="QUAL_FORMAT" type="select" label="output format">
     <option value="-a">ASCII (letters) quality scores</option>
     <option value="-n">Numeric quality scores</option>
</param>
------------

One of the tests looks like this:
-------------
<test>
   <!-- ASCII to NUMERIC -->
   <param name="input" value="fastq_qual_conv1.fastq" />
   <param name="QUAL_FORMAT" value="-n" />
   <output name="output" file="fastq_qual_conv1.out" />
</test>
-------------


When I run the functional tests for this tool, I get the following exception:
---------------
Traceback (most recent call last):
File "galaxy_devel_tools/test/functional/test_toolbox.py", line 114, in test_tool
    self.do_it()
File "galaxy_devel_tools/test/functional/test_toolbox.py", line 44, in do_it
    self.run_tool( self.testdef.tool.id, repeat_name=repeat_name, **page_inputs )
File "galaxy_devel_tools/test/base/twilltestcase.py", line 520, in run_tool
    self.submit_form( **kwd )
  File "galaxy_devel_tools/test/base/twilltestcase.py", line 495, in submit_form
    raise AssertionError( errmsg )
AssertionError: Attempting to set field 'QUAL_FORMAT' to value '['-n']' in form 'tool_form' threw exception: cannot find value/label "n" in list control
control: <SelectControl(QUAL_FORMAT=[-a, -n])>
---------------

If I understand the exception correctly, it means that somewhere the minus character ("-n") gets dropped, and therefore the value 'n' cannot be found in the list (which contains "-n" and "-a").



Is this an actual bug or am I doing something wrong?

Thanks,
   Gordon.


Saving generated functional test output files

A small change to the test framework was introduced in April 2011 that allows test outputs generated by Twill during functional tests to be saved, making it easier to update expected test outputs after changes to a tool.

If a variable called 'GALAXY_TEST_SAVE' is present in the environment when tests are being run, each output file Twill generates that is compared with a reference file will be written to the directory it names (assuming write permissions and so on). For example:

setenv GALAXY_TEST_SAVE /tmp/galtest
sh run_functional_tests.sh -id myTool

will test the individual tool with id 'myTool' and write the tested output files to /tmp/galtest. Running a full set of functional tests will of course result in a full set of test outputs being saved. To stop test outputs from being saved, reset GALAXY_TEST_SAVE to null:

setenv GALAXY_TEST_SAVE
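Inside a test harness, honoring a save directory like this amounts to a small conditional on the environment. The sketch below illustrates the idea; it is not the framework's actual code, and maybe_save_output is a hypothetical helper:

```python
import os

# Hypothetical helper illustrating the GALAXY_TEST_SAVE behaviour: when the
# variable names a directory, every generated output that gets compared with
# a reference file is also written there so reference files can be refreshed.

def maybe_save_output(output_name, data, environ=os.environ):
    save_dir = environ.get("GALAXY_TEST_SAVE")
    if not save_dir:
        return None  # variable unset or empty: outputs are not saved
    os.makedirs(save_dir, exist_ok=True)
    path = os.path.join(save_dir, output_name)
    with open(path, "wb") as fh:
        fh.write(data)
    return path
```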

See also the environment variable GALAXY_TEST_NO_CLEANUP, which disables automated removal of the test output files.