ToolShed/Contributions/2016_03

Galaxy ToolShed

Tools contributed to the Galaxy Project Tool Shed in March 2016.

New Tools

repository_suite_definition

  • From bornea:

    • apostl_tools: APOSTL is an interactive affinity proteomics analysis software developed to reformat affinity proteomics data (both spectral counting and MS1) for input into the SAINTexpress statistical package and to visualize the output(s).

unrestricted

  • From youngkim:

    • ezbamqc: Quality control tool for NGS mapping files. ezBAMQC is a tool to check the quality of either one or many mapped next-generation-sequencing datasets. It conducts comprehensive evaluations of aligned sequencing data from multiple aspects including: clipping profile, mapping quality distribution, mapped read length distribution, genomic/transcriptomic mapping distribution, inner distance distribution (for paired-end reads), ribosomal RNA contamination, transcript 5′ and 3′ end bias, transcription dropout rate, sample correlations, sample reproducibility, sample variations. It outputs a set of tables and plots and one HTML page that contains a summary of the results. Many metrics are designed for RNA-seq data specifically, but ezBAMQC can be applied to any mapped sequencing dataset such as RNA-seq, CLIP-seq, GRO-seq, ChIP-seq, DNA-seq and so on.

  • From alenail:

  • From dcorreia:

    • newick_display: Newick display from Newick utilities

    • noisy: Noisy Identification of problematic columns in multiple sequence alignments

  • From bornea:

  • From iuc:

  • From drosofff:

    • sam_to_fastq: extracts reads and their sequence quality from a SAM alignment file and returns a fastq file
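
The conversion that sam_to_fastq describes can be illustrated with a minimal sketch. This is not the tool's own code: the function name `sam_line_to_fastq` is hypothetical, and the choices to skip secondary/supplementary alignments and records without stored sequence or quality are assumptions made for the example. The field positions and flag bits follow the SAM format (SEQ and QUAL are columns 10 and 11; flag 0x10 marks a read stored as its reverse complement).

```python
from typing import Optional

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_line_to_fastq(line: str) -> Optional[str]:
    """Convert one SAM alignment line to a FASTQ record, or return None."""
    if line.startswith("@"):
        return None  # header line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return None  # not a complete alignment record
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
    if flag & 0x900:
        return None  # assumption: skip secondary (0x100) and supplementary (0x800)
    if seq == "*" or qual == "*":
        return None  # sequence or quality not stored in this record
    if flag & 0x10:
        # Read was aligned to the reverse strand: restore the original orientation.
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]
    return f"@{name}\n{seq}\n+\n{qual}\n"
```

For a reverse-strand record, the sequence is reverse-complemented and the quality string reversed so that the FASTQ record matches the read as it was sequenced.
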

  • From bgruening:

    • protease_prediction: This tool can learn the cleavage specificity of a given class of proteases. In a second step, this can be used to predict proteases given a cleavage site. The method assumes that the candidate cleavage point is between the two amino acids adjacent to the central position.

  • From adefelicibus:

  • From chrisb:

  • From nanettec:

    • hotspots: Identify eQTL hotspots using permutation threshold and chi-squared test. This is the fifth tool in the eQTL backend pipeline:

      • lookup, classification, frequency, sliding window frequency, hotspots, GO enrichment
      • Link to the workflow (for import into Galaxy): http://chewbacca.bi.up.ac.za:8080/u/nanette/w/back-end-workflow-2

      • Identify the maximum number of eQTLs expected by chance per cM using a permutation approach.
      • Eliminate differential gene density as an explanatory factor for eQTL hotspots by performing a chi-squared test per bin.
      • Calculate the proportion of genes to eQTLs, use this as the population estimate, and test the null hypothesis that the number of genes and eQTLs in each interval is consistent.
      • Mark bins where the expected number (genes + eQTLs) is below 5 (an assumption of the chi-squared test is that it is 5 or more). For these bins the chi-squared test cannot be performed.
      • Extract lists of eQTLs linked to each unbiased eQTL hotspot.
      • Genome-wide eQTL frequency plots.
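
The per-bin chi-squared step can be sketched as follows. This is an illustration under stated assumptions, not the hotspots tool's implementation: the function name `chi_squared_per_bin` is hypothetical, the inputs are assumed to be already-counted genes and eQTLs per bin, and the "expected count of 5 or more" rule is interpreted here as applying to each of the two expected counts in a bin. The constant 3.841 is the chi-squared critical value for 1 degree of freedom at a significance level of 0.05.

```python
from typing import List, Optional

# Critical value of the chi-squared distribution, 1 d.f., alpha = 0.05.
CHI2_CRIT_DF1_ALPHA_05 = 3.841


def chi_squared_per_bin(genes: List[int], eqtls: List[int],
                        min_expected: float = 5.0) -> List[Optional[float]]:
    """Return one chi-squared statistic per bin, or None where untestable."""
    total_genes = sum(genes)
    total_eqtls = sum(eqtls)
    total = total_genes + total_eqtls
    # Genome-wide proportions serve as the population estimates.
    p_gene = total_genes / total
    p_eqtl = total_eqtls / total

    results: List[Optional[float]] = []
    for g, e in zip(genes, eqtls):
        n = g + e
        exp_g = n * p_gene
        exp_e = n * p_eqtl
        if exp_g < min_expected or exp_e < min_expected:
            results.append(None)  # test assumption violated: mark the bin
            continue
        stat = (g - exp_g) ** 2 / exp_g + (e - exp_e) ** 2 / exp_e
        results.append(stat)
    return results


if __name__ == "__main__":
    stats = chi_squared_per_bin([10, 10, 10, 2], [10, 10, 40, 1])
    for i, s in enumerate(stats):
        if s is None:
            print(i, "not testable")
        else:
            print(i, round(s, 3), s > CHI2_CRIT_DF1_ALPHA_05)
```

A bin whose statistic exceeds the critical value has a gene-to-eQTL ratio that differs from the genome-wide ratio, so its eQTL count is not explained by gene density alone.
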
    • go_enrichment: GO enrichment of eQTL hotspot gene lists. This is the last tool in the eQTL backend pipeline.

    • integrate_15: Integrates eQTL results after 15 parallel runs. This is the last tool in the eQTL mapper workflow.

    • split_15: Split e-traits into 15 .inp files. This is the first tool in the eQTL mapper workflow:

      • split_15, qtlmap_15, save_z_15, integrate_15
      • Link to the workflow (for import into Galaxy): http://chewbacca.bi.up.ac.za:8080/u/nanette/w/15-parallel

      • The input e-trait data are split into 15 .inp files (each containing the same number of e-traits).
      • This makes 15 parallel runs possible, to reduce the running time of large eQTL datasets.
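
The splitting step can be illustrated with a short sketch. The function name `split_evenly` is hypothetical and the code does not read or write the .inp format; it only shows how a list of e-traits could be divided into 15 parts. When the number of e-traits is not a multiple of 15, this sketch assumes the part sizes may differ by one.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_evenly(items: Sequence[T], n_parts: int = 15) -> List[List[T]]:
    """Split items into n_parts consecutive chunks whose sizes differ by at most one."""
    if n_parts <= 0:
        raise ValueError("n_parts must be positive")
    base, extra = divmod(len(items), n_parts)
    parts: List[List[T]] = []
    start = 0
    for i in range(n_parts):
        # The first `extra` parts receive one additional item.
        size = base + (1 if i < extra else 0)
        parts.append(list(items[start:start + size]))
        start += size
    return parts
```

Each of the resulting parts could then be written to its own input file and processed independently.
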
    • qtlmap_15: Run QTL Cartographer and parse the results. This is the second tool in the eQTL mapper workflow:

      • split_15, qtlmap_15, save_z_15, integrate_15
      • Link to the workflow (for import into Galaxy): http://chewbacca.bi.up.ac.za:8080/u/nanette/w/15-parallel

      • QTL Cartographer is employed for eQTL mapping. The results are parsed.
      • This tool must be executed 15 times in parallel, each time with a different .inp input file (the same .map and parameters.txt input files are used every time). Each execution will produce one .txt output file.
    • save_z_15: Run QTL Cartographer and save partial z file. This is the third tool in the eQTL mapper workflow:

      • split_15, qtlmap_15, save_z_15, integrate_15
      • Link to the workflow (for import into Galaxy): http://chewbacca.bi.up.ac.za:8080/u/nanette/w/15-parallel

      • QTL Cartographer is employed for eQTL mapping.
      • A partial z file is saved; this is an input file for the eQTL backend pipeline.
    • frequency: Frequency of eQTLs and genes. This is the third tool in the eQTL backend pipeline.

    • lookup: Lookup table for cM intervals. This is the first tool in the eQTL backend pipeline:

      • lookup, classification, frequency, sliding window frequency, hotspots, GO enrichment
      • Link to the workflow (for import into Galaxy): http://chewbacca.bi.up.ac.za:8080/u/nanette/w/back-end-workflow-2

      • The information from the Markers file and the QTL Cartographer Z file is combined to proportionally estimate a base pair position at each “QTL Cartographer bin” (e.g. 2 cM intervals).
      • The Lookup file can then serve as a lookup table to convert between base pair and centimorgan positions.
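
One way to read "proportionally estimate" is linear interpolation between the two markers that flank a bin. The sketch below makes that assumption explicit; the function name `cm_to_bp` and the example marker values are hypothetical, and the inputs are assumed to be already parsed into sorted lists for a single chromosome rather than read from the Markers and Z files.

```python
from bisect import bisect_right
from typing import Sequence


def cm_to_bp(cm: float, marker_cm: Sequence[float],
             marker_bp: Sequence[float]) -> float:
    """Estimate a base pair position for a cM position by linear interpolation.

    marker_cm must be sorted in increasing order; marker_bp holds the base
    pair position of each marker. Positions outside the marker range are
    clamped to the first or last marker.
    """
    if cm <= marker_cm[0]:
        return float(marker_bp[0])
    if cm >= marker_cm[-1]:
        return float(marker_bp[-1])
    i = bisect_right(marker_cm, cm)  # marker_cm[i-1] <= cm < marker_cm[i]
    c0, c1 = marker_cm[i - 1], marker_cm[i]
    b0, b1 = marker_bp[i - 1], marker_bp[i]
    fraction = (cm - c0) / (c1 - c0)
    return b0 + fraction * (b1 - b0)


if __name__ == "__main__":
    # Hypothetical markers: (cM, bp) pairs along one chromosome.
    cms = [0.0, 10.0, 30.0]
    bps = [0, 1_000_000, 5_000_000]
    # Build a lookup table with one row per 2 cM bin.
    for position in range(0, 31, 2):
        print(position, round(cm_to_bp(position, cms, bps)))
```

The reverse conversion (base pair to centimorgan) follows by swapping the roles of the two lists.
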
    • frequency_sliding: Sliding window frequency of eQTLs and genes. This is the fourth tool in the eQTL backend pipeline:

      • lookup, classification, frequency, sliding window frequency, hotspots, GO enrichment
      • Link to the workflow (for import into Galaxy): http://chewbacca.bi.up.ac.za:8080/u/nanette/w/back-end-workflow-2

      • Combine x cM intervals (size of lookup bins; for example 2 cM), to be used in a sliding window approach.
      • For 2 cM lookup bins:
        • For two intervals per sliding window, intervals smaller than 2 cM are combined with their two flanking 2 cM intervals.
          • Calculate the number of eQTLs per sliding window (4 - 5.9 cM intervals).
          • Calculate the number of genes per sliding window (4 - 5.9 cM intervals).
        • For three intervals per sliding window, intervals smaller than 2 cM are combined with their three flanking 2 cM intervals.
          • Calculate the number of eQTLs per sliding window (6 - 7.9 cM intervals).
          • Calculate the number of genes per sliding window (6 - 7.9 cM intervals).
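
The counting part of the sliding window approach can be sketched as a running sum over per-bin counts. The function name `sliding_window_sums` is hypothetical, and this simplified example does not reproduce the tool's special handling of intervals smaller than 2 cM; it assumes every bin has the same width and the window advances one bin at a time.

```python
from typing import List, Sequence


def sliding_window_sums(counts: Sequence[int], window: int) -> List[int]:
    """Sum counts over each run of `window` consecutive bins (step of one bin)."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(counts) < window:
        return []
    total = sum(counts[:window])
    sums = [total]
    for i in range(window, len(counts)):
        # Slide by one bin: add the entering bin, drop the leaving bin.
        total += counts[i] - counts[i - window]
        sums.append(total)
    return sums
```

With 2 cM bins, `window=2` corresponds to two intervals per sliding window and `window=3` to three; the same function can be applied to the eQTL counts and to the gene counts.
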
    • classifier: Classify eQTLs as cis or trans. This is the second tool in the eQTL backend pipeline:

      • lookup, classification, frequency, sliding window frequency, hotspots, GO enrichment
      • Link to the workflow (for import into Galaxy): http://chewbacca.bi.up.ac.za:8080/u/nanette/w/back-end-workflow-2

      • Calculates the average genetic interval size across all eQTLs.
      • Classifies an eQTL as 'cis' if it maps within half the above-mentioned interval size of the gene exhibiting the eQTL.
      • Classifies an eQTL as 'trans' if it maps to a different region on the genome than the location of the gene exhibiting the eQTL (further away than half the above-mentioned interval size from the gene).
      • Classifies an eQTL as 'no_result' if the location of the target gene is not known.
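
The three classification rules can be written out as a small sketch. The names `average_interval_size` and `classify_eqtl`, the (chromosome, cM) tuple layout, and the treatment of a distance exactly equal to half the interval size as 'cis' are assumptions made for this example; the tool's own input format and boundary handling may differ.

```python
from typing import List, Optional, Tuple

Position = Tuple[str, float]  # (chromosome, position in cM)


def average_interval_size(intervals: List[Tuple[float, float]]) -> float:
    """Mean size of the genetic intervals (start, end) of all eQTLs."""
    return sum(end - start for start, end in intervals) / len(intervals)


def classify_eqtl(eqtl: Position, gene: Optional[Position],
                  avg_interval: float) -> str:
    """Classify an eQTL as 'cis', 'trans' or 'no_result'."""
    if gene is None:
        return "no_result"  # location of the target gene is not known
    eqtl_chrom, eqtl_pos = eqtl
    gene_chrom, gene_pos = gene
    same_region = (eqtl_chrom == gene_chrom
                   and abs(eqtl_pos - gene_pos) <= avg_interval / 2)
    return "cis" if same_region else "trans"
```

For example, with an average interval size of 15 cM, an eQTL 7 cM from its gene on the same chromosome would be 'cis', while one 15 cM away, or on another chromosome, would be 'trans'.
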
  • From chrisd:

    • amrplusplus_workflow: workflow for analyzing metagenomic sequence data.

    • snipfinder: Snip caller for single and paired-end alignments. For each alignment (or pair of alignments), snips are recorded and represented as a single haplotype.

    • coverage_sampler: Calculates the amount of a gene that is covered from a sample of reads

  • From aafc-mbb:

    • itsx: ITSx identifies ITS sequences and extracts the ITS region. ITSx is an open source software utility to extract the highly variable ITS1 and ITS2 subregions from ITS sequences, which are commonly used as a molecular barcode for e.g. fungi. As the inclusion of parts of the neighbouring, very conserved, ribosomal genes (SSU, 5S and LSU rRNA sequences) in the sequence identification process can lead to severely misleading results, ITSx identifies and extracts only the ITS regions themselves. For more information regarding the settings of the tool, please visit the ITSx Users Guide on http://microbiology.se/publ/itsx_users_guide.

    • quast: QUAST (Quality Assessment Tool) evaluates genome assemblies by computing various metrics.

  • From abretaud:

  • From yhoogstrate:

  • From galaxyp:

    • uniprotxml_downloader: Download UniProt proteome in XML format. The Morpheus proteomics search engine uses the uniprotxml format.

tool_dependency_definition