RCAC Biocontainers documentation!

This is the user guide for biocontainer modules deployed on Purdue High Performance Computing clusters. More information about our center is available here (https://www.rcac.purdue.edu).
If you have any questions, contact me (Yucheng Zhang) at: zhan4429@purdue.edu
Warning
Do not use both bioinfo and biocontainers in your job script, because loading bioinfo will cause many modules, including biocontainers, to fail to load on Brown, Halstead, Scholar, Workbench, and Gilbreth. Since RCAC will not support bioinfo on future clusters, we recommend that users use only biocontainers.
Frequently Asked Questions
- What are the advantages of using biocontainers?
Biocontainers are based on popular container technology. Because containers are easy to deploy and portable, RCAC can deploy a large number of bioinformatics applications on our clusters and keep adding newer versions. In addition, containerized applications can help improve the reproducibility of scientists’ research. Using biocontainers, you can generate the same results no matter which cluster you are using, and no matter whether you run the program today or 10 years later.
- Can we use both bioinfo and biocontainers in our job script?
No. If you load bioinfo, you will find that you cannot load biocontainers. This is a legacy issue, and all clusters are affected except Bell. You can use either bioinfo or biocontainers in your job script, but not both.
- How should I load biocontainers after I load bioinfo? The error message shows “biocontainers” is unknown.
Run the commands below:
module purge
module load modtree/new
module load biocontainers
- Why can I not find the path to executables with which?
Biocontainers’ executables are located inside containers instead of on the host system of the cluster. The commands we provide are actually aliases to singularity exec /apps/biocontainers/images/image.sif command. For example, the blastp command you use is actually singularity exec /apps/biocontainers/images/blast.sif blastp. For applications that require users to provide an executable path, such as RSEM and MAKER, please check the specific user guides we provide.
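The alias pattern described above can be sketched as a small helper that only prints the expanded command line instead of running it. The function name expand_wrapper and the IMAGE_DIR variable are hypothetical and used for illustration only; the image directory is taken from the example path in this guide, and the actual wrappers on the clusters may be implemented differently.

```shell
#!/bin/bash
# Hypothetical helper: print the full command that a biocontainer
# wrapper would run, following the pattern described in this guide.
# IMAGE_DIR is taken from the example path above (an assumption).
IMAGE_DIR=/apps/biocontainers/images

expand_wrapper() {
    local image="$1"   # image file name, e.g. blast.sif
    shift
    # "$*" joins the remaining arguments (the command and its options)
    printf 'singularity exec %s/%s %s\n' "$IMAGE_DIR" "$image" "$*"
}

# Show what a call to blastp expands to (nothing is executed here).
expand_wrapper blast.sif blastp -help
```

Printing the command rather than executing it keeps the sketch usable on machines without Singularity installed.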
Singularity
Note: Singularity was originally a project out of Lawrence Berkeley National Laboratory. It has now been spun off into a distinct offering under a new corporate entity under the name Sylabs Inc. This guide pertains to the open source community edition, SingularityCE.
What is Singularity?
Singularity is a new feature of the Community Clusters allowing the portability and reproducibility of operating system and application environments through the use of Linux containers. It gives users complete control over their environment.
Singularity is like Docker but tuned explicitly for HPC clusters. More information is available from the project’s website.
Features
Run the latest applications on an Ubuntu or CentOS userland
Gain access to the latest developer tools
Launch MPI programs easily
Much more
Singularity’s user guide is available at: sylabs.io/guides/3.8/user-guide
Example
Here is an example using an Ubuntu 16.04 image on Weber:
singularity exec /depot/itap/singularity/ubuntu1604.img cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04 LTS"
Here is another example using a CentOS 7 image:
singularity exec /depot/itap/singularity/centos7.img cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
Purdue Cluster Specific Notes
All service providers will integrate Singularity slightly differently depending on site. The largest customization will be which default files are inserted into your images so that routine services will work.
Services we configure for your images include DNS settings and account information. File systems we overlay into your images are your home directory, scratch, Data Depot, and application file systems.
Here is a list of paths:
/etc/resolv.conf
/etc/hosts
/home/$USER
/apps
/scratch
/depot
This means that within the container environment these paths will be present and the same as outside the container. The /apps, /scratch, and /depot directories will need to exist inside your container to work properly.
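As a rough sketch of the requirement above, the following helper lists which of the three directories are missing from a container root that is available as a plain directory tree (for example, a sandbox directory as built with --sandbox later in this guide). The function name missing_mount_points is hypothetical, and the demonstration uses a throwaway directory rather than a real container.

```shell
#!/bin/bash
# Hypothetical helper: report which bind-mount targets are missing
# from a container root that is unpacked as a directory tree.
missing_mount_points() {
    local root="$1" dir
    for dir in apps scratch depot; do
        [ -d "$root/$dir" ] || printf '/%s\n' "$dir"
    done
}

# Demonstration on a throwaway directory that only contains /apps.
demo_root=$(mktemp -d)
mkdir "$demo_root/apps"
missing_mount_points "$demo_root"
rm -rf "$demo_root"
```

An empty result means all three directories are present in the tree that was checked.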
Creating Singularity Images
Due to how singularity containers work, you must have root privileges to build an image. Once you have a singularity container image built on your own system, you can copy the image file up to the cluster (you do not need root privileges to run the container).
Information and documentation on how to install and use Singularity on your own system is available from the project’s website.
We have version 3.8.0-1.el7 on the cluster. You will most likely not be able to run any container built with a Singularity version newer than that, so be sure to follow the installation guide for version 3.8 on your system:
singularity --version
singularity version 3.8.0-1.el7
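A minimal sketch of the version check implied above, assuming GNU sort with the -V (version sort) option is available. The helper name version_le and the hard-coded local version are hypothetical; in practice the local version would come from the output of singularity --version on your own system, with any package suffix such as -1.el7 removed.

```shell
#!/bin/bash
# Cluster version from this guide, without the -1.el7 package suffix.
CLUSTER_VERSION=3.8.0

# Hypothetical helper: succeed if version $1 is not newer than version $2.
# Relies on GNU sort's -V (version sort) option.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example with a hard-coded local version (an assumption for illustration).
local_version=3.7.4
if version_le "$local_version" "$CLUSTER_VERSION"; then
    echo "Local $local_version is not newer than cluster $CLUSTER_VERSION"
else
    echo "Local $local_version is newer than cluster $CLUSTER_VERSION; images may not run"
fi
```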
Everything you need on how to build a container is available from their user-guide. Below are merely some quick tips for getting your own containers built for Weber.
You can use a Definition File to both build your container and share its specification with collaborators (for the sake of reproducibility). Here is a simplistic example of such a file:
# FILENAME: Buildfile
Bootstrap: docker
From: ubuntu:18.04
%post
apt-get update && apt-get upgrade -y
mkdir /apps /depot /scratch
To build the image itself:
sudo singularity build ubuntu-18.04.sif Buildfile
The challenge with this approach, however, is that it must start from scratch if you decide to change something. In order to create a container image iteratively and interactively, you can use the --sandbox option:
sudo singularity build --sandbox ubuntu-18.04 docker://ubuntu:18.04
This will not create a flat image file but a directory tree (i.e., a folder), the contents of which are the container’s filesystem. In order to get a shell inside the container that allows you to modify it, use the --writable option:
sudo singularity shell --writable ubuntu-18.04
Singularity: Invoking an interactive shell within container...
Singularity ubuntu-18.04.sandbox:~>
You can then proceed to install any libraries, software, etc. within the container. Then, to create the final image file, exit the shell and call the build command once more on the sandbox:
sudo singularity build ubuntu-18.04.sif ubuntu-18.04
Finally, copy the new image to Weber and run it.
Abacas
Introduction
Abacas is a tool for algorithm-based automatic contiguation of assembled sequences.
Versions
1.3.1
Commands
abacas.pl
abacas.1.3.1.pl
Module
You can load the modules by:
module load biocontainers
module load abacas
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Abacas on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=abacas
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abacas
abacas.pl -r cmm.fasta -q Cm.contigs.fasta -p nucmer -o out_prefix
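The shebang warning above applies to every job script in this guide, so it can be worth checking a script before submitting it. The following is a minimal sketch; the helper name has_bash_shebang is hypothetical, and the demonstration runs on two throwaway files rather than on real job scripts.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the first line of a job script
# is exactly the recommended shebang.
has_bash_shebang() {
    local first_line
    IFS= read -r first_line < "$1"
    [ "$first_line" = '#!/bin/bash' ]
}

# Demonstration on two throwaway scripts.
tmpdir=$(mktemp -d)
printf '#!/bin/bash\necho ok\n' > "$tmpdir/good.sh"
printf '#!/bin/sh -l\necho ok\n' > "$tmpdir/bad.sh"
for script in "$tmpdir/good.sh" "$tmpdir/bad.sh"; do
    if has_bash_shebang "$script"; then
        printf '%s: shebang OK\n' "$(basename "$script")"
    else
        printf '%s: please use #!/bin/bash\n' "$(basename "$script")"
    fi
done
rm -rf "$tmpdir"
```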
Abismal
Introduction
Another Bisulfite Mapping Algorithm (abismal) is a read mapping program for bisulfite sequencing in DNA methylation studies.
Versions
3.0.0
Commands
abismal
abismalidx
simreads
Module
You can load the modules by:
module load biocontainers
module load abismal
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run abismal on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=abismal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abismal
abismalidx ~/.local/share/genomes/hg38/hg38.fa hg38
Abpoa
Introduction
abPOA: adaptive banded Partial Order Alignment
Versions
1.4.1
Commands
abpoa
Module
You can load the modules by:
module load biocontainers
module load abpoa
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run abpoa on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=abpoa
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abpoa
abpoa seq.fa > cons.fa
Abricate
Introduction
Abricate is a tool for mass screening of contigs for antimicrobial resistance or virulence genes.
Versions
1.0.1
Commands
abricate
Module
You can load the modules by:
module load biocontainers
module load abricate
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Abricate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=abricate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abricate
abricate --threads 8 *.fasta
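In the job above, the thread count passed to abricate repeats the value given to #SBATCH -n. One possible way to keep the two in sync is to read the SLURM_NTASKS environment variable that Slurm sets for jobs submitted with -n. The sketch below assumes that variable is present inside the job; the helper name job_threads is hypothetical, and the final line only prints the resulting command instead of running it.

```shell
#!/bin/bash
# Hypothetical helper: use the task count Slurm exports for the job
# (SLURM_NTASKS, set when -n/--ntasks is given), falling back to 1
# when the script runs outside a Slurm job.
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Print the command that would be run with the derived thread count.
echo "abricate --threads $(job_threads) *.fasta"
```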
Abyss
Introduction
ABySS is a de novo sequence assembler intended for short paired-end reads and genomes of all sizes.
Versions
2.3.2
2.3.4
Commands
ABYSS
ABYSS-P
AdjList
Consensus
DAssembler
DistanceEst
DistanceEst-ssq
KAligner
MergeContigs
MergePaths
Overlap
ParseAligns
PathConsensus
PathOverlap
PopBubbles
SimpleGraph
abyss-align
abyss-bloom
abyss-bloom-dbg
abyss-bowtie
abyss-bowtie2
abyss-bwa
abyss-bwamem
abyss-bwasw
abyss-db-txt
abyss-dida
abyss-fac
abyss-fatoagp
abyss-filtergraph
abyss-fixmate
abyss-fixmate-ssq
abyss-gapfill
abyss-gc
abyss-index
abyss-junction
abyss-kaligner
abyss-layout
abyss-longseqdist
abyss-map
abyss-map-ssq
abyss-mergepairs
abyss-overlap
abyss-paired-dbg
abyss-paired-dbg-mpi
abyss-pe
abyss-rresolver-short
abyss-samtoafg
abyss-scaffold
abyss-sealer
abyss-stack-size
abyss-tabtomd
abyss-todot
abyss-tofastq
konnector
logcounter
Module
You can load the modules by:
module load biocontainers
module load abyss
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run abyss on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=abyss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abyss
abyss-pe np=4 k=25 name=test B=1G \
in='test-data/reads1.fastq test-data/reads2.fastq'
Actc
Introduction
Actc is used to align subreads to ccs reads.
Versions
0.2.0
Commands
actc
Module
You can load the modules by:
module load biocontainers
module load actc
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run actc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=actc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers actc
actc subreads.bam ccs.bam subreads_to_ccs.bam
Adapterremoval
Introduction
AdapterRemoval searches for and removes adapter sequences from High-Throughput Sequencing (HTS) data and (optionally) trims low quality bases from the 3’ end of reads following adapter removal. AdapterRemoval can analyze both single end and paired end data, and can be used to merge overlapping paired-ended reads into (longer) consensus sequences. Additionally, AdapterRemoval can construct a consensus adapter sequence for paired-ended reads for which this information is not available.
Versions
2.3.3
Commands
AdapterRemoval
Module
You can load the modules by:
module load biocontainers
module load adapterremoval
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run adapterremoval on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=adapterremoval
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers adapterremoval
AdapterRemoval --file1 input_1.fastq --file2 input_2.fastq
Advntr
Introduction
Advntr is a tool for genotyping Variable Number Tandem Repeats (VNTR) from sequence data.
Versions
1.4.0
1.5.0
Commands
advntr
Module
You can load the modules by:
module load biocontainers
module load advntr
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Advntr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=advntr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers advntr
advntr addmodel -r chr21.fa -p CGCGGGGCGGGG -s 45196324 -e 45196360 -c chr21
advntr genotype --vntr_id 1 --alignment_file CSTB_2_5_testdata.bam --working_directory working_dir
Afplot
Introduction
Afplot is a tool to plot allele frequencies in VCF files.
Versions
0.2.1
Commands
afplot
Module
You can load the modules by:
module load biocontainers
module load afplot
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run afplot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=afplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers afplot
afplot whole-genome histogram -v my_vcf.gz -l my_label -s my_sample -o mysample.histogram.png
Afterqc
Introduction
Afterqc is a tool for quality control of FASTQ data produced by HiSeq 2000/2500/3000/4000, NextSeq 500/550, MiniSeq, and Illumina 1.8 or newer.
Versions
0.9.7
Commands
after.py
Module
You can load the modules by:
module load biocontainers
module load afterqc
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Afterqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=afterqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers afterqc
after.py -1 SRR11941281_1.fastq.paired.fq -2 SRR11941281_2.fastq.paired.fq
Agat
Introduction
Agat is a suite of tools to handle gene annotations in any GTF/GFF format.
Versions
0.8.1
Commands
agat_convert_bed2gff.pl
agat_convert_embl2gff.pl
agat_convert_genscan2gff.pl
agat_convert_mfannot2gff.pl
agat_convert_minimap2_bam2gff.pl
agat_convert_sp_gff2bed.pl
agat_convert_sp_gff2gtf.pl
agat_convert_sp_gff2tsv.pl
agat_convert_sp_gff2zff.pl
agat_convert_sp_gxf2gxf.pl
agat_sp_Prokka_inferNameFromAttributes.pl
agat_sp_add_introns.pl
agat_sp_add_start_and_stop.pl
agat_sp_alignment_output_style.pl
agat_sp_clipN_seqExtremities_and_fixCoordinates.pl
agat_sp_compare_two_BUSCOs.pl
agat_sp_compare_two_annotations.pl
agat_sp_complement_annotations.pl
agat_sp_ensembl_output_style.pl
agat_sp_extract_attributes.pl
agat_sp_extract_sequences.pl
agat_sp_filter_by_ORF_size.pl
agat_sp_filter_by_locus_distance.pl
agat_sp_filter_by_mrnaBlastValue.pl
agat_sp_filter_feature_by_attribute_presence.pl
agat_sp_filter_feature_by_attribute_value.pl
agat_sp_filter_feature_from_keep_list.pl
agat_sp_filter_feature_from_kill_list.pl
agat_sp_filter_gene_by_intron_numbers.pl
agat_sp_filter_gene_by_length.pl
agat_sp_filter_incomplete_gene_coding_models.pl
agat_sp_filter_record_by_coordinates.pl
agat_sp_fix_cds_phases.pl
agat_sp_fix_features_locations_duplicated.pl
agat_sp_fix_fusion.pl
agat_sp_fix_longest_ORF.pl
agat_sp_fix_overlaping_genes.pl
agat_sp_fix_small_exon_from_extremities.pl
agat_sp_flag_premature_stop_codons.pl
agat_sp_flag_short_introns.pl
agat_sp_functional_statistics.pl
agat_sp_keep_longest_isoform.pl
agat_sp_kraken_assess_liftover.pl
agat_sp_list_short_introns.pl
agat_sp_load_function_from_protein_align.pl
agat_sp_manage_IDs.pl
agat_sp_manage_UTRs.pl
agat_sp_manage_attributes.pl
agat_sp_manage_functional_annotation.pl
agat_sp_manage_introns.pl
agat_sp_merge_annotations.pl
agat_sp_prokka_fix_fragmented_gene_annotations.pl
agat_sp_sensitivity_specificity.pl
agat_sp_separate_by_record_type.pl
agat_sp_statistics.pl
agat_sp_webApollo_compliant.pl
agat_sq_add_attributes_from_tsv.pl
agat_sq_add_hash_tag.pl
agat_sq_add_locus_tag.pl
agat_sq_count_attributes.pl
agat_sq_filter_feature_from_fasta.pl
agat_sq_list_attributes.pl
agat_sq_manage_IDs.pl
agat_sq_manage_attributes.pl
agat_sq_mask.pl
agat_sq_remove_redundant_entries.pl
agat_sq_repeats_analyzer.pl
agat_sq_rfam_analyzer.pl
agat_sq_split.pl
agat_sq_stat_basic.pl
Module
You can load the modules by:
module load biocontainers
module load agat
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Agat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=agat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers agat
agat_convert_sp_gff2bed.pl --gff genes.gff -o genes.bed
Agfusion
Introduction
AGFusion (pronounced ‘A G Fusion’) is a python package for annotating gene fusions from the human or mouse genomes.
Versions
1.3.11
Commands
agfusion
Module
You can load the modules by:
module load biocontainers
module load agfusion
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run agfusion on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=agfusion
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers agfusion
Alfred
Introduction
Alfred is an efficient and versatile command-line application that computes multi-sample quality control metrics in a read-group aware manner.
Versions
0.2.5
0.2.6
Commands
alfred
Module
You can load the modules by:
module load biocontainers
module load alfred
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Alfred on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=alfred
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alfred
alfred qc -r genome.fasta -o qc.tsv.gz sorted.bam
Alien-hunter
Introduction
Alien-hunter is an application for the prediction of putative Horizontal Gene Transfer (HGT) events with the implementation of Interpolated Variable Order Motifs (IVOMs).
Versions
1.7.7
Commands
alien_hunter
Module
You can load the modules by:
module load biocontainers
module load alien_hunter
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Alien_hunter on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=alien_hunter
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alien_hunter
alien_hunter genome.fasta output
Alignstats
Introduction
AlignStats produces various alignment, whole genome coverage, and capture coverage metrics for sequence alignment files in SAM, BAM, and CRAM format.
Versions
0.9.1
Commands
alignstats
Module
You can load the modules by:
module load biocontainers
module load alignstats
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run alignstats on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=alignstats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alignstats
alignstats -C -i input.bam -o report.txt
Allpathslg
Introduction
Allpathslg is a whole-genome shotgun assembler that can generate high-quality genome assemblies using short reads.
Versions
52488
Commands
PrepareAllPathsInputs.pl
RunAllPathsLG
CacheLibs.pl
Fasta2Fastb
Module
You can load the modules by:
module load biocontainers
module load allpathslg
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Allpathslg on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=allpathslg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers allpathslg
PrepareAllPathsInputs.pl \
DATA_DIR=data \
PLOIDY=1 \
IN_GROUPS_CSV=in_groups.csv \
IN_LIBS_CSV=in_libs.csv \
OVERWRITE=True
RunAllPathsLG PRE=allpathlg REFERENCE_NAME=test.genome \
DATA_SUBDIR=data RUN=myrun TARGETS=standard \
SUBDIR=test OVERWRITE=True
Alphafold
Introduction
Alphafold is a protein structure prediction tool developed by DeepMind (Google). It uses a novel machine learning approach to predict 3D protein structures from primary sequences alone. The source code is available on GitHub. It has been deployed on all RCAC clusters, supporting both CPU and GPU.
It also relies on a huge database. The full database (~2.2TB) has been downloaded and set up for users.
Protein structure prediction by alphafold is performed in the following steps:
Search the amino acid sequence in uniref90 database by jackhmmer (using CPU)
Search the amino acid sequence in mgnify database by jackhmmer (using CPU)
Search the amino acid sequence in pdb70 database (for monomers) or pdb_seqres database (for multimers) by hhsearch (using CPU)
Search the amino acid sequence in bfd database and uniclust30 (updated to uniref30 since v2.3.0) database by hhblits (using CPU)
Search structure templates in pdb_mmcif database (using CPU)
Search the amino acid sequence in uniprot database (for multimers) by jackhmmer (using CPU)
Predict 3D structure by machine learning (using CPU or GPU)
Structure optimisation with OpenMM (using CPU or GPU)
Versions
2.1.1
2.2.0
2.2.3
2.3.0
2.3.1
Commands
run_alphafold.sh
Module
You can load the modules by:
module load biocontainers
module load alphafold
Usage
Using Alphafold on our clusters is straightforward. Users can create a flagfile containing the database path information and pass it to run_alphafold.sh:
run_alphafold.sh --flagfile=full_db.ff --fasta_paths=XX --output_dir=XX ...
Users can check the detailed user guide on its GitHub page.
full_db.ff
Example contents of full_db.ff:
--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db/
--uniref90_database_path=/depot/itap/datasets/alphafold/db/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db/mgnify/mgy_clusters_2018_12.fa
--uniclust30_database_path=/depot/itap/datasets/alphafold/db/uniclust30/uniclust30_2018_08/uniclust30_2018_08
--pdb70_database_path=/depot/itap/datasets/alphafold/db/pdb70/pdb70
--template_mmcif_dir=/depot/itap/datasets/alphafold/db/pdb_mmcif/mmcif_files
--max_template_date=2022-01-29
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
Note
Since version v2.2.0, the AlphaFold-Multimer model parameters have been updated. The updated full database is stored in /depot/itap/datasets/alphafold/db_20221014. For ACCESS Anvil, the database is stored in /anvil/datasets/alphafold/db_20221014. Users need to update the flagfile to use the updated database:
run_alphafold.sh --flagfile=full_db_20221014.ff --fasta_paths=XX --output_dir=XX ...
full_db_20221014.ff (for alphafold v2.2)
Example contents of full_db_20221014.ff (For ACCESS Anvil, please change /depot/itap to /anvil):
--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db_20221014/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db_20221014/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20221014/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db_20221014/mgnify/mgy_clusters_2018_12.fa
--uniclust30_database_path=/depot/itap/datasets/alphafold/db_20221014/uniclust30/uniclust30_2018_08/uniclust30_2018_08
--pdb_seqres_database_path=/depot/itap/datasets/alphafold/db_20221014/pdb_seqres/pdb_seqres.txt
--uniprot_database_path=/depot/itap/datasets/alphafold/db_20221014/uniprot/uniprot.fasta
--template_mmcif_dir=/depot/itap/datasets/alphafold/db_20221014/pdb_mmcif/mmcif_files
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db_20221014/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
Note
Since version v2.3.0, the AlphaFold-Multimer model parameters have been updated. The updated full database is stored in /depot/itap/datasets/alphafold/db_20230311. For ACCESS Anvil, the database is stored in /anvil/datasets/alphafold/db_20230311. Users need to update the flagfile to use the updated database:
run_alphafold.sh --flagfile=full_db_20230311.ff --fasta_paths=XX --output_dir=XX ...
Note
Since version v2.3.0, uniclust30_database_path has been changed to uniref30_database_path.
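The flag rename noted above can be sketched as a small version switch. This is only an illustration: the helper name cluster_db_flag is hypothetical, GNU sort with the -V option is assumed, and the database paths are left out on purpose.

```shell
#!/bin/bash
# Hypothetical helper: print the flag name used for the clustered
# sequence database, given an AlphaFold module version.
cluster_db_flag() {
    local oldest
    oldest=$(printf '%s\n%s\n' "$1" 2.3.0 | sort -V | head -n 1)
    if [ "$oldest" = 2.3.0 ]; then
        echo uniref30_database_path      # version is 2.3.0 or later
    else
        echo uniclust30_database_path    # version is older than 2.3.0
    fi
}

# Show the flag name for one version on each side of the change.
for version in 2.2.3 2.3.1; do
    echo "alphafold/$version uses --$(cluster_db_flag "$version")"
done
```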
full_db_20230311.ff (for alphafold v2.3)
Example contents of full_db_20230311.ff for monomer (For ACCESS Anvil, please change /depot/itap to /anvil):
--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db_20230311/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db_20230311/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db_20230311/mgnify/mgy_clusters_2022_05.fa
--uniref30_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref30/UniRef30_2021_03
--pdb70_database_path=/depot/itap/datasets/alphafold/db_20230311/pdb70/pdb70
--template_mmcif_dir=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/mmcif_files
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
Example contents of full_db_20230311.ff for multimer (For ACCESS Anvil, please change /depot/itap to /anvil):
--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db_20230311/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db_20230311/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db_20230311/mgnify/mgy_clusters_2022_05.fa
--uniref30_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref30/UniRef30_2021_03
--pdb_seqres_database_path=/depot/itap/datasets/alphafold/db_20230311/pdb_seqres/pdb_seqres.txt
--uniprot_database_path=/depot/itap/datasets/alphafold/db_20230311/uniprot/uniprot.fasta
--template_mmcif_dir=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/mmcif_files
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
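For ACCESS Anvil, the path change described above can be applied with sed instead of editing each line by hand. The following is a minimal sketch; the helper name to_anvil_flagfile is hypothetical, and the demonstration uses a two-line excerpt of the flagfile rather than the full file.

```shell
#!/bin/bash
# Hypothetical helper: rewrite database paths in a flagfile for ACCESS Anvil
# by replacing the /depot/itap prefix with /anvil.
to_anvil_flagfile() {
    sed 's#=/depot/itap/#=/anvil/#' "$1"
}

# Demonstration on a two-line excerpt of the flagfile.
demo=$(mktemp)
cat > "$demo" <<'EOF'
--data_dir=/depot/itap/datasets/alphafold/db_20230311/
--hhblits_binary_path=/usr/bin/hhblits
EOF
to_anvil_flagfile "$demo"
rm -f "$demo"
```

Lines that do not start their value with /depot/itap/, such as the binary paths, pass through unchanged. Redirect the output to a new file of your choice to keep the original flagfile intact.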
Example job using CPU
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
Note
Notice that since version 2.2.0, the parameter --use_gpu_relax=False is required.
To run alphafold using CPU:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=alphafold
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alphafold/2.3.1
run_alphafold.sh --flagfile=full_db_20230311.ff \
--fasta_paths=sample.fasta --max_template_date=2022-02-01 \
--output_dir=af2_full_out --model_preset=monomer \
--use_gpu_relax=False
Example job using GPU
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
Note
Notice that since version 2.2.0, the parameter --use_gpu_relax=True is required.
To run alphafold using GPU:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 11
#SBATCH --gres=gpu:1
#SBATCH --job-name=alphafold
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alphafold/2.3.1
run_alphafold.sh --flagfile=full_db_20230311.ff \
--fasta_paths=sample.fasta --max_template_date=2022-02-01 \
--output_dir=af2_full_out --model_preset=monomer \
--use_gpu_relax=True
Amptk
Introduction
Amptk is a series of scripts to process NGS amplicon data using USEARCH and VSEARCH. It can also be used to process any NGS amplicon data and includes databases set up for analysis of fungal ITS, fungal LSU, bacterial 16S, and insect COI amplicons.
Versions
1.5.4
Commands
amptk
Module
You can load the modules by:
module load biocontainers
module load amptk
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Amptk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=amptk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers amptk
amptk illumina -i test_data/illumina_test_data -o miseq -f fITS7 -r ITS4 --cpus 4
Ananse
Introduction
ANANSE is a computational approach to infer enhancer-based gene regulatory networks (GRNs) and to identify key transcription factors between two GRNs.
Versions
0.4.0
Commands
ananse
Module
You can load the modules by:
module load biocontainers
module load ananse
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run ananse on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ananse
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ananse
mkdir -p ANANSE.REMAP.model.v1.0
wget https://zenodo.org/record/4768075/files/ANANSE.REMAP.model.v1.0.tgz
tar xvzf ANANSE.REMAP.model.v1.0.tgz -C ANANSE.REMAP.model.v1.0
rm ANANSE.REMAP.model.v1.0.tgz
wget https://zenodo.org/record/4769814/files/ANANSE_example_data.tgz
tar xvzf ANANSE_example_data.tgz
rm ANANSE_example_data.tgz
ananse binding -H ANANSE_example_data/H3K27ac/fibroblast*bam -A ANANSE_example_data/ATAC/fibroblast*bam -R ANANSE.REMAP.model.v1.0/ -o fibroblast.binding
ananse binding -H ANANSE_example_data/H3K27ac/heart*bam -A ANANSE_example_data/ATAC/heart*bam -R ANANSE.REMAP.model.v1.0/ -o heart.binding
ananse network -b fibroblast.binding/binding.h5 -e ANANSE_example_data/RNAseq/fibroblast*TPM.txt -n 4 -o fibroblast.network.txt
ananse network -b heart.binding/binding.h5 -e ANANSE_example_data/RNAseq/heart*TPM.txt -n 4 -o heart.network.txt
ananse influence -s fibroblast.network.txt -t heart.network.txt -d ANANSE_example_data/RNAseq/fibroblast2heart_degenes.csv -p -o fibroblast2heart.influence.txt
Anchorwave
Introduction
Anchorwave
is used for sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism and whole-genome duplication variation.
Versions
1.0.1
Commands
anchorwave
gmap_build
gmap
minimap2
Module
You can load the modules by:
module load biocontainers
module load anchorwave
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Anchorwave on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=anchorwave
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers anchorwave
anchorwave gff2seq -i Zea_mays.AGPv4.34.gff3 -r Zea_mays.AGPv4.dna.toplevel.fa -o cds.fa
ANGSD
Introduction
ANGSD
is software for analyzing next-generation sequencing data. Detailed usage can be found here: http://www.popgen.dk/angsd/index.php/ANGSD.
Versions
0.935
0.937
0.939
0.940
Commands
angsd
realSFS
msToGlf
thetaStat
supersim
Module
You can load the modules by:
module load biocontainers
module load angsd/0.937
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run angsd on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=angsd
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers angsd/0.937
angsd -b bam.filelist -GL 1 -doMajorMinor 1 -doMaf 2 -P 5 -minMapQ 30 -minQ 20 -minMaf 0.05
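The `-b bam.filelist` argument above expects a text file that lists BAM paths, one per line. The following is a minimal sketch for generating that file. `make_bam_filelist` is a hypothetical helper, and the directory holding the BAM files is an assumption supplied by the caller.

```shell
# make_bam_filelist is an illustrative helper, not part of ANGSD.
# It writes the path of every *.bam file in a directory to a list file,
# one path per line, in the shell's sorted glob order.
make_bam_filelist() {
    local bam_dir="$1" out="$2" f
    for f in "$bam_dir"/*.bam; do
        # Skip the unexpanded pattern when the directory has no BAM files.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done > "$out"
}
```

A call such as `make_bam_filelist bams bam.filelist` (with `bams` standing in for your own directory) would be placed before the `angsd` line in the job script.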
Annogesic
Introduction
ANNOgesic is the Swiss Army knife for RNA-Seq based annotation of bacterial/archaeal genomes.
Versions
1.1.0
Commands
annogesic
Module
You can load the modules by:
module load biocontainers
module load annogesic
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run annogesic on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=annogesic
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers annogesic
ANNOGESIC_FOLDER=ANNOgesic
annogesic \
update_genome_fasta \
-c $ANNOGESIC_FOLDER/input/references/fasta_files/NC_009839.1.fa \
-m $ANNOGESIC_FOLDER/input/mutation_tables/mutation.csv \
-u NC_test.1 \
-pj $ANNOGESIC_FOLDER
ANNOVAR
Introduction
ANNOVAR
is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome hg18, hg19, hg38, as well as mouse, worm, fly, yeast and many others).
Versions
2022-01-13
Commands
annotate_variation.pl
coding_change.pl
convert2annovar.pl
retrieve_seq_from_fasta.pl
table_annovar.pl
variants_reduction.pl
Module
You can load the modules by:
module load biocontainers
module load annovar
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ANNOVAR on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=annovar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers annovar
annotate_variation.pl --buildver hg19 --downdb seq humandb/hg19_seq
convert2annovar.pl -format region -seqdir humandb/hg19_seq/ chr1:2000001-2000003
Antismash
Introduction
Antismash
allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes.
Versions
5.1.2
6.0.1
Commands
antismash
Module
You can load the modules by:
module load biocontainers
module load antismash
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Antismash on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=antismash
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers antismash
antismash --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go --smcog-trees seq.gbk
Anvio
Introduction
Anvio
is an analysis and visualization platform for ‘omics data.
Versions
7.0
Commands
anvi-analyze-synteny
anvi-cluster-contigs
anvi-compute-ani
anvi-compute-completeness
anvi-compute-functional-enrichment
anvi-compute-gene-cluster-homogeneity
anvi-compute-genome-similarity
anvi-convert-trnaseq-database
anvi-db-info
anvi-delete-collection
anvi-delete-hmms
anvi-delete-misc-data
anvi-delete-state
anvi-dereplicate-genomes
anvi-display-contigs-stats
anvi-display-metabolism
anvi-display-pan
anvi-display-structure
anvi-estimate-genome-completeness
anvi-estimate-genome-taxonomy
anvi-estimate-metabolism
anvi-estimate-scg-taxonomy
anvi-estimate-trna-taxonomy
anvi-experimental-organization
anvi-export-collection
anvi-export-contigs
anvi-export-functions
anvi-export-gene-calls
anvi-export-gene-coverage-and-detection
anvi-export-items-order
anvi-export-locus
anvi-export-misc-data
anvi-export-splits-and-coverages
anvi-export-splits-taxonomy
anvi-export-state
anvi-export-structures
anvi-export-table
anvi-gen-contigs-database
anvi-gen-fixation-index-matrix
anvi-gen-gene-consensus-sequences
anvi-gen-gene-level-stats-databases
anvi-gen-genomes-storage
anvi-gen-network
anvi-gen-phylogenomic-tree
anvi-gen-structure-database
anvi-gen-variability-matrix
anvi-gen-variability-network
anvi-gen-variability-profile
anvi-get-aa-counts
anvi-get-codon-frequencies
anvi-get-enriched-functions-per-pan-group
anvi-get-sequences-for-gene-calls
anvi-get-sequences-for-gene-clusters
anvi-get-sequences-for-hmm-hits
anvi-get-short-reads-from-bam
anvi-get-short-reads-mapping-to-a-gene
anvi-get-split-coverages
anvi-help
anvi-import-collection
anvi-import-functions
anvi-import-items-order
anvi-import-misc-data
anvi-import-state
anvi-import-taxonomy-for-genes
anvi-import-taxonomy-for-layers
anvi-init-bam
anvi-inspect
anvi-interactive
anvi-matrix-to-newick
anvi-mcg-classifier
anvi-merge
anvi-merge-bins
anvi-meta-pan-genome
anvi-migrate
anvi-oligotype-linkmers
anvi-pan-genome
anvi-profile
anvi-push
anvi-refine
anvi-rename-bins
anvi-report-linkmers
anvi-run-hmms
anvi-run-interacdome
anvi-run-kegg-kofams
anvi-run-ncbi-cogs
anvi-run-pfams
anvi-run-scg-taxonomy
anvi-run-trna-taxonomy
anvi-run-workflow
anvi-scan-trnas
anvi-script-add-default-collection
anvi-script-augustus-output-to-external-gene-calls
anvi-script-calculate-pn-ps-ratio
anvi-script-checkm-tree-to-interactive
anvi-script-compute-ani-for-fasta
anvi-script-enrichment-stats
anvi-script-estimate-genome-size
anvi-script-filter-fasta-by-blast
anvi-script-fix-homopolymer-indels
anvi-script-gen-CPR-classifier
anvi-script-gen-distribution-of-genes-in-a-bin
anvi-script-gen-help-pages
anvi-script-gen-hmm-hits-matrix-across-genomes
anvi-script-gen-programs-network
anvi-script-gen-programs-vignette
anvi-script-gen-pseudo-paired-reads-from-fastq
anvi-script-gen-scg-domain-classifier
anvi-script-gen-short-reads
anvi-script-gen_stats_for_single_copy_genes.R
anvi-script-gen_stats_for_single_copy_genes.py
anvi-script-gen_stats_for_single_copy_genes.sh
anvi-script-get-collection-info
anvi-script-get-coverage-from-bam
anvi-script-get-hmm-hits-per-gene-call
anvi-script-get-primer-matches
anvi-script-merge-collections
anvi-script-pfam-accessions-to-hmms-directory
anvi-script-predict-CPR-genomes
anvi-script-process-genbank
anvi-script-process-genbank-metadata
anvi-script-reformat-fasta
anvi-script-run-eggnog-mapper
anvi-script-snvs-to-interactive
anvi-script-tabulate
anvi-script-transpose-matrix
anvi-script-variability-to-vcf
anvi-script-visualize-split-coverages
anvi-search-functions
anvi-self-test
anvi-setup-interacdome
anvi-setup-kegg-kofams
anvi-setup-ncbi-cogs
anvi-setup-pdb-database
anvi-setup-pfams
anvi-setup-scg-taxonomy
anvi-setup-trna-taxonomy
anvi-show-collections-and-bins
anvi-show-misc-data
anvi-split
anvi-summarize
anvi-trnaseq
anvi-update-db-description
anvi-update-structure-database
anvi-upgrade
Module
You can load the modules by:
module load biocontainers
module load anvio
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Anvio on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=anvio
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers anvio
anvi-script-reformat-fasta assembly.fa -o contigs.fa -l 1000 --simplify-names --seq-type NT
anvi-gen-contigs-database -f contigs.fa -o contigs.db -n 'An example contigs database' --num-threads 8
anvi-display-contigs-stats contigs.db
anvi-setup-ncbi-cogs --cog-data-dir $PWD --num-threads 8 --just-do-it --reset
anvi-run-ncbi-cogs -c contigs.db --cog-data-dir COG20 --num-threads 8
Any2fasta
Introduction
Any2fasta can convert various sequence formats to FASTA.
Versions
0.4.2
Commands
any2fasta
Module
You can load the modules by:
module load biocontainers
module load any2fasta
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run any2fasta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=any2fasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers any2fasta
any2fasta input.gff > out.fasta
Arcs
Introduction
ARCS is a tool for scaffolding genome sequence assemblies using linked or long read sequencing data.
Versions
1.2.4
Commands
arcs
arcs-make
Module
You can load the modules by:
module load biocontainers
module load arcs
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run arcs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=arcs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers arcs
Ariba
Introduction
ARIBA is a tool that identifies antibiotic resistance genes by running local assemblies. It can also be used for MLST calling.
Versions
2.14.6
Commands
ariba
Module
You can load the modules by:
module load biocontainers
module load ariba
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ariba on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ariba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ariba
Ascatngs
Introduction
AscatNGS contains the Cancer Genome Project's workflow implementation of the ASCAT copy number algorithm for paired-end sequencing.
Versions
4.5.0
Commands
alleleCounter.pl
ascatCnToVCF.pl
ascatCounts.pl
ascatFaiChunk.pl
ascatFailedCnCsv.pl
ascat.pl
ascatSnpPanelFromVcfs.pl
ascatSnpPanelGcCorrections.pl
ascatSnpPanelGenerator.pl
ascatSnpPanelMerge.pl
ascatToBigWig.pl
bamToBw.pl
blast2sam.pl
bowtie2sam.pl
bwa_aln.pl
bwa_mem.pl
cgpAppendIdsToVcf.pl
cgpVCFSplit.pl
export2sam.pl
interpolate_sam.pl
merge_or_mark.pl
novo2sam.pl
pkg-config.pl
psl2sam.pl
sam2vcf.pl
samtools.pl
seq_cache_populate.pl
soap2sam.pl
stag-autoschema.pl
stag-db.pl
stag-diff.pl
stag-drawtree.pl
stag-filter.pl
stag-findsubtree.pl
stag-flatten.pl
stag-grep.pl
stag-handle.pl
stag-itext2simple.pl
stag-itext2sxpr.pl
stag-itext2xml.pl
stag-join.pl
stag-merge.pl
stag-mogrify.pl
stag-parse.pl
stag-query.pl
stag-splitter.pl
stag-view.pl
stag-xml2itext.pl
wgsim_eval.pl
xam_coverage_bins.pl
zoom2sam.pl
Module
You can load the modules by:
module load biocontainers
module load ascatngs
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ascatngs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ascatngs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ascatngs
ASGAL
Introduction
ASGAL
(Alternative Splicing Graph ALigner) is a tool for detecting the alternative splicing events expressed in an RNA-Seq sample with respect to a gene annotation.
Versions
1.1.7
Commands
asgal
Module
You can load the modules by:
module load biocontainers
module load asgal
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ASGAL on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=asgal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers asgal
asgal -g input/genome.fa \
-a input/annotation.gtf \
-s input/sample_1.fa -o outputFolder
Aspera-connect
Introduction
Aspera Connect is software for downloading and uploading data. It includes a command line tool (ascp) that allows scripted data transfer.
Versions
4.2.6
Commands
ascp
ascp4
asperaconnect
asperaconnect.bin
asperaconnect-nmh
asperacrypt
asunprotect
Module
You can load the modules by:
module load biocontainers
module load aspera-connect
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run aspera-connect on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=aspera-connect
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers aspera-connect
Assembly-stats
Introduction
Assembly-stats
is a tool to get assembly statistics from FASTA and FASTQ files.
Versions
1.0.1
Commands
assembly-stats
Module
You can load the modules by:
module load biocontainers
module load assembly-stats
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Assembly-stats on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=assembly-stats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers assembly-stats
assembly-stats seq.fasta
Atac-seq-pipeline
Introduction
The ENCODE ATAC-seq pipeline is used for quality control and statistical signal processing of short-read sequencing data, producing alignments and measures of enrichment. It was developed by Anshul Kundaje’s lab at Stanford University.
Versions
2.1.3
Commands
10x_bam2fastq
SAMstats
SAMstatsParallel
ace2sam
aggregate_scores_in_intervals.py
align_print_template.py
alignmentSieve
annotate.py
annotateBed
axt_extract_ranges.py
axt_to_fasta.py
axt_to_lav.py
axt_to_maf.py
bamCompare
bamCoverage
bamPEFragmentSize
bamToBed
bamToFastq
bed12ToBed6
bedToBam
bedToIgv
bed_bigwig_profile.py
bed_build_windows.py
bed_complement.py
bed_count_by_interval.py
bed_count_overlapping.py
bed_coverage.py
bed_coverage_by_interval.py
bed_diff_basewise_summary.py
bed_extend_to.py
bed_intersect.py
bed_intersect_basewise.py
bed_merge_overlapping.py
bed_rand_intersect.py
bed_subtract_basewise.py
bedpeToBam
bedtools
bigwigCompare
blast2sam.pl
bnMapper.py
bowtie2sam.pl
bwa
chardetect
closestBed
clusterBed
complementBed
compress
computeGCBias
computeMatrix
computeMatrixOperations
correctGCBias
coverageBed
createDiff
cutadapt
cygdb
cython
cythonize
deeptools
div_snp_table_chr.py
download_metaseq_example_data.py
estimateReadFiltering
estimateScaleFactor
expandCols
export2sam.pl
faidx
fastaFromBed
find_in_sorted_file.py
flankBed
gene_fourfold_sites.py
genomeCoverageBed
getOverlap
getSeq_genome_wN
getSeq_genome_woN
get_objgraph
get_scores_in_intervals.py
gffutils-cli
groupBy
gsl-config
gsl-histogram
gsl-randist
idr
int_seqs_to_char_strings.py
interpolate_sam.pl
intersectBed
intersection_matrix.py
interval_count_intersections.py
interval_join.py
intron_exon_reads.py
jsondiff
lav_to_axt.py
lav_to_maf.py
line_select.py
linksBed
lzop_build_offset_table.py
mMK_bitset.py
macs2
maf_build_index.py
maf_chop.py
maf_chunk.py
maf_col_counts.py
maf_col_counts_all.py
maf_count.py
maf_covered_ranges.py
maf_covered_regions.py
maf_div_sites.py
maf_drop_overlapping.py
maf_extract_chrom_ranges.py
maf_extract_ranges.py
maf_extract_ranges_indexed.py
maf_filter.py
maf_filter_max_wc.py
maf_gap_frequency.py
maf_gc_content.py
maf_interval_alignibility.py
maf_limit_to_species.py
maf_mapping_word_frequency.py
maf_mask_cpg.py
maf_mean_length_ungapped_piece.py
maf_percent_columns_matching.py
maf_percent_identity.py
maf_print_chroms.py
maf_print_scores.py
maf_randomize.py
maf_region_coverage_by_src.py
maf_select.py
maf_shuffle_columns.py
maf_species_in_all_files.py
maf_split_by_src.py
maf_thread_for_species.py
maf_tile.py
maf_tile_2.py
maf_tile_2bit.py
maf_to_axt.py
maf_to_concat_fasta.py
maf_to_fasta.py
maf_to_int_seqs.py
maf_translate_chars.py
maf_truncate.py
maf_word_frequency.py
makeBAM.sh
makeDiff.sh
makeFastq.sh
make_unique
makepBAM_genome.sh
makepBAM_transcriptome.sh
mapBed
maq2sam-long
maq2sam-short
maskFastaFromBed
mask_quality.py
mergeBed
metaseq-cli
multiBamCov
multiBamSummary
multiBigwigSummary
multiIntersectBed
nib_chrom_intervals_to_fasta.py
nib_intervals_to_fasta.py
nib_length.py
novo2sam.pl
nucBed
one_field_per_line.py
out_to_chain.py
pairToBed
pairToPair
pbam2bam
pbam_mapped_transcriptome
pbt_plotting_example.py
peak_pie.py
plot-bamstats
plotCorrelation
plotCoverage
plotEnrichment
plotFingerprint
plotHeatmap
plotPCA
plotProfile
prefix_lines.py
pretty_table.py
print_unique
psl2sam.pl
py.test
pybabel
pybedtools
pygmentize
pytest
python-argcomplete-check-easy-install-script
python-argcomplete-tcsh
qv_to_bqv.py
randomBed
random_lines.py
register-python-argcomplete
sam2vcf.pl
samtools
samtools.pl
seq_cache_populate.pl
shiftBed
shuffleBed
slopBed
soap2sam.pl
sortBed
speedtest.py
subtractBed
table_add_column.py
table_filter.py
tagBam
tfloc_summary.py
ucsc_gene_table_to_intervals.py
undill
unionBedGraphs
varfilter.py
venn_gchart.py
venn_mpl.py
wgsim
wgsim_eval.pl
wiggle_to_array_tree.py
wiggle_to_binned_array.py
wiggle_to_chr_binned_array.py
wiggle_to_simple.py
windowBed
windowMaker
zoom2sam.pl
Module
You can load the modules by:
module load biocontainers
module load atac-seq-pipeline
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run atac-seq-pipeline on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=atac-seq-pipeline
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers atac-seq-pipeline
Ataqv
Introduction
Ataqv
is a toolkit for measuring and comparing ATAC-seq results, made in the Parker lab at the University of Michigan.
Versions
1.3.0
Commands
ataqv
Module
You can load the modules by:
module load biocontainers
module load ataqv
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ataqv on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ataqv
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ataqv
ataqv --peak-file sample_1_peaks.broadPeak \
--name sample_1 --metrics-file sample_1.ataqv.json.gz \
--excluded-region-file hg19.blacklist.bed.gz \
--tss-file hg19.tss.refseq.bed.gz \
--ignore-read-groups human sample_1.md.bam \
> sample_1.ataqv.out
ataqv --peak-file sample_2_peaks.broadPeak \
--name sample_2 --metrics-file sample_2.ataqv.json.gz \
--excluded-region-file hg19.blacklist.bed.gz \
--tss-file hg19.tss.refseq.bed.gz \
--ignore-read-groups human sample_2.md.bam \
> sample_2.ataqv.out
ataqv --peak-file sample_3_peaks.broadPeak \
--name sample_3 --metrics-file sample_3.ataqv.json.gz \
--excluded-region-file hg19.blacklist.bed.gz \
--tss-file hg19.tss.refseq.bed.gz \
--ignore-read-groups human sample_3.md.bam \
> sample_3.ataqv.out
mkarv my_fantastic_experiment sample_1.ataqv.json.gz sample_2.ataqv.json.gz sample_3.ataqv.json.gz
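The three `ataqv` calls above differ only in the sample name, so they can be written as a loop. This is a minimal sketch that uses the same flags as the job above. `run_ataqv_all` and the `DRY_RUN` switch are hypothetical names introduced for illustration; by default the commands are only printed.

```shell
# run_ataqv_all is an illustrative wrapper around the ataqv calls above.
# DRY_RUN is a hypothetical switch: with DRY_RUN=1 (the default) each
# command is printed; with DRY_RUN=0 it is executed inside the job.
run_ataqv_all() {
    local s
    for s in sample_1 sample_2 sample_3; do
        # Build the per-sample argument list once.
        set -- --peak-file "${s}_peaks.broadPeak" \
            --name "$s" --metrics-file "${s}.ataqv.json.gz" \
            --excluded-region-file hg19.blacklist.bed.gz \
            --tss-file hg19.tss.refseq.bed.gz \
            --ignore-read-groups human "${s}.md.bam"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ataqv $* > ${s}.ataqv.out"
        else
            ataqv "$@" > "${s}.ataqv.out"
        fi
    done
}
```

Calling `DRY_RUN=0 run_ataqv_all` in the job script would run all three samples; the `mkarv` step stays unchanged after the loop.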
aTRAM
Introduction
aTRAM
(automated target restricted assembly method) is an iterative assembler that performs reference-guided local de novo assemblies using a variety of available methods.
Detailed usage can be found here: https://bioinformaticshome.com/tools/wga/descriptions/aTRAM.html
Versions
2.4.3
Commands
atram.py
atram_preprocessor.py
atram_stitcher.py
Module
You can load the modules by:
module load biocontainers
module load atram/2.4.3
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run aTRAM on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=atram
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers atram/2.4.3
atram_preprocessor.py --blast-db=atram_db \
--end-1=data/tutorial_end_1.fasta.gz \
--end-2=data/tutorial_end_2.fasta.gz \
--gzip
atram.py --query=tutorial-query.pep.fasta \
--blast-db=atram_db \
--output=output \
--assembler=velvet
Atropos
Introduction
Atropos
is a tool for specific, sensitive, and speedy trimming of NGS reads.
Versions
1.1.17
1.1.31
Commands
atropos
Module
You can load the modules by:
module load biocontainers
module load atropos
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Atropos on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=atropos
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers atropos
atropos --threads 4 \
-a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTTA \
-o trimmed1.fq.gz -p trimmed2.fq.gz \
-pe1 SRR13176582_1.fastq -pe2 SRR13176582_2.fastq
Augur
Introduction
Augur
is the bioinformatics toolkit used by Nextstrain to track evolution from sequence and serological data.
Versions
14.0.0
15.0.0
Commands
augur
Module
You can load the modules by:
module load biocontainers
module load augur
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Augur on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=augur
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers augur
mkdir -p results
augur index --sequences zika-tutorial/data/sequences.fasta \
--output results/sequence_index.tsv
augur filter --sequences zika-tutorial/data/sequences.fasta \
--sequence-index results/sequence_index.tsv \
--metadata zika-tutorial/data/metadata.tsv \
--exclude zika-tutorial/config/dropped_strains.txt \
--output results/filtered.fasta \
--group-by country year month \
--sequences-per-group 20 \
--min-date 2012
augur align --sequences results/filtered.fasta \
--reference-sequence zika-tutorial/config/zika_outgroup.gb \
--output results/aligned.fasta \
--fill-gaps
augur tree --alignment results/aligned.fasta \
--output results/tree_raw.nwk
augur refine --tree results/tree_raw.nwk \
--alignment results/aligned.fasta \
--metadata zika-tutorial/data/metadata.tsv \
--output-tree results/tree.nwk \
--output-node-data results/branch_lengths.json \
--timetree \
--coalescent opt \
--date-confidence \
--date-inference marginal \
--clock-filter-iqd 4
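The augur steps above read several files from `zika-tutorial/`, and a missing input only surfaces after the job has waited in the queue. The following is a minimal sketch of a pre-flight check that could run before the first augur command. `require_files` is a hypothetical helper, not part of augur or Slurm.

```shell
# require_files is a hypothetical helper: it prints each missing path to
# stderr and returns non-zero if any of the given inputs is absent.
require_files() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In the job script this could be used as `require_files zika-tutorial/data/sequences.fasta zika-tutorial/data/metadata.tsv || exit 1`, so the job stops immediately instead of failing partway through the pipeline.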
AUGUSTUS
Introduction
AUGUSTUS
is a program that predicts genes in eukaryotic genomic sequences.
Versions
3.4.0
3.5.0
Commands
aln2wig
augustus
bam2wig
bam2wig-dist
consensusFinder
curve2hints
etraining
fastBlockSearch
filterBam
getSeq
getSeq-dist
homGeneMapping
joingenes
prepareAlign
Module
You can load the modules by:
module load biocontainers
module load augustus/3.4.0
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run AUGUSTUS on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=AUGUSTUS
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers augustus/3.4.0
augustus --species=botrytis_cinerea genome.fasta > annotation.gff
Bactopia
Introduction
Bactopia is a flexible pipeline for complete analysis of bacterial genomes. The goal of Bactopia is to process your data with a broad set of tools, so that you can get to the fun part of analyses quicker!
Versions
2.0.3
2.1.1
2.2.0
3.0.0
Commands
bactopia
Module
You can load the modules by:
module load biocontainers
module load bactopia
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bactopia on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=bactopia
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bactopia
bactopia datasets \
--ariba "vfdb_core,card" \
--species "Staphylococcus aureus" \
--include_genus \
--limit 100 \
--cpus 12
bactopia --accession SRX4563634 \
--datasets datasets/ \
--species "Staphylococcus aureus" \
--coverage 100 \
--genome_size median \
--outdir ena-single-sample \
--max_cpus 12
Bali-phy
Introduction
Bali-phy is a tool for Bayesian co-estimation of phylogenies and multiple alignments via MCMC.
Versions
3.6.0
Commands
bali-phy
Module
You can load the modules by:
module load biocontainers
module load bali-phy
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bali-phy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bali-phy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bali-phy
bali-phy examples/sequences/ITS/ITS1.fasta examples/sequences/ITS/5.8S.fasta examples/sequences/ITS/ITS2.fasta --test
bali-phy examples/sequences/5S-rRNA/5d-clustalw.fasta -S 'gtr+Rates.gamma[4]+inv' -n 5d-free
Bamgineer
Introduction
Bamgineer
is a tool that can be used to introduce user-defined haplotype-phased allele-specific copy number variations (CNV) into an existing Binary Alignment Mapping (BAM) file with demonstrated applicability to simulate somatic cancer CNVs in phased whole-genome sequencing datasets.
Versions
1.1
Commands
simulate.py
Module
You can load the modules by:
module load biocontainers
module load bamgineer
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bamgineer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamgineer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamgineer
simulate.py -config inputs/config.cfg \
-splitbamdir splitbams \
-cnv_bed inputs/cnv.bed \
-vcf inputs/normal_het.vcf \
-exons inputs/exons.bed \
-outbam tumour.bam \
-results outputs \
-cancertype LUAC1
Bamliquidator
Introduction
Bamliquidator
is a set of tools for analyzing the density of short DNA sequence read alignments in the BAM file format.
Versions
1.5.2
Commands
bamliquidator
bamliquidator_bins
bamliquidator_regions
bamliquidatorbatch
Module
You can load the modules by:
module load biocontainers
module load bamliquidator
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bamliquidator on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamliquidator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamliquidator
Bam-readcount
Introduction
Bam-readcount
is a utility that runs on a BAM or CRAM file and generates low-level information about sequencing data at specific nucleotide positions.
Versions
1.0.0
Commands
bam-readcount
Module
You can load the modules by:
module load biocontainers
module load bam-readcount
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bam-readcount on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bam-readcount
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bam-readcount
bam-readcount -f Homo_sapiens.GRCh38.dna.primary_assembly.fa Aligned.sortedByCoord.out.bam
Bamsurgeon
Introduction
Bamsurgeon
is a set of tools for adding mutations to .bam files, used for testing mutation callers.
Versions
1.2
Commands
addindel.py
addsnv.py
addsv.py
Module
You can load the modules by:
module load biocontainers
module load bamsurgeon
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bamsurgeon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamsurgeon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamsurgeon
addsv.py -p 1 -v test_sv.txt -f testregion_realign.bam \
-r reference.fasta -o testregion_sv_mut.bam \
--aligner mem --keepsecondary --seed 1234 \
--inslib test_inslib.fa
BamTools
Introduction
BamTools
is a programmer API and an end-user toolkit for handling BAM files. This container provides a toolkit-only version (no API to build against).
Versions
2.5.1
Commands
bamtools
Module
You can load the modules by:
module load biocontainers
module load bamtools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BamTools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamtools
bamtools convert -format fastq -in in.bam -out out.fastq
Bamutil
Introduction
Bamutil
is a collection of programs for working on SAM/BAM files.
Versions
1.0.15
Commands
bam
Module
You can load the modules by:
module load biocontainers
module load bamutil
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bamutil on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamutil
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamutil
bam validate --params --in test/testFiles/testInvalid.sam --refFile test/testFilesLibBam/chr1_partial.fa --v --noph 2> results/validateInvalid.txt
bam convert --params --in test/testFiles/testFilter.bam --out results/convertBam.sam --noph 2> results/convertBam.log
bam splitChromosome --in test/testFiles/sortedBam1.bam --out results/splitSortedBam --noph 2> results/splitChromosome.txt
bam stats --basic --in test/testFiles/testFilter.sam --noph 2> results/basicStats.txt
bam gapInfo --in test/testFiles/testGapInfo.sam --out results/gapInfo.txt --noph 2> results/gapInfo.log
bam findCigars --in test/testFiles/testRevert.sam --out results/cigarNonM.sam --nonM --noph 2> results/cigarNonM.log
Barrnap
Introduction
Barrnap
: BAsic Rapid Ribosomal RNA Predictor.
Versions
0.9.4
Commands
barrnap
Module
You can load the modules by:
module load biocontainers
module load barrnap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Barrnap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=barrnap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers barrnap
barrnap --kingdom bac -o bac_16s.fasta < bac_genome.fasta > bac_16s.gff3
barrnap --kingdom euk -o euk_16s.fasta < euk_genome.fasta > euk_16s.gff3
Basenji
Introduction
Basenji
is a tool for sequential regulatory activity predictions with deep convolutional neural networks.
Versions
0.5.1
Commands
akita_data.py
akita_data_read.py
akita_data_write.py
akita_predict.py
akita_sat_plot.py
akita_sat_vcf.py
akita_scd.py
akita_scd_multi.py
akita_test.py
akita_train.py
bam_cov.py
basenji_annot_chr.py
basenji_bench_classify.py
basenji_bench_gtex.py
basenji_bench_gtex_cmp.py
basenji_bench_phylop.py
basenji_bench_phylop_folds.py
basenji_cmp.py
basenji_data.py
basenji_data2.py
basenji_data_align.py
basenji_data_gene.py
basenji_data_hic_read.py
basenji_data_hic_write.py
basenji_data_read.py
basenji_data_write.py
basenji_fetch_app.py
basenji_fetch_app1.py
basenji_fetch_app2.py
basenji_fetch_norm.py
basenji_fetch_vcf.py
basenji_gtex_folds.py
basenji_hdf5_genes.py
basenji_hidden.py
basenji_map.py
basenji_map_genes.py
basenji_map_seqs.py
basenji_motifs.py
basenji_motifs_denovo.py
basenji_norm_h5.py
basenji_predict.py
basenji_predict_bed.py
basenji_predict_bed_multi.py
basenji_sad.py
basenji_sad_multi.py
basenji_sad_norm.py
basenji_sad_ref.py
basenji_sad_ref_multi.py
basenji_sad_table.py
basenji_sat_bed.py
basenji_sat_bed_multi.py
basenji_sat_folds.py
basenji_sat_plot.py
basenji_sat_plot2.py
basenji_sat_vcf.py
basenji_sed.py
basenji_sed_multi.py
basenji_sedg.py
basenji_test.py
basenji_test_folds.py
basenji_test_genes.py
basenji_test_reps.py
basenji_test_specificity.py
basenji_train.py
basenji_train1.py
basenji_train2.py
basenji_train_folds.py
basenji_train_hic.py
basenji_train_reps.py
save_model.py
sonnet_predict_bed.py
sonnet_sad.py
sonnet_sad_multi.py
sonnet_sat_bed.py
sonnet_sat_vcf.py
tfr_bw.py
tfr_hdf5.py
tfr_qc.py
upgrade_tf1.py
Module
You can load the modules by:
module load biocontainers
module load basenji
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Basenji on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=basenji
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers basenji
Bayescan
Introduction
BayeScan aims at identifying candidate loci under natural selection from genetic data, using differences in allele frequencies between populations.
Versions
2.1
Commands
bayescan
Module
You can load the modules by:
module load biocontainers
module load bayescan
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bayescan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bayescan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bayescan
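The job script above loads the module but does not call bayescan. A minimal sketch of a run follows. It assumes a BayeScan-format genotype file named `input.txt` (hypothetical) and that the `bayescan` command in the container accepts the `-od`, `-o`, and `-threads` options described in the BayeScan 2.1 manual; verify them against the tool's own help output before relying on them.

```shell
# Hypothetical input and output names; adjust to your data.
input="input.txt"
outdir="bayescan_out"

if [ ! -f "$input" ]; then
    echo "input not found: $input"
elif ! command -v bayescan >/dev/null 2>&1; then
    echo "bayescan not found; load the module first"
else
    mkdir -p "$outdir"
    # -threads 1 matches the single task (#SBATCH -n 1) requested above.
    bayescan "$input" -od "$outdir" -o run1 -threads 1
fi
```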
Bazam
Introduction
Bazam is a tool to extract paired reads in FASTQ format from coordinate-sorted BAM files. For more information, see the Docker Hub page (https://hub.docker.com/r/dockanomics/bazam) and the home page (https://github.com/ssadedin/bazam).
Versions
1.0.1
Commands
bazam
Module
You can load the modules by:
module load biocontainers
module load bazam
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bazam on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bazam
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bazam
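The job script above does not include a bazam command. As a hedged sketch based on the basic usage in the Bazam README (`-bam` with FASTQ written to standard output), the lines below extract interleaved reads from a BAM file. The function name `extract_fastq` and the file names are hypothetical.

```shell
# Hypothetical wrapper: extract interleaved FASTQ reads from a BAM file.
extract_fastq() {
    bam="$1"
    out="$2"
    if [ ! -f "$bam" ]; then
        echo "input not found: $bam"
        return 1
    fi
    if ! command -v bazam >/dev/null 2>&1; then
        echo "bazam not found; load the module first"
        return 1
    fi
    # The BAM file must be coordinate sorted, as noted in the Introduction.
    bazam -bam "$bam" > "$out"
}

extract_fastq sample.sorted.bam sample.interleaved.fastq || true
```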
Bbmap
Introduction
Bbmap
is a short-read aligner that ships with a collection of other bioinformatics tools.
Versions
38.93
38.96
Commands
addadapters.sh
a_sample_mt.sh
bbcountunique.sh
bbduk.sh
bbest.sh
bbfakereads.sh
bbmap.sh
bbmapskimmer.sh
bbmask.sh
bbmerge-auto.sh
bbmergegapped.sh
bbmerge.sh
bbnorm.sh
bbqc.sh
bbrealign.sh
bbrename.sh
bbsketch.sh
bbsplitpairs.sh
bbsplit.sh
bbstats.sh
bbversion.sh
bbwrap.sh
calcmem.sh
calctruequality.sh
callpeaks.sh
callvariants2.sh
callvariants.sh
clumpify.sh
commonkmers.sh
comparesketch.sh
comparevcf.sh
consect.sh
countbarcodes.sh
countgc.sh
countsharedlines.sh
crossblock.sh
crosscontaminate.sh
cutprimers.sh
decontaminate.sh
dedupe2.sh
dedupebymapping.sh
dedupe.sh
demuxbyname.sh
diskbench.sh
estherfilter.sh
explodetree.sh
filterassemblysummary.sh
filterbarcodes.sh
filterbycoverage.sh
filterbyname.sh
filterbysequence.sh
filterbytaxa.sh
filterbytile.sh
filterlines.sh
filtersam.sh
filtersubs.sh
filtervcf.sh
fungalrelease.sh
fuse.sh
getreads.sh
gi2ancestors.sh
gi2taxid.sh
gitable.sh
grademerge.sh
gradesam.sh
idmatrix.sh
idtree.sh
invertkey.sh
kcompress.sh
khist.sh
kmercountexact.sh
kmercountmulti.sh
kmercoverage.sh
loadreads.sh
loglog.sh
makechimeras.sh
makecontaminatedgenomes.sh
makepolymers.sh
mapPacBio.sh
matrixtocolumns.sh
mergebarcodes.sh
mergeOTUs.sh
mergesam.sh
msa.sh
mutate.sh
muxbyname.sh
normandcorrectwrapper.sh
partition.sh
phylip2fasta.sh
pileup.sh
plotgc.sh
postfilter.sh
printtime.sh
processfrag.sh
processspeed.sh
randomreads.sh
readlength.sh
reducesilva.sh
reformat.sh
removebadbarcodes.sh
removecatdogmousehuman.sh
removehuman2.sh
removehuman.sh
removemicrobes.sh
removesmartbell.sh
renameimg.sh
rename.sh
repair.sh
replaceheaders.sh
representative.sh
rqcfilter.sh
samtoroc.sh
seal.sh
sendsketch.sh
shred.sh
shrinkaccession.sh
shuffle.sh
sketchblacklist.sh
sketch.sh
sortbyname.sh
splitbytaxa.sh
splitnextera.sh
splitsam4way.sh
splitsam6way.sh
splitsam.sh
stats.sh
statswrapper.sh
streamsam.sh
summarizecrossblock.sh
summarizemerge.sh
summarizequast.sh
summarizescafstats.sh
summarizeseal.sh
summarizesketch.sh
synthmda.sh
tadpipe.sh
tadpole.sh
tadwrapper.sh
taxonomy.sh
taxserver.sh
taxsize.sh
taxtree.sh
testfilesystem.sh
testformat2.sh
testformat.sh
tetramerfreq.sh
textfile.sh
translate6frames.sh
unicode2ascii.sh
webcheck.sh
Module
You can load the modules by:
module load biocontainers
module load bbmap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bbmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bbmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bbmap
stats.sh in=SRR11234553_1.fastq > stats_out.txt
statswrapper.sh *.fastq > statswrapper_out.txt
pileup.sh in=map1.sam out=pileup_out.txt
readlength.sh in=SRR11234553_1.fastq in2=SRR11234553_2.fastq > readlength_out.txt
kmercountexact.sh in=SRR11234553_1.fastq in2=SRR11234553_2.fastq out=kmer_test.out khist=kmer.khist peaks=kmer.peak
bbmask.sh in=SRR11234553_1.fastq out=test.mark sam=map1.sam
Bbtools
Introduction
BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data.
Versions
39.00
Commands
Xcalcmem.sh
a_sample_mt.sh
addadapters.sh
addssu.sh
adjusthomopolymers.sh
alltoall.sh
analyzeaccession.sh
analyzegenes.sh
analyzesketchresults.sh
applyvariants.sh
bbcms.sh
bbcountunique.sh
bbduk.sh
bbest.sh
bbfakereads.sh
bbmap.sh
bbmapskimmer.sh
bbmask.sh
bbmerge-auto.sh
bbmerge.sh
bbnorm.sh
bbrealign.sh
bbrename.sh
bbsketch.sh
bbsplit.sh
bbsplitpairs.sh
bbstats.sh
bbversion.sh
bbwrap.sh
bloomfilter.sh
calcmem.sh
calctruequality.sh
callgenes.sh
callpeaks.sh
callvariants.sh
callvariants2.sh
clumpify.sh
commonkmers.sh
comparegff.sh
comparesketch.sh
comparessu.sh
comparevcf.sh
consect.sh
consensus.sh
countbarcodes.sh
countgc.sh
countsharedlines.sh
crossblock.sh
crosscontaminate.sh
cutgff.sh
cutprimers.sh
decontaminate.sh
dedupe.sh
dedupe2.sh
dedupebymapping.sh
demuxbyname.sh
diskbench.sh
estherfilter.sh
explodetree.sh
fetchproks.sh
filterassemblysummary.sh
filterbarcodes.sh
filterbycoverage.sh
filterbyname.sh
filterbysequence.sh
filterbytaxa.sh
filterbytile.sh
filterlines.sh
filterqc.sh
filtersam.sh
filtersilva.sh
filtersubs.sh
filtervcf.sh
fixgaps.sh
fungalrelease.sh
fuse.sh
gbff2gff.sh
getreads.sh
gi2ancestors.sh
gi2taxid.sh
gitable.sh
grademerge.sh
gradesam.sh
icecreamfinder.sh
icecreamgrader.sh
icecreammaker.sh
idmatrix.sh
idtree.sh
invertkey.sh
kapastats.sh
kcompress.sh
keepbestcopy.sh
khist.sh
kmercountexact.sh
kmercountmulti.sh
kmercoverage.sh
kmerfilterset.sh
kmerlimit.sh
kmerlimit2.sh
kmerposition.sh
kmutate.sh
lilypad.sh
loadreads.sh
loglog.sh
makechimeras.sh
makecontaminatedgenomes.sh
makepolymers.sh
mapPacBio.sh
matrixtocolumns.sh
mergeOTUs.sh
mergebarcodes.sh
mergepgm.sh
mergeribo.sh
mergesam.sh
mergesketch.sh
mergesorted.sh
msa.sh
mutate.sh
muxbyname.sh
partition.sh
phylip2fasta.sh
pileup.sh
plotflowcell.sh
plotgc.sh
postfilter.sh
printtime.sh
processfrag.sh
processhi-c.sh
processspeed.sh
randomgenome.sh
randomreads.sh
readlength.sh
readqc.sh
reducesilva.sh
reformat.sh
reformatpb.sh
removebadbarcodes.sh
removecatdogmousehuman.sh
removehuman.sh
removehuman2.sh
removemicrobes.sh
removesmartbell.sh
rename.sh
renameimg.sh
repair.sh
replaceheaders.sh
representative.sh
rqcfilter.sh
rqcfilter2.sh
runhmm.sh
samtoroc.sh
seal.sh
sendsketch.sh
shred.sh
shrinkaccession.sh
shuffle.sh
shuffle2.sh
sketch.sh
sketchblacklist.sh
sketchblacklist2.sh
sortbyname.sh
splitbytaxa.sh
splitnextera.sh
splitribo.sh
splitsam.sh
splitsam4way.sh
splitsam6way.sh
stats.sh
statswrapper.sh
streamsam.sh
subsketch.sh
summarizecontam.sh
summarizecoverage.sh
summarizecrossblock.sh
summarizemerge.sh
summarizequast.sh
summarizescafstats.sh
summarizeseal.sh
summarizesketch.sh
synthmda.sh
tadpipe.sh
tadpole.sh
tadwrapper.sh
taxonomy.sh
taxserver.sh
taxsize.sh
taxtree.sh
testfilesystem.sh
testformat.sh
testformat2.sh
tetramerfreq.sh
textfile.sh
translate6frames.sh
unicode2ascii.sh
unzip.sh
vcf2gff.sh
webcheck.sh
Module
You can load the modules by:
module load biocontainers
module load bbtools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bbtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bbtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bbtools
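The job script above loads bbtools without running a tool. The following is a minimal sketch, assuming paired-end FASTQ files with the hypothetical names shown. It uses the `key=value` argument style of the BBTools scripts: `bbduk.sh` with `qtrim` and `trimq` for quality trimming, followed by `readlength.sh` from the Commands list.

```shell
# Hypothetical paired-end input names.
r1="sample_R1.fastq"
r2="sample_R2.fastq"

if [ -f "$r1" ] && [ -f "$r2" ] && command -v bbduk.sh >/dev/null 2>&1; then
    # Quality-trim both read ends to Q10 while keeping the pairs in sync.
    bbduk.sh in1="$r1" in2="$r2" out1=clean_R1.fastq out2=clean_R2.fastq qtrim=rl trimq=10
    # Summarise read lengths of the trimmed output.
    readlength.sh in=clean_R1.fastq in2=clean_R2.fastq > readlength_clean.txt
else
    echo "inputs or bbduk.sh not available; nothing was run"
fi
```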
Bcftools
Introduction
Bcftools
is a program for variant calling and manipulating files in the Variant Call Format (VCF) and its binary counterpart BCF.
Versions
1.13
1.14
1.17
Commands
bcftools
color-chrs.pl
guess-ploidy.py
plot-roh.py
plot-vcfstats
run-roh.pl
vcfutils.pl
Module
You can load the modules by:
module load biocontainers
module load bcftools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bcftools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bcftools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bcftools
bcftools query -f '%CHROM %POS %REF %ALT\n' file.bcf
bcftools polysomy -v -o outdir/ file.vcf
# Variant calling
bcftools mpileup -f reference.fa alignments.bam | bcftools call -mv -Ob -o calls.bcf
Bcl2fastq
Introduction
bcl2fastq Conversion Software both demultiplexes data and converts BCL files generated by Illumina sequencing systems to standard FASTQ file formats for downstream analysis.
Versions
2.20.0
Commands
bcl2fastq
Module
You can load the modules by:
module load biocontainers
module load bcl2fastq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bcl2fastq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bcl2fastq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bcl2fastq
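The job script above does not show a bcl2fastq call. A minimal sketch of a conversion follows, assuming an Illumina run folder at the hypothetical path `run_folder` with its sample sheet inside it. The options used (`--runfolder-dir`, `--output-dir`, `--sample-sheet`) are standard bcl2fastq 2.x options, but the paths are placeholders.

```shell
# Hypothetical locations of the Illumina run folder and the FASTQ output.
run_dir="run_folder"
out_dir="fastq_out"

if [ ! -d "$run_dir" ]; then
    echo "run folder not found: $run_dir"
elif ! command -v bcl2fastq >/dev/null 2>&1; then
    echo "bcl2fastq not found; load the module first"
else
    # Demultiplex and convert BCL files to FASTQ.
    bcl2fastq --runfolder-dir "$run_dir" \
              --output-dir "$out_dir" \
              --sample-sheet "$run_dir/SampleSheet.csv"
fi
```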
Beagle
Introduction
Beagle
is a software package for phasing genotypes and for imputing ungenotyped markers. Start it with: beagle [java options] [arguments]
Note: Bref is not installed in this container.
Versions
5.1_24Aug19.3e8
Commands
beagle
Module
You can load the modules by:
module load biocontainers
module load beagle
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Beagle on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=beagle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers beagle
beagle gt=test.vcf.gz out=test.out
BEAST 2
Introduction
BEAST 2
is a cross-platform program for Bayesian phylogenetic analysis of molecular sequences.
Versions
2.6.3
2.6.4
2.6.6
Commands
applauncher
beast
beauti
densitree
loganalyser
logcombiner
packagemanager
treeannotator
Module
You can load the modules by:
module load biocontainers
module load beast2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BEAST 2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=beast2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers beast2
beast -threads 4 -prefix input input.xml
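In the script above, the value given to `-threads` matches the number of tasks requested with `#SBATCH -n`. One way to keep the two in sync, sketched here under the assumption that Slurm exports `SLURM_NTASKS` inside the job (it does when `-n` is given), is to read the value from the environment. The function name `job_threads` is hypothetical.

```shell
# Hypothetical helper: use the task count granted by Slurm, or 1 outside a job.
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

threads=$(job_threads)
echo "running with $threads thread(s)"

# Run BEAST 2 only when the input file and the command are both present.
if [ -f input.xml ] && command -v beast >/dev/null 2>&1; then
    beast -threads "$threads" -prefix input input.xml
fi
```

With this pattern, changing the `-n` value in the job header is enough; the command line does not need to be edited.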
Bedops
Introduction
Bedops
is a software package for manipulating and analyzing genomic interval data.
Versions
2.4.39
Commands
bam2bed
bam2bed-float128
bam2bed_gnuParallel
bam2bed_gnuParallel-float128
bam2bed_gnuParallel-megarow
bam2bed_gnuParallel-typical
bam2bed-megarow
bam2bed_sge
bam2bed_sge-float128
bam2bed_sge-megarow
bam2bed_sge-typical
bam2bed_slurm
bam2bed_slurm-float128
bam2bed_slurm-megarow
bam2bed_slurm-typical
bam2bed-typical
bam2starch
bam2starch-float128
bam2starch_gnuParallel
bam2starch_gnuParallel-float128
bam2starch_gnuParallel-megarow
bam2starch_gnuParallel-typical
bam2starch-megarow
bam2starch_sge
bam2starch_sge-float128
bam2starch_sge-megarow
bam2starch_sge-typical
bam2starch_slurm
bam2starch_slurm-float128
bam2starch_slurm-megarow
bam2starch_slurm-typical
bam2starch-typical
bedextract
bedextract-float128
bedextract-megarow
bedextract-typical
bedmap
bedmap-float128
bedmap-megarow
bedmap-typical
bedops
bedops-float128
bedops-megarow
bedops-typical
closest-features
closest-features-float128
closest-features-megarow
closest-features-typical
convert2bed
convert2bed-float128
convert2bed-megarow
convert2bed-typical
gff2bed
gff2bed-float128
gff2bed-megarow
gff2bed-typical
gff2starch
gff2starch-float128
gff2starch-megarow
gff2starch-typical
gtf2bed
gtf2bed-float128
gtf2bed-megarow
gtf2bed-typical
gtf2starch
gtf2starch-float128
gtf2starch-megarow
gtf2starch-typical
gvf2bed
gvf2bed-float128
gvf2bed-megarow
gvf2bed-typical
gvf2starch
gvf2starch-float128
gvf2starch-megarow
gvf2starch-typical
psl2bed
psl2bed-float128
psl2bed-megarow
psl2bed-typical
psl2starch
psl2starch-float128
psl2starch-megarow
psl2starch-typical
rmsk2bed
rmsk2bed-float128
rmsk2bed-megarow
rmsk2bed-typical
rmsk2starch
rmsk2starch-float128
rmsk2starch-megarow
rmsk2starch-typical
sam2bed
sam2bed-float128
sam2bed-megarow
sam2bed-typical
sam2starch
sam2starch-float128
sam2starch-megarow
sam2starch-typical
sort-bed
sort-bed-float128
sort-bed-megarow
sort-bed-typical
starch
starchcat
starchcat-float128
starchcat-megarow
starchcat-typical
starchcluster_gnuParallel
starchcluster_gnuParallel-float128
starchcluster_gnuParallel-megarow
starchcluster_gnuParallel-typical
starchcluster_sge
starchcluster_sge-float128
starchcluster_sge-megarow
starchcluster_sge-typical
starchcluster_slurm
starchcluster_slurm-float128
starchcluster_slurm-megarow
starchcluster_slurm-typical
starch-diff
starch-diff-float128
starch-diff-megarow
starch-diff-typical
starch-float128
starch-megarow
starchstrip
starchstrip-float128
starchstrip-megarow
starchstrip-typical
starch-typical
switch-BEDOPS-binary-type
unstarch
unstarch-float128
unstarch-megarow
unstarch-typical
update-sort-bed-migrate-candidates
update-sort-bed-migrate-candidates-float128
update-sort-bed-migrate-candidates-megarow
update-sort-bed-migrate-candidates-typical
update-sort-bed-slurm
update-sort-bed-slurm-float128
update-sort-bed-slurm-megarow
update-sort-bed-slurm-typical
update-sort-bed-starch-slurm
update-sort-bed-starch-slurm-float128
update-sort-bed-starch-slurm-megarow
update-sort-bed-starch-slurm-typical
vcf2bed
vcf2bed-float128
vcf2bed-megarow
vcf2bed-typical
vcf2starch
vcf2starch-float128
vcf2starch-megarow
vcf2starch-typical
wig2bed
wig2bed-float128
wig2bed-megarow
wig2bed-typical
wig2starch
wig2starch-float128
wig2starch-megarow
wig2starch-typical
Module
You can load the modules by:
module load biocontainers
module load bedops
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bedops on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bedops
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bedops
bedops -m 001.merge.001.test > 001.merge.001.observed
bedops -c 001.merge.001.test > 001.complement.001.observed
bedops -i 001.intersection.001a.test 001.intersection.001b.test > 001.intersection.001.observed
Bedtools
Introduction
Bedtools
is an extensive suite of utilities for genome arithmetic and comparing genomic features in BED format.
Versions
2.30.0
2.31.0
Commands
annotateBed
bamToBed
bamToFastq
bed12ToBed6
bedpeToBam
bedToBam
bedToIgv
bedtools
closestBed
clusterBed
complementBed
coverageBed
expandCols
fastaFromBed
flankBed
genomeCoverageBed
getOverlap
groupBy
intersectBed
linksBed
mapBed
maskFastaFromBed
mergeBed
multiBamCov
multiIntersectBed
nucBed
pairToBed
pairToPair
randomBed
shiftBed
shuffleBed
slopBed
sortBed
subtractBed
tagBam
unionBedGraphs
windowBed
windowMaker
Module
You can load the modules by:
module load biocontainers
module load bedtools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bedtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bedtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bedtools
bedtools intersect -a a.bed -b b.bed
bedtools annotate -i variants.bed -files genes.bed conserve.bed known_var.bed
Bioawk
Introduction
Bioawk
is an extension to Brian Kernighan’s awk that adds support for several common biological data formats, including optionally gzip’ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names.
Versions
1.0
Commands
bioawk
Module
You can load the modules by:
module load biocontainers
module load bioawk
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bioawk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bioawk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bioawk
bioawk -c fastx '{print ">"$name;print revcomp($seq)}' seq.fa.gz
Biobambam
Introduction
Biobambam
is a collection of tools for early stage alignment file processing.
Versions
2.0.183
Commands
bam12auxmerge
bam12split
bam12strip
bamadapterclip
bamadapterfind
bamalignfrac
bamauxmerge
bamauxmerge2
bamauxsort
bamcat
bamchecksort
bamclipXT
bamclipreinsert
bamcollate2
bamdepth
bamdepthintersect
bamdifference
bamdownsamplerandom
bamexplode
bamexploderef
bamfastcat
bamfastexploderef
bamfastnumextract
bamfastsplit
bamfeaturecount
bamfillquery
bamfilteraux
bamfiltereofblocks
bamfilterflags
bamfilterheader
bamfilterheader2
bamfilterk
bamfilterlength
bamfiltermc
bamfilternames
bamfilterrefid
bamfilterrg
bamfixmateinformation
bamfixpairinfo
bamflagsplit
bamindex
bamintervalcomment
bamintervalcommenthist
bammapdist
bammarkduplicates
bammarkduplicates2
bammarkduplicatesopt
bammaskflags
bammdnm
bammerge
bamnumericalindex
bamnumericalindexstats
bamrank
bamranksort
bamrecalculatecigar
bamrecompress
bamrefextract
bamrefinterval
bamreheader
bamreplacechecksums
bamreset
bamscrapcount
bamseqchksum
bamsormadup
bamsort
bamsplit
bamsplitdiv
bamstreamingmarkduplicates
bamtofastq
bamvalidate
bamzztoname
Module
You can load the modules by:
module load biocontainers
module load biobambam
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Biobambam on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=biobambam
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers biobambam
bammarkduplicates I=Aligned.sortedByCoord.out.bam O=out.bam D=duplicate_out
bamsort I=Aligned.sortedByCoord.out.bam O=sorted.bam sortthreads=8
bamtofastq filename=Aligned.sortedByCoord.out.bam outputdir=fastq_out
Bioconvert
Introduction
Bioconvert
is a collaborative project to facilitate the interconversion of life science data from one format to another.
Versions
0.4.3
0.5.2
0.6.1
0.6.2
Commands
bioconvert
Module
You can load the modules by:
module load biocontainers
module load bioconvert
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bioconvert on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bioconvert
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bioconvert
bioconvert fastq2fasta input.fastq output.fa
Biopython
Introduction
Biopython
is a set of freely available tools for biological computation written in Python.
Versions
1.70-np112py27
1.70-np112py36
1.78
Commands
easy_install
f2py
f2py3
idle3
pip
pip3
pydoc
pydoc3
python
python3
python3-config
python3.9
python3.9-config
wheel
Module
You can load the modules by:
module load biocontainers
module load biopython
Interactive job
To run biopython interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers biopython
(base) UserID@bell-a008:~ $ python
Python 3.9.1 | packaged by conda-forge | (default, Jan 26 2021, 01:34:10)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import SeqIO
>>> with open("input.gb") as input_handle:
...     for record in SeqIO.parse(input_handle, "genbank"):
...         print(record)
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Biopython on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=biopython
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers biopython
python script.py
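The batch script calls `python script.py` without showing the script itself. As an illustration only, the snippet below writes a `script.py` that repeats the GenBank parsing loop from the interactive session and runs it when `input.gb` (the same placeholder input file) is present.

```shell
# Write script.py with the same SeqIO loop used in the interactive session.
cat > script.py <<'EOF'
from Bio import SeqIO

# Print every record found in the GenBank file.
with open("input.gb") as input_handle:
    for record in SeqIO.parse(input_handle, "genbank"):
        print(record)
EOF

# Run the script only when the input file and Python are both available.
if [ -f input.gb ] && command -v python >/dev/null 2>&1; then
    python script.py
else
    echo "input.gb or python not available; script.py was written but not run"
fi
```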
Bismark
Introduction
Bismark
is a tool to map bisulfite treated sequencing reads to a genome of interest and perform methylation calls in a single step.
Versions
0.23.0
0.24.0
Commands
bismark
bam2nuc
bismark2bedGraph
bismark2report
bismark2summary
bismark_genome_preparation
bismark_methylation_extractor
copy_bismark_files_for_release.pl
coverage2cytosine
deduplicate_bismark
filter_non_conversion
methylation_consistency
Dependencies
Bowtie 2 v2.4.2
, Samtools v1.12
, HISAT2 v2.2.1
are included in the container image, so users do not need to provide dependency paths in the bismark parameters.
Module
You can load the modules by:
module load biocontainers
module load bismark
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bismark on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=bismark
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bismark
bismark_genome_preparation --bowtie2 data/ref_genome
bismark --multicore 12 --genome data/ref_genome seq.fastq
Blasr
Introduction
Blasr
is a read mapping program that maps reads to positions in a genome by clustering short exact matches between the read and the genome, and scoring clusters using alignment.
Versions
5.3.5
Commands
blasr
Module
You can load the modules by:
module load biocontainers
module load blasr
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Blasr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=blasr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers blasr
blasr reads.bas.h5 ecoli_K12.fasta -sam
BLAST
Introduction
BLAST
(Basic Local Alignment Search Tool) finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches.
Versions
2.11.0
2.13.0
Commands
blastn
blastp
blastx
blast_formatter
amino-acid-composition
between-two-genes
blastdbcheck
blastdbcmd
blastdb_aliastool
cleanup-blastdb-volumes.py
deltablast
dustmasker
eaddress
eblast
get_species_taxids.sh
legacy_blast.pl
makeblastdb
makembindex
makeprofiledb
psiblast
rpsblast
rpstblastn
run-ncbi-converter
segmasker
tblastn
tblastx
update_blastdb.pl
windowmasker
Module
You can load the modules by:
module load biocontainers
module load blast
BLAST Databases
Local copies of the BLAST databases can be found in the directory /depot/itap/datasets/blast/latest/. The environment variable BLASTDB
is also set to /depot/itap/datasets/blast/latest/
. If users want to use cdd_delta
, env_nr
, env_nt
, nr
, nt
, pataa
, patnt
, pdbnt
, refseq_protein
, refseq_rna
, swissprot
, or tsa_nt
databases, they do not need to provide the database path. Instead, just give the database name, for example -db nr
.
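The database lookup described above can be verified from inside a job. The sketch below assumes the blast module has been loaded so that `BLASTDB` is set. The fallback path is the directory named in the text and is only used when the variable is unset; the function name `blast_db_dir` is hypothetical.

```shell
# Hypothetical helper: print the directory BLAST will search for databases.
blast_db_dir() {
    echo "${BLASTDB:-/depot/itap/datasets/blast/latest/}"
}

echo "BLAST databases are searched in: $(blast_db_dir)"

# Print a summary of the nr database when the BLAST tools are available.
if command -v blastdbcmd >/dev/null 2>&1; then
    blastdbcmd -db nr -info || echo "nr database not found in $(blast_db_dir)"
fi
```

Because the database is found through `BLASTDB`, `-db nr` is passed as a bare name rather than a full path.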
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BLAST on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=blast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers blast
blastp -query protein.fasta -db nr -out test_out -num_threads 4
BlobTools
Introduction
BlobTools
is a modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets.
Detailed usage can be found here: https://github.com/DRL/blobtools
Versions
1.1.1
Commands
blobtools
Module
You can load the modules by:
module load biocontainers
module load blobtools/1.1.1
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run blobtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=blobtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers blobtools/1.1.1
blobtools create -i example/assembly.fna -b example/mapping_1.sorted.bam -t example/blast.out -o test && \
blobtools view -i test.blobDB.json && \
blobtools plot -i test.blobDB.json
Bmge
Introduction
Bmge
is a program that selects regions in a multiple sequence alignment that are suited for phylogenetic inference.
Versions
1.12
Commands
bmge
Module
You can load the modules by:
module load biocontainers
module load bmge
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bmge on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bmge
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bmge
bmge -i seq.fa -t AA -o out.phy
Bowtie
Introduction
Bowtie
is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
Versions
1.3.1
Commands
bowtie
bowtie-build
bowtie-inspect
Module
You can load the modules by:
module load biocontainers
module load bowtie
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bowtie on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=bowtie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bowtie
bowtie-build ref.fasta ref
bowtie -p 4 -x ref -1 input_1.fq -2 input_2.fq -S test.sam
Bowtie 2
Introduction
Bowtie 2
is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
Versions
2.4.2
2.5.1
Commands
bowtie2
bowtie2-build
bowtie2-inspect
Module
You can load the modules by:
module load biocontainers
module load bowtie2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bowtie 2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=bowtie2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bowtie2
bowtie2-build ref.fasta ref
bowtie2 -p 4 -x ref -1 input_1.fq -2 input_2.fq -S test.sam
Bracken
Introduction
Bracken
(Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.
Detailed usage can be found here: https://github.com/jenniferlu717/Bracken
Note
Inside the bracken
container image, kraken2
was also installed. As a result, when you load bracken/2.6.1-py37
, kraken2 version 2.1.1
will be loaded automatically. Please do not load the kraken2
module together with the bracken
module, to avoid conflicts.
Versions
2.6.1
2.7
Commands
bracken
bracken-build
combine_bracken_outputs.py
kraken2
kraken2-build
kraken2-inspect
est_abundance.py
generate_kmer_distribution.py
Module
You can load the modules by:
module load biocontainers
module load bracken/2.6.1-py37
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bracken on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=bracken
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bracken/2.6.1-py37
DATABASE=minikraken2_v2_8GB_201904_UPDATE
kraken2 --threads 24 --report kraken2.report --db $DATABASE --paired --classified-out cseqs#.fq SRR5043021_1.fastq SRR5043021_2.fastq
bracken -d $DATABASE -i kraken2.report -o bracken_output -w bracken.report
BRAKER
Introduction
BRAKER
is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET and AUGUSTUS in novel eukaryotic genomes.
Versions
2.1.6
Commands
braker.pl
Helper command
Note
Since BRAKER
is a pipeline that trains AUGUSTUS
, i.e. writes species-specific parameter files, BRAKER needs write access to the configuration directory of AUGUSTUS that contains such files. This installation comes with a stub of AUGUSTUS configuration files, but you must
copy them out from the container into a location where you have write permissions.
A helper command copy_augustus_config
is provided to simplify the task. Follow the procedure below to put the config files in your scratch space:
$ mkdir -p $RCAC_SCRATCH/augustus
$ copy_augustus_config $RCAC_SCRATCH/augustus
$ export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config
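Because the copy is only needed once, the three commands above can be wrapped so that repeated jobs skip it. The following is a sketch under that assumption; ensure_augustus_config is a hypothetical helper name, while copy_augustus_config and the config subdirectory are taken from the procedure above.

```shell
# ensure_augustus_config DIR: copy the AUGUSTUS config into DIR only if
# DIR/config does not exist yet, then export AUGUSTUS_CONFIG_PATH.
ensure_augustus_config() {
    if [ ! -d "$1/config" ]; then
        mkdir -p "$1"
        copy_augustus_config "$1" || return 1
    fi
    export AUGUSTUS_CONFIG_PATH="$1/config"
}

# copy_augustus_config only exists after the module is loaded, so guard the call.
if command -v copy_augustus_config >/dev/null 2>&1; then
    ensure_augustus_config "$RCAC_SCRATCH/augustus"
fi
```

Skipping the copy also keeps any species parameters written by earlier training runs in place.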
Module
You can load the modules by:
module load biocontainers
module load braker2/2.1.6
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BRAKER on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=BRAKER2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers braker2/2.1.6
# The augustus config step is only required the first time you use BRAKER2
mkdir -p $RCAC_SCRATCH/augustus
copy_augustus_config $RCAC_SCRATCH/augustus
export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config
braker.pl --genome genome.fa --bam RNAseq.bam --softmasking --cores 24
Brass
Introduction
Brass
is used to analyze one or more related BAM files of paired-end sequencing to determine potential rearrangement breakpoints.
Versions
6.3.4
Commands
brass-assemble
brass_bedpe2vcf.pl
brass_foldback_reads.pl
brass-group
brassI_filter.pl
brassI_np_in.pl
brassI_pre_filter.pl
brassI_prep_bam.pl
brass.pl
Module
You can load the modules by:
module load biocontainers
module load brass
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Brass on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=brass
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers brass
brass.pl -c 4 -o myout -t tumour.bam -n normal.bam
Breseq
Introduction
Breseq
is a computational pipeline for the analysis of short-read re-sequencing data.
Versions
0.36.1
Commands
breseq
Module
You can load the modules by:
module load biocontainers
module load breseq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Breseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=breseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers breseq
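The job script above loads the module but does not show a breseq call. A typical invocation might look like the sketch below; the input and output names (reference.gbk, reads.fastq, breseq_out) are placeholders invented for illustration, and the flags used are -r for the reference, -j for the number of threads, and -o for the output directory.

```shell
# Placeholder inputs: an annotated reference and one FASTQ file of reads.
REFERENCE=reference.gbk
READS=reads.fastq
OUTDIR=breseq_out

# Match the thread count to the Slurm request (-n); 1 outside a job.
THREADS="${SLURM_NTASKS:-1}"

if command -v breseq >/dev/null 2>&1 && [ -f "$REFERENCE" ] && [ -f "$READS" ]; then
    breseq -j "$THREADS" -r "$REFERENCE" -o "$OUTDIR" "$READS"
else
    # Outside the job environment, just show what would be run.
    echo "would run: breseq -j $THREADS -r $REFERENCE -o $OUTDIR $READS"
fi
```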
BUSCO
Introduction
BUSCO
(Benchmarking sets of Universal Single-Copy Orthologs) provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs.
Detailed information can be found here: https://gitlab.com/ezlab/busco/
Versions
5.2.2
5.3.0
5.4.1
5.4.3
5.4.4
5.4.5
Commands
busco
generate_plot.py
Helper command
Note
Augustus is a gene prediction program for eukaryotes which is required by BUSCO. Augustus requires a writable configuration directory. This installation comes with a stub of AUGUSTUS configuration files, but you must
copy them out from the container into a location where you have write permissions.
A helper command copy_augustus_config
is provided to simplify the task. Follow the procedure below to put the config files in your scratch space:
$ mkdir -p $RCAC_SCRATCH/augustus
$ copy_augustus_config $RCAC_SCRATCH/augustus
$ export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config
Module
You can load the modules by:
module load biocontainers
module load busco
Example job for prokaryotic genomes
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BUSCO on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=BUSCO
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers busco
## Print the full lineage datasets, and find the dataset fitting your organism.
busco --list-datasets
## run the evaluation
busco -f -c 12 -l actinobacteria_class_odb10 -i bacteria_genome.fasta -o busco_out -m genome
## generate a simple summary plot
generate_plot.py -wd busco_out
Example job for eukaryotic genomes
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BUSCO on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=BUSCO
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers busco
## The augustus config step is only required the first time you use BUSCO
mkdir -p $RCAC_SCRATCH/augustus
copy_augustus_config $RCAC_SCRATCH/augustus
## This is required for eukaryotic genomes
export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config
## Print the full lineage datasets, and find the dataset fitting your organism.
busco --list-datasets
## run the evaluation
busco -f -c 12 -l fungi_odb10 -i fungi_protein.fasta -o busco_out_protein -m protein
busco -f -c 12 --augustus -l fungi_odb10 -i fungi_genome.fasta -o busco_out_genome -m genome
## generate a simple summary plot
generate_plot.py -wd busco_out_protein
generate_plot.py -wd busco_out_genome
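After a run finishes, the completeness score can be pulled out of the text summary for quick comparison between assemblies. The sketch below assumes the usual BUSCO one-line notation (for example C:98.5%[S:98.0%,D:0.5%],F:0.5%,M:1.0%,n:758) and a summary file matching short_summary*.txt in the output directory; busco_complete_pct is a hypothetical helper name.

```shell
# busco_complete_pct: read a BUSCO short summary on stdin and print the
# percentage that follows "C:" (complete BUSCOs), without the % sign.
busco_complete_pct() {
    sed -n 's/.*C:\([0-9.]*\)%.*/\1/p' | head -n 1
}

# Report every summary found in the genome-mode output directory, if any.
for f in busco_out_genome/short_summary*.txt; do
    if [ -f "$f" ]; then
        echo "$f: $(busco_complete_pct < "$f")% complete"
    fi
done
```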
Bustools
Introduction
Bustools
is a program for manipulating BUS files for single cell RNA-Seq datasets.
Versions
0.41.0
Commands
bustools
Module
You can load the modules by:
module load biocontainers
module load bustools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bustools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bustools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bustools
bustools capture -s -o cDNA_capture.bus -c cDNA_transcripts.to_capture.txt -e matrix.ec -t transcripts.txt output.correct.sort.bus
bustools count -o u -g cDNA_introns_t2g.txt -e matrix.ec -t transcripts.txt --genecounts cDNA_capture.bus
BWA
Introduction
BWA
(Burrows-Wheeler Aligner) is a fast, accurate, memory-efficient aligner for short and long sequencing reads.
Versions
0.7.17
Commands
bwa
qualfa2fq.pl
xa2multi.pl
Module
You can load the modules by:
module load biocontainers
module load bwa
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BWA on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bwa
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bwa
bwa index ref.fasta
bwa mem ref.fasta input.fq > test.sam
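Indexing only has to happen once per reference, so a resubmitted job can skip it. The sketch below checks for the five files that bwa index writes next to the reference (.amb, .ann, .bwt, .pac, .sa) and passes the Slurm task count to bwa mem with -t; bwa_index_ready is a hypothetical helper name, and the file names are the placeholders from the job above.

```shell
# bwa_index_ready REF: succeed only if all five index files for REF exist.
bwa_index_ready() {
    for ext in amb ann bwt pac sa; do
        [ -f "$1.$ext" ] || return 1
    done
    return 0
}

REF=ref.fasta
# Run only where bwa and the reference are actually available.
if command -v bwa >/dev/null 2>&1 && [ -f "$REF" ]; then
    bwa_index_ready "$REF" || bwa index "$REF"
    bwa mem -t "${SLURM_NTASKS:-1}" "$REF" input.fq > test.sam
fi
```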
Bwameth
Introduction
Bwameth is a tool for fast and accurate alignment of BS-Seq reads.
Versions
0.2.5
Commands
bwameth.py
Module
You can load the modules by:
module load biocontainers
module load bwameth
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bwameth on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bwameth
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bwameth
Cactus
Introduction
Cactus
is a reference-free whole-genome multiple alignment program.
Versions
2.0.5
2.2.1
2.2.3-gpu
2.2.3
2.4.0-gpu
2.4.0
2.6.5
Commands
cactus
cactus-align
cactus-align-batch
cactus-blast
cactus-graphmap
cactus-graphmap-join
cactus-graphmap-split
cactus-minigraph
cactus-prepare
cactus-prepare-toil
cactus-preprocess
cactus-refmap
cactus2hal-stitch.sh
cactus2hal.py
cactusAPITests
cactus_analyseAssembly
cactus_barTests
cactus_batch_mergeChunks
cactus_chain
cactus_consolidated
cactus_covered_intervals
cactus_fasta_fragments.py
cactus_fasta_softmask_intervals.py
cactus_filterSmallFastaSequences.py
cactus_halGeneratorTests
cactus_local_alignment.py
cactus_makeAlphaNumericHeaders.py
cactus_softmask2hardmask
Module
You can load the modules by:
module load biocontainers
module load cactus
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cactus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cactus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cactus
wget https://raw.githubusercontent.com/ComparativeGenomicsToolkit/cactus/master/examples/evolverMammals.txt
cactus jobStore evolverMammals.txt evolverMammals.hal
Cafe
Introduction
Cafe
is a computational tool for the study of gene family evolution.
Versions
4.2.1
5.0.0
Commands
cafe
Module
You can load the modules by:
module load biocontainers
module load cafe
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cafe on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cafe
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cafe
#To get a list of commands just call CAFE with the -h or --help arguments
cafe5 -h
#To estimate lambda with no among family rate variation issue the command
cafe5 -i mammal_gene_families.txt -t mammal_tree.txt
Canu
Introduction
Canu
is a single molecule sequence assembler for genomes large and small.
Detailed usage can be found here: https://github.com/marbl/canu
Versions
2.1.1
2.2
Commands
canu
Module
You can load the modules by:
module load biocontainers
module load canu/2.2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run canu on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=canu
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers canu/2.2
canu -p Cm -d clavibacter_pacbio genomeSize=3.4m -pacbio *.fastq
Ccs
Introduction
Pbccs is a tool to generate Highly Accurate Single-Molecule Consensus Reads (HiFi Reads).
Versions
6.4.0
Commands
ccs
Module
You can load the modules by:
module load biocontainers
module load ccs
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ccs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ccs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ccs
ccs --all subreads.bam ccs.bam
Cdbtools
Introduction
Cdbtools
is a collection of tools used for creating indices for quick retrieval of any particular sequences from large multi-FASTA files.
Versions
0.99
Commands
cdbfasta
cdbyank
Module
You can load the modules by:
module load biocontainers
module load cdbtools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cdbtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cdbtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cdbtools
cdbfasta genome.fa
cdbyank -a 'seq_1' genome.fa.cidx
Cd-hit
Introduction
Cd-hit
is a very widely used program for clustering and comparing protein or nucleotide sequences.
Versions
4.8.1
Commands
FET.pl
cd-hit
cd-hit-2d
cd-hit-2d-para.pl
cd-hit-454
cd-hit-clstr_2_blm8.pl
cd-hit-div
cd-hit-div.pl
cd-hit-est
cd-hit-est-2d
cd-hit-para.pl
clstr2tree.pl
clstr2txt.pl
clstr2xml.pl
clstr_cut.pl
clstr_list.pl
clstr_list_sort.pl
clstr_merge.pl
clstr_merge_noorder.pl
clstr_quality_eval.pl
clstr_quality_eval_by_link.pl
clstr_reduce.pl
clstr_renumber.pl
clstr_rep.pl
clstr_reps_faa_rev.pl
clstr_rev.pl
clstr_select.pl
clstr_select_rep.pl
clstr_size_histogram.pl
clstr_size_stat.pl
clstr_sort_by.pl
clstr_sort_prot_by.pl
clstr_sql_tbl.pl
clstr_sql_tbl_sort.pl
make_multi_seq.pl
plot_2d.pl
plot_len1.pl
Module
You can load the modules by:
module load biocontainers
module load cd-hit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cd-hit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=cd-hit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cd-hit
cd-hit -i Cm_pep.fasta -o Cmdb90 -c 0.9 -n 5 -M 16000 -T 8
cd-hit-est -i Cm_dna.fasta -o Cmdb90_nt -c 0.9 -n 5 -M 16000 -T 8
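The word size (-n) and the identity threshold (-c) are linked. For protein clustering with cd-hit, the CD-HIT user guide recommends, as far as I recall it, -n 5 for thresholds of 0.7 to 1.0, -n 4 for 0.6 to 0.7, -n 3 for 0.5 to 0.6, and -n 2 for 0.4 to 0.5; please verify against the guide for your version. The sketch below encodes that table in a hypothetical helper, cdhit_word_size. It does not apply to cd-hit-est, which uses a different table.

```shell
# cdhit_word_size THRESHOLD: print a -n value for protein cd-hit at identity
# threshold THRESHOLD (the -c value); fail for thresholds below 0.4.
cdhit_word_size() {
    awk -v c="$1" 'BEGIN {
        if (c >= 0.7)      print 5
        else if (c >= 0.6) print 4
        else if (c >= 0.5) print 3
        else if (c >= 0.4) print 2
        else exit 1
    }'
}

# Example: the job above clusters at -c 0.9, for which this prints 5.
cdhit_word_size 0.9
```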
Cegma
Introduction
CEGMA (Core Eukaryotic Genes Mapping Approach) is a pipeline for building a highly reliable set of gene annotations in virtually any eukaryotic genome.
Versions
2.5
Commands
cegma
Module
You can load the modules by:
module load biocontainers
module load cegma
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cegma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cegma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cegma
cegma --genome genome.fasta -o output
Cellbender
Introduction
Cellbender
is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
Versions
0.2.0
0.2.2
Commands
cellbender
Module
You can load the modules by:
module load biocontainers
module load cellbender
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cellbender on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=cellbender
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellbender
cellbender remove-background \
--input cellranger/test_count/run_count_1kpbmcs/outs/raw_feature_bc_matrix.h5 \
--output output_cpu.h5 \
--expected-cells 1000 \
--total-droplets-included 20000 \
--fpr 0.01 \
--epochs 150
Cellphonedb
Introduction
CellPhoneDB is a publicly available repository of curated receptors, ligands and their interactions.
Versions
2.1.7
Commands
cellphonedb
Module
You can load the modules by:
module load biocontainers
module load cellphonedb
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellphonedb on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cellphonedb
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellphonedb
Cellranger
Introduction
Cellranger
is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.
Detailed usage can be found here: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger.
Versions
6.0.1
6.1.1
6.1.2
7.0.0
7.0.1
7.1.0
Commands
cellranger mkfastq
cellranger count
cellranger aggr
cellranger reanalyze
cellranger multi
Module
You can load the modules by:
module load biocontainers
module load cellranger
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellranger on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 48
#SBATCH --job-name=cellranger
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellranger
cellranger count --id=run_count_1kpbmcs --fastqs=pbmc_1k_v3_fastqs --sample=pbmc_1k_v3 --transcriptome=refdata-gex-GRCh38-2020-A
Cellranger-arc
Introduction
Cell Ranger ARC is a set of analysis pipelines that process Chromium Single Cell Multiome ATAC + Gene Expression sequencing data to generate a variety of analyses pertaining to gene expression (GEX), chromatin accessibility, and their linkage. Furthermore, since the ATAC and GEX measurements are on the very same cell, we are able to perform analyses that link chromatin accessibility and GEX.
Versions
2.0.2
Commands
cellranger-arc
Module
You can load the modules by:
module load biocontainers
module load cellranger-arc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellranger-arc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cellranger-arc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellranger-arc
Cellranger-atac
Introduction
Cellranger-atac
is a set of analysis pipelines that process Chromium Single Cell ATAC data.
Versions
2.0.0
2.1.0
Commands
cellranger-atac
Module
You can load the modules by:
module load biocontainers
module load cellranger-atac
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cellranger-atac on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --mem=64G
#SBATCH --job-name=cellranger-atac
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellranger-atac
cellranger-atac count --id=sample345 \
--reference=refdata-cellranger-arc-GRCh38-2020-A-2.0.0 \
--fastqs=runs/HAWT7ADXX/outs/fastq_path \
--sample=mysample \
--localcores=8 \
--localmem=64
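In the job above, --localcores and --localmem repeat the values given to -n and --mem. One way to avoid keeping them in sync by hand is to derive them from the environment Slurm provides. This sketch assumes SLURM_NTASKS reflects -n and that SLURM_MEM_PER_NODE is set (in MB) because --mem was requested; mem_gb is a hypothetical helper name.

```shell
# Cores: the task count requested with -n; 1 when run outside Slurm.
LOCALCORES="${SLURM_NTASKS:-1}"

# mem_gb MB: convert a memory size in MB to whole GB (rounded down).
mem_gb() {
    echo $(( $1 / 1024 ))
}

# Fall back to 65536 MB (64 GB, as in the example) when the variable is unset.
LOCALMEM=$(mem_gb "${SLURM_MEM_PER_NODE:-65536}")

# These two options can then replace the hard-coded ones in the command above.
echo "--localcores=$LOCALCORES --localmem=$LOCALMEM"
```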
Cellranger-dna
Introduction
Cell Ranger DNA is a set of analysis pipelines that process Chromium single cell DNA sequencing output to align reads, identify copy number variation (CNV), and compare heterogeneity among cells.
Versions
1.1.0
Commands
cellranger-dna
Module
You can load the modules by:
module load biocontainers
module load cellranger-dna
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellranger-dna on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cellranger-dna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellranger-dna
CellRank

Introduction
CellRank
is a toolkit to uncover cellular dynamics based on Markov state modeling of single-cell data.
Detailed information about CellRank can be found here: https://cellrank.readthedocs.io/en/stable/.
Versions
1.5.1
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load cellrank/1.5.1
Note
The CellRank container also contains scVelo and scanpy. When you want to use CellRank, do not load scVelo or scanpy.
Interactive job
To run CellRank interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cellrank/1.5.1
(base) UserID@bell-a008:~ $ python
Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:41:03)
[GCC 9.4.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scanpy as sc
>>> import scvelo as scv
>>> import cellrank as cr
>>> import numpy as np
>>> scv.settings.verbosity = 3
>>> scv.settings.set_figure_params("scvelo")
>>> cr.settings.verbosity = 2
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit an sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=cellrank
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellrank/1.5.1
python script.py
CellRank-krylov

Introduction
CellRank
is a toolkit to uncover cellular dynamics based on Markov state modeling of single-cell data. CellRank-krylov
is CellRank
installed with extra libraries, enabling it to have better performance for large datasets (>15k cells).
Detailed information about CellRank can be found here: https://cellrank.readthedocs.io/en/stable/.
Versions
1.5.1
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load cellrank-krylov/1.5.1
Note
The CellRank container also contains scVelo and scanpy. When you want to use CellRank, do not load scVelo or scanpy.
Interactive job
To run CellRank-krylov interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cellrank-krylov/1.5.1
(base) UserID@bell-a008:~ $ python
Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:41:03)
[GCC 9.4.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scanpy as sc
>>> import scvelo as scv
>>> import cellrank as cr
>>> import numpy as np
>>> scv.settings.verbosity = 3
>>> scv.settings.set_figure_params("scvelo")
>>> cr.settings.verbosity = 2
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit an sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=cellrank-krylov
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellrank-krylov/1.5.1
python script.py
cellSNP
Introduction
cellSNP
aims to pileup the expressed alleles in single-cell or bulk RNA-seq data, which can be directly used for donor deconvolution in multiplexed single-cell RNA-seq data, particularly with vireo, which assigns cells to donors and detects doublets, even without genotyping reference.
Versions
1.2.2
Commands
cellsnp-lite
Module
You can load the modules by:
module load biocontainers
module load cellsnp-lite
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellSNP on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=cellsnp-lite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellsnp-lite
cellsnp-lite -s sample.bam -b barcode.tsv -O cellsnp_out -p 8 --minMAF 0.1 --minCOUNT 100
Celltypist
Introduction
Celltypist
is a tool for semi-automatic cell type annotation.
Versions
0.2.0
1.1.0
Commands
celltypist
python
python3
Module
You can load the modules by:
module load biocontainers
module load celltypist
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Celltypist on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=celltypist
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers celltypist
celltypist --indata demo_2000_cells.h5ad --model Immune_All_Low.pkl --outdir output
Centrifuge
Introduction
Centrifuge
is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers.
Versions
1.0.4_beta
Commands
centrifuge
centrifuge-BuildSharedSequence.pl
centrifuge-RemoveEmptySequence.pl
centrifuge-RemoveN.pl
centrifuge-build
centrifuge-build-bin
centrifuge-class
centrifuge-compress.pl
centrifuge-download
centrifuge-inspect
centrifuge-inspect-bin
centrifuge-kreport
centrifuge-sort-nt.pl
centrifuge_evaluate.py
centrifuge_simulate_reads.py
Module
You can load the modules by:
module load biocontainers
module load centrifuge
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Centrifuge on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=centrifuge
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers centrifuge
centrifuge-download -o taxonomy taxonomy
centrifuge-download -o library -m -d "archaea,bacteria,viral" refseq > seqid2taxid.map
cat library/*/*.fna > input-sequences.fna
centrifuge-build -p 8 --conversion-table seqid2taxid.map \
--taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp \
input-sequences.fna abv
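The `cat library/*/*.fna` step above silently produces an empty file if the download step did not finish. A minimal sketch of a guard, assuming the same `library/` layout as in the example; `concat_library` is a hypothetical helper name:

```shell
# Hypothetical helper: concatenate all .fna files one level below a library
# directory, and fail loudly when none are found.
concat_library() {
    local libdir="$1" out="$2"
    local files
    shopt -s nullglob                 # unmatched globs expand to nothing
    files=( "$libdir"/*/*.fna )
    shopt -u nullglob
    if [ "${#files[@]}" -eq 0 ]; then
        echo "No .fna files under ${libdir}; did centrifuge-download finish?" >&2
        return 1
    fi
    cat "${files[@]}" > "$out"
    echo "Concatenated ${#files[@]} files into ${out}"
}

# Usage in the job script, in place of the plain cat line:
#   concat_library library input-sequences.fna || exit 1
```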
Cfsan-snp-pipeline
Introduction
The CFSAN SNP Pipeline is a Python-based system for the production of SNP matrices from sequence data used in the phylogenetic analysis of pathogenic organisms sequenced from samples of interest to food safety.
Versions
2.2.1
Commands
cfsan_snp_pipeline
Module
You can load the modules by:
module load biocontainers
module load cfsan-snp-pipeline
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cfsan-snp-pipeline on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cfsan-snp-pipeline
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cfsan-snp-pipeline
Checkm-genome
Introduction
CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes.
Versions
1.2.0
1.2.2
Commands
checkm
Module
You can load the modules by:
module load biocontainers
module load checkm-genome
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run checkm-genome on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=checkm-genome
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers checkm-genome
checkm lineage_wf -t 8 -x fa bins checkm
Chewbbaca
Introduction
chewBBACA is a comprehensive pipeline including a set of functions for the creation and validation of whole genome and core genome MultiLocus Sequence Typing (wg/cgMLST) schemas, providing an allele calling algorithm based on Blast Score Ratio that can be run in multiprocessor settings and a set of functions to visualize and validate allele variation in the loci. chewBBACA performs the schema creation and allele calls on complete or draft genomes resulting from de novo assemblers.
Versions
2.8.5
Commands
chewBBACA.py
Module
You can load the modules by:
module load biocontainers
module load chewbbaca
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run chewbbaca on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=chewbbaca
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers chewbbaca
chewBBACA.py CreateSchema -i complete_genomes/ -o tutorial_schema --ptf Streptococcus_agalactiae.trn --cpu 4
chewBBACA.py AlleleCall -i complete_genomes/ -g tutorial_schema/schema_seed -o results32_wgMLST --cpu 4
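The example above writes the CPU count three times (`#SBATCH -n 4` and two `--cpu 4` flags). A minimal sketch for keeping them in sync, assuming the job is submitted with `-n` so that Slurm sets `SLURM_NTASKS`; the variable name `CPUS` is only illustrative:

```shell
# Take the task count granted by Slurm; fall back to 1 outside a job.
CPUS="${SLURM_NTASKS:-1}"
echo "Running chewBBACA steps with ${CPUS} CPU(s)"

# The two commands from the example would then read:
#   chewBBACA.py CreateSchema -i complete_genomes/ -o tutorial_schema --ptf Streptococcus_agalactiae.trn --cpu "$CPUS"
#   chewBBACA.py AlleleCall -i complete_genomes/ -g tutorial_schema/schema_seed -o results32_wgMLST --cpu "$CPUS"
```

With this pattern only the `#SBATCH -n` line needs to change when you request more or fewer cores. The same idea applies to the thread flags of other tools in this guide.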
Chopper
Introduction
Chopper is a Rust implementation of NanoFilt+NanoLyse, both originally written in Python. This tool, intended for long read sequencing such as PacBio or ONT, filters and trims a fastq file. Filtering is done on average read quality and on minimal or maximal read length. A headcrop (start of read) and tailcrop (end of read) can also be applied, and the reads passing the filter are printed.
Versions
0.2.0
Commands
chopper
Module
You can load the modules by:
module load biocontainers
module load chopper
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run chopper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=chopper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers chopper
Chromap
Introduction
Chromap is an ultrafast method for aligning and preprocessing high throughput chromatin profiles.
Versions
0.2.2
Commands
chromap
Module
You can load the modules by:
module load biocontainers
module load chromap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run chromap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=chromap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers chromap
CICERO
Introduction
CICERO
(Clipped-reads Extended for RNA Optimization) is an assembly-based algorithm to detect diverse classes of driver gene fusions from RNA-seq.
Versions
1.8.1
Commands
Cicero.sh
Module
You can load the modules by:
module load biocontainers
module load cicero
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run CICERO on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cicero
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cicero
Circexplorer2
Introduction
CIRCexplorer2 is a comprehensive and integrative circular RNA analysis toolset. It is the successor of CIRCexplorer with plenty of new features to facilitate circular RNA identification and characterization.
Versions
2.3.8
Commands
CIRCexplorer2
fast_circ.py
fetch_ucsc.py
Module
You can load the modules by:
module load biocontainers
module load circexplorer2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run circexplorer2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circexplorer2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers circexplorer2
Circlator
Introduction
Circlator
is a tool to circularize genome assemblies.
Versions
1.5.5
Commands
circlator
python3
Module
You can load the modules by:
module load biocontainers
module load circlator
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Circlator on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circlator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers circlator
circlator minimus2 minimus2_test_run_minimus2.in.fa minimus2_test
Circompara2
Introduction
CirComPara2 is a computational pipeline to detect, quantify, and correlate expression of linear and circular RNAs from RNA-seq data that combines multiple circRNA-detection methods.
Versions
0.1.2.1
Commands
python
Rscript
circompara2
CIRCexplorer2
CIRCexplorer_compare.R
CIRI.pl
DCC
DCC_patch_CombineCounts.py
QRE_finder.py
STAR
bedtools
bowtie
bowtie-build
bowtie-inspect
bowtie2
bowtie2-build
bowtie2-inspect
bwa
ccp_circrna_expression.R
cfinder_compare.R
chimoutjunc_to_bed.py
ciri_compare.R
collect_read_stats.R
convert_circrna_collect_tables.py
cuffcompare
cuffdiff
cufflinks
cuffmerge
cuffnorm
cuffquant
dcc_compare.R
dcc_fix_strand.R
fasta_len.py
fastq_rev_comp.py
fastqc
filterCirc.awk
filterSpliceSiteCircles.pl
filter_and_cast_circexp.R
filter_fastq_reads.py
filter_findcirc_res.R
filter_segemehl.R
find_circ.py
findcirc_compare.R
gene_annotation.R
get_ce2_bwa_bks_reads.R
get_ce2_bwa_circ_reads.py
get_ce2_segemehl_bks_reads.R
get_ce2_star_bks_reads.R
get_ce2_th_bks_reads.R
get_circompara_counts.R
get_circrnaFinder_bks_reads.R
get_ciri_bks_reads.R
get_dcc_bks_reads.R
get_findcirc_bks_reads.R
get_gene_expression_files.R
get_stringtie_rawcounts.R
gffread
gtfToGenePred
gtf_collapse_features.py
gtf_to_sam
haarz.x
hisat2
hisat2-build
htseq-count
install_R_libs.R
nrForwardSplicedReads.pl
parallel
pip
postProcessStarAlignment.pl
samtools
samtools_v0
scons
segemehl.x
split_start_end_gtf.py
starCirclesToBed.pl
stringtie
testrealign_compare.R
tophat2
trim_read_header.py
trimmomatic-0.39.jar
unmapped2anchors.py
cf_filterChimout.awk
circompara
get_unmapped_reads_from_bam.sh
install_circompara
make_circrna_html
make_indexes
Module
You can load the modules by:
module load biocontainers
module load circompara2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run circompara2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circompara2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers circompara2
Circos
Introduction
Circos
is a software package for visualizing data and information.
Versions
0.69.8
Commands
circos
Module
You can load the modules by:
module load biocontainers
module load circos
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Circos on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circos
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers circos
circos -conf circos.conf
Ciri2
Introduction
CIRI2: Circular RNA identification based on multiple seed matching
Versions
2.0.6
Commands
CIRI2.pl
Module
You can load the modules by:
module load biocontainers
module load ciri2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ciri2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ciri2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ciri2
CIRIquant
Introduction
CIRIquant
is a comprehensive analysis pipeline for circRNA detection and quantification in RNA-Seq data.
Versions
1.1.2
Commands
CIRIquant
Module
You can load the modules by:
module load biocontainers
module load ciriquant
config.yml
All required dependencies have been installed within the CIRIquant container image, but users still need to provide the paths of these executables in config.yml. Please use the config.yml below as an example:
name: hg38
tools:
  bwa: /bin/bwa
  hisat2: /bin/hisat2
  stringtie: /bin/stringtie
  samtools: /usr/local/bin/samtools
reference:
  fasta: reference/Homo_sapiens.GRCh38.dna.primary_assembly.fa
  gtf: reference/Homo_sapiens.GRCh38.105.gtf
  bwa_index: reference/Homo_sapiens.GRCh38.dna.primary_assembly.fa
  hisat_index: reference/hg38_hisat2
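The config.yml above can also be generated from the job script, which keeps the nested YAML indentation intact. This is a minimal sketch: `REF_DIR` is a hypothetical variable, and it assumes the reference paths are relative to the directory the job runs in. The tool paths refer to locations inside the container image, so only the reference files are checked on the host.

```shell
# REF_DIR is a hypothetical variable pointing at your reference directory.
REF_DIR="reference"

# Write config.yml; the two-space indentation under tools/reference matters.
cat > config.yml <<EOF
name: hg38
tools:
  bwa: /bin/bwa
  hisat2: /bin/hisat2
  stringtie: /bin/stringtie
  samtools: /usr/local/bin/samtools
reference:
  fasta: ${REF_DIR}/Homo_sapiens.GRCh38.dna.primary_assembly.fa
  gtf: ${REF_DIR}/Homo_sapiens.GRCh38.105.gtf
  bwa_index: ${REF_DIR}/Homo_sapiens.GRCh38.dna.primary_assembly.fa
  hisat_index: ${REF_DIR}/hg38_hisat2
EOF

# Warn about reference files that do not exist on the host yet.
for key in fasta gtf; do
    path=$(awk -v k="${key}:" '$1 == k {print $2}' config.yml)
    [ -f "$path" ] || echo "Missing ${key} file: ${path}" >&2
done
```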
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run CIRIquant on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 64
#SBATCH --job-name=ciriquant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ciriquant
CIRIquant -t 64 -1 SRR12095148_1.fastq -2 SRR12095148_2.fastq --config config.yml -o Output -p test
Clair3
Introduction
Clair3 is a germline small variant caller for long-reads. Clair3 makes the best of two major method categories: pileup calling handles most variant candidates with speed, and full-alignment tackles complicated candidates to maximize precision and recall. Clair3 runs fast and has superior performance, especially at lower coverage. Clair3 is simple and modular for easy deployment and integration.
Versions
0.1-r11
0.1-r12
Commands
run_clair3.sh
Module
You can load the modules by:
module load biocontainers
module load clair3
Model_path
Note
model_path
is in /opt/models/
. The parameter takes the form --model_path="/opt/models/MODEL_NAME"
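As a small illustration of the note above, the model path can be composed from a variable so that only the model name changes between runs. `MODEL_NAME` and `MODEL_PATH` are illustrative variable names, and `ont` is the value used in the example job:

```shell
# MODEL_NAME is a placeholder; replace it with the model you need.
MODEL_NAME="ont"
MODEL_PATH="/opt/models/${MODEL_NAME}"

# Print the flag as it would be passed to run_clair3.sh.
echo "--model_path=${MODEL_PATH}"
```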
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clair3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=clair3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clair3
run_clair3.sh \
--bam_fn=input.bam \
--ref_fn=ref.fasta \
--threads=12 \
--platform=ont \
--model_path="/opt/models/ont" \
--output=output
Clairvoyante
Introduction
Clairvoyante is a deep neural network based variant caller.
Versions
1.02
Commands
clairvoyante.py
Module
You can load the modules by:
module load biocontainers
module load clairvoyante
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clairvoyante on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clairvoyante
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clairvoyante
cd training
clairvoyante.py callVarBam \
--chkpnt_fn ../trainedModels/fullv3-illumina-novoalign-hg001+hg002-hg38/learningRate1e-3.epoch500 \
--bam_fn ../testingData/chr21/chr21.bam \
--ref_fn ../testingData/chr21/chr21.fa \
--bed_fn ../testingData/chr21/chr21.bed \
--call_fn chr21_calls.vcf \
--ctgName chr21
Clearcnv
Introduction
ClearCNV: CNV calling from NGS panel data in the presence of ambiguity and noise.
Versions
0.306
Commands
clearCNV
Module
You can load the modules by:
module load biocontainers
module load clearcnv
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clearcnv on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clearcnv
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clearcnv
Clever-toolkit
Introduction
Clever-toolkit is a collection of tools to discover and genotype structural variations in genomes from paired-end sequencing reads. The main software is written in C++ with some auxiliary scripts in Python.
Versions
2.4
Commands
clever
laser
bam-to-alignment-priors
split-priors-by-chromosome
clever-core
postprocess-predictions
evaluate-sv-predictions
split-reads
laser-core
laser-recalibrate
genotyper
insert-length-histogram
add-score-tags-to-bam
bam2fastq
remove-redundant-variations
precompute-distributions
extract-bad-reads
filter-variations
merge-to-vcf
multiline-to-xa
filter-bam
read-group-stats
Module
You can load the modules by:
module load biocontainers
module load clever-toolkit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clever-toolkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clever-toolkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clever-toolkit
cat mapped.bam | bam2fastq output_1.fq output_2.fq
Clonalframeml
Introduction
ClonalFrameML is a software package that performs efficient inference of recombination in bacterial genomes.
Versions
1.11
Commands
ClonalFrameML
Module
You can load the modules by:
module load biocontainers
module load clonalframeml
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clonalframeml on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clonalframeml
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clonalframeml
Clust
Introduction
Clust is a fully automated method for identification of clusters (groups) of genes that are consistently co-expressed (well-correlated) in one or more heterogeneous datasets from one or multiple species.
Versions
1.17.0
Commands
clust
Module
You can load the modules by:
module load biocontainers
module load clust
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clust on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clust
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clust
Clustalw
Introduction
Clustalw
is a general purpose multiple alignment program for DNA or proteins.
Versions
2.1
Commands
clustalw
Module
You can load the modules by:
module load biocontainers
module load clustalw
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Clustalw on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clustalw
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clustalw
clustalw -tree -align -infile=seq.faa
CNVkit
Introduction
CNVkit
is a command-line toolkit and Python library for detecting copy number variants and alterations genome-wide from high-throughput sequencing.
Versions
0.9.9-py
Commands
cnvkit.py
cnv_annotate.py
cnv_expression_correlate.py
cnv_updater.py
Module
You can load the modules by:
module load biocontainers
module load cnvkit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run CNVkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cnvkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cnvkit
cnvkit.py batch *Tumor.bam --normal *Normal.bam \
--targets my_baits.bed --fasta hg19.fasta \
--access data/access-5kb-mappable.hg19.bed \
--output-reference my_reference.cnn \
--output-dir example/
Cnvnator
Introduction
Cnvnator
is a tool for discovery and characterization of copy number variation (CNV) in population genome sequencing data.
Versions
0.4.1
Commands
cnvnator
cnvnator2VCF.pl
plotbaf.py
plotcircular.py
plotrdbaf.py
pytools.py
Module
You can load the modules by:
module load biocontainers
module load cnvnator
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cnvnator on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cnvnator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cnvnator
cnvnator -root file.root -tree file.bam -chrom $(seq 1 22) X Y
plotcircular.py file.root
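In the cnvnator command above, `$(seq 1 22) X Y` relies on shell word splitting to pass 24 chromosome names to `-chrom`. The sketch below shows what the shell expands it to. The `chr` prefix variant is an assumption for BAM files whose reference names look like `chr1`; check the names in your BAM header before using it.

```shell
# Expand the chromosome list the same way the command line does.
chroms=( $(seq 1 22) X Y )
echo "${#chroms[@]} chromosomes: ${chroms[*]}"

# Assumption: for "chr"-prefixed reference names, prepend the prefix
# to every element of the array.
prefixed=( "${chroms[@]/#/chr}" )
echo "${prefixed[*]}"

# The command would then read, for example:
#   cnvnator -root file.root -tree file.bam -chrom "${prefixed[@]}"
```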
Coinfinder
Introduction
Coinfinder is an algorithm and software tool that detects genes which associate and dissociate with other genes more often than expected by chance in pangenomes.
Versions
1.2.0
Commands
coinfinder
Module
You can load the modules by:
module load biocontainers
module load coinfinder
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run coinfinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=coinfinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers coinfinder
coinfinder -i coinfinder-manuscript/gene_presence_absence.csv \
-I -p coinfinder-manuscript/core-gps_fasttree.newick \
-o output
CONCOCT
Introduction
CONCOCT
: Clustering cONtigs with COverage and ComposiTion.
Detailed usage can be found here: https://github.com/BinPro/CONCOCT
Versions
1.1.0
Commands
concoct
concoct_refine
concoct_coverage_table.py
cut_up_fasta.py
extract_fasta_bins.py
merge_cutup_clustering.py
Module
You can load the modules by:
module load biocontainers
module load concoct/1.1.0-py38
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run concoct on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=concoct
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers concoct/1.1.0-py38
cut_up_fasta.py final.contigs.fa -c 10000 -o 0 --merge_last -b contigs_10K.bed > contigs_10K.fa
concoct_coverage_table.py contigs_10K.bed SRR1976948_sorted.bam > coverage_table.tsv
concoct --composition_file contigs_10K.fa --coverage_file coverage_table.tsv -b concoct_output/
Control-freec
Introduction
Control-freec
is a tool for detection of copy-number changes and allelic imbalances (including LOH) using deep-sequencing data.
Versions
11.6
Commands
freec
Module
You can load the modules by:
module load biocontainers
module load control-freec
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Control-freec on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=control-freec
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers control-freec
freec -conf config_chr19.txt

Cooler
Introduction
Cooler
is a support library for a sparse, compressed, binary persistent storage format, also called cooler, used to store genomic interaction data, such as Hi-C contact matrices.
Versions
0.8.11
Commands
cooler
python
python3
Module
You can load the modules by:
module load biocontainers
module load cooler
Interactive job
To run Cooler interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cooler
(base) UserID@bell-a008:~ $ python
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cooler
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cooler batch jobs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cooler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cooler
cooler info data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler info -f bin-size data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler info -m data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler tree data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler attrs data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
Coverm
Introduction
Coverm
is a configurable, easy to use and fast DNA read coverage and relative abundance calculator focused on metagenomics applications.
Versions
0.6.1
Commands
coverm
Module
You can load the modules by:
module load biocontainers
module load coverm
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Coverm on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=coverm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers coverm
coverm genome --genome-fasta-files xcc.fasta --coupled SRR11234553_1.fastq SRR11234553_2.fastq
Covgen
Introduction
Covgen creates a target specific exome_full192.coverage.txt file required by MutSig.
Versions
1.0.2
Commands
CovGen
Module
You can load the modules by:
module load biocontainers
module load covgen
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run covgen on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=covgen
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers covgen
Cramino
Introduction
Cramino is a tool for quick quality assessment of cram and bam files, intended for long read sequencing.
Versions
0.9.6
Commands
cramino
Module
You can load the modules by:
module load biocontainers
module load cramino
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cramino on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cramino
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cramino
CRISPRCasFinder
Introduction
CRISPRCasFinder
enables the easy detection of CRISPRs and cas genes in user-submitted sequence data. It is an updated, improved, and integrated version of CRISPRFinder and CasFinder.
Detailed usage can be found here: https://github.com/dcouvin/CRISPRCasFinder
Versions
4.2.20
Commands
CRISPRCasFinder.pl
Module
You can load the modules by:
module load biocontainers
module load crisprcasfinder/4.2.20
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run CRISPRCasFinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=CRISPRCasFinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers crisprcasfinder/4.2.20
CRISPRCasFinder.pl -in install_test/sequence.fasta -cas -cf CasFinder-2.0.3 -def G -keep
Crispresso2
Introduction
CRISPResso2 is a software pipeline designed to enable rapid and intuitive interpretation of genome editing experiments.
Versions
2.2.10
2.2.11a
2.2.8
2.2.9
Commands
CRISPResso
CRISPRessoAggregate
CRISPRessoBatch
CRISPRessoCompare
CRISPRessoPooled
CRISPRessoPooledWGSCompare
CRISPRessoWGS
Module
You can load the modules by:
module load biocontainers
module load crispresso2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run crispresso2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=crispresso2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers crispresso2
CRISPResso --fastq_r1 nhej.r1.fastq.gz --fastq_r2 nhej.r2.fastq.gz -n nhej --amplicon_seq \
AATGTCCCCCAATGGGAAGTTCATCTGGCACTGCCCACAGGTGAGGAGGTCATGATCCCCTTCTGGAGCTCCCAACGGGCCGTGGTCTGGTTCATCATCTGTAAGAATGGCTTCAAGAGGCTCGGCTGTGGTT
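Long amplicon sequences are easy to damage when copying them into a job script. A minimal sketch, assuming a plain A/C/G/T amplicon: store the sequence in a variable (`AMPLICON` is an illustrative name) and check it before it is passed to `--amplicon_seq`.

```shell
# Amplicon from the example above, kept in a variable.
AMPLICON="AATGTCCCCCAATGGGAAGTTCATCTGGCACTGCCCACAGGTGAGGAGGTCATGATCCCCTTCTGGAGCTCCCAACGGGCCGTGGTCTGGTTCATCATCTGTAAGAATGGCTTCAAGAGGCTCGGCTGTGGTT"

# Reject stray whitespace, line breaks, or other unexpected characters.
if [[ "$AMPLICON" =~ ^[ACGT]+$ ]]; then
    echo "Amplicon OK: ${#AMPLICON} bp"
else
    echo "Amplicon contains unexpected characters" >&2
fi

# The command would then read:
#   CRISPResso --fastq_r1 nhej.r1.fastq.gz --fastq_r2 nhej.r2.fastq.gz -n nhej --amplicon_seq "$AMPLICON"
```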
Crispritz
Introduction
Crispritz
is a software package containing 5 different tools dedicated to performing predictive analysis and result assessment on CRISPR/Cas experiments.
Versions
2.6.5
Commands
crispritz.py
Module
You can load the modules by:
module load biocontainers
module load crispritz
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Crispritz on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=crispritz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers crispritz
crispritz.py add-variants hg38_1000genomeproject_vcf/ hg38_ref/ &> output.redirect.out
crispritz.py index-genome hg38_ref hg38_ref/ 20bp-NGG-SpCas9.txt -bMax 2 &>> output.redirect.out
crispritz.py search hg38_ref/ 20bp-NGG-SpCas9.txt EMX1.sgRNA.txt emx1.hg38 -mm 4 -t -scores hg38_ref/ &>> output.redirect.out
crispritz.py search genome_library/NGG_2_hg38_ref/ 20bp-NGG-SpCas9.txt EMX1.sgRNA.txt emx1.hg38.bulges -index -mm 4 -bDNA 1 -bRNA 1 -t &>> output.redirect.out
crispritz.py annotate-results emx1.hg38.targets.txt hg38Annotation.bed emx1.hg38 &>> output.redirect.out
Crossmap
Introduction
Crossmap
is a program for genome coordinates conversion between different assemblies.
Versions
0.6.3
Commands
CrossMap.py
Module
You can load the modules by:
module load biocontainers
module load crossmap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Crossmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=crossmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers crossmap
CrossMap.py bed GRCh37_to_GRCh38.chain.gz test.bed
cross_match
Introduction
cross_match
is a general purpose utility for comparing any two DNA sequence sets using a ‘banded’ version of swat.
Versions
1.090518
Commands
cross_match
Module
You can load the modules by:
module load biocontainers
module load cross_match
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cross_match on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cross_match
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cross_match
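The job script above stops after loading the module. As a minimal sketch, a vector-screening call could follow it as shown here. The file names reads.fasta and vector.fasta are hypothetical, and the -minmatch, -minscore, and -screen options are taken from the upstream phrap/cross_match documentation, so please confirm them against the installed version before relying on them.

```shell
# Hypothetical inputs: replace with your own FASTA files.
QUERY="reads.fasta"
SUBJECT="vector.fasta"

# Run only when the command and both inputs are present, so the
# snippet fails softly outside a prepared job directory.
if command -v cross_match >/dev/null 2>&1 && [ -f "$QUERY" ] && [ -f "$SUBJECT" ]; then
    # -screen additionally writes a masked copy of the query to ${QUERY}.screen
    cross_match "$QUERY" "$SUBJECT" -minmatch 10 -minscore 20 -screen > cross_match.out
else
    echo "cross_match or input files not found; nothing was run" >&2
fi
```

The alignment report goes to cross_match.out, a name chosen here only for illustration.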
Csvkit
Introduction
csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.
Versions
1.1.1
Commands
csvclean
csvcut
csvformat
csvgrep
csvjoin
csvjson
csvlook
csvpy
csvsort
csvsql
csvstack
csvstat
in2csv
sql2csv
Module
You can load the modules by:
module load biocontainers
module load csvkit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run csvkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=csvkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers csvkit
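The job script above loads csvkit but does not call any of its commands. The following sketch shows how a few of the listed commands might be chained. The file samples.csv and its columns are made up for illustration, and the snippet assumes the csvkit commands are on the PATH after the module is loaded.

```shell
# Create a small, hypothetical CSV file to work with.
cat > samples.csv <<'EOF'
sample,tissue,reads
S1,liver,1200
S2,brain,950
S3,liver,1430
EOF

if command -v csvcut >/dev/null 2>&1; then
    # List the column names with their positions.
    csvcut -n samples.csv
    # Select rows whose tissue column matches "liver", keep two columns, and pretty-print.
    csvgrep -c tissue -m liver samples.csv | csvcut -c sample,reads | csvlook
else
    echo "csvkit commands not found; load the csvkit module first" >&2
fi
```

Because every csvkit command reads and writes CSV on standard input and output, the tools can be combined freely in a pipeline.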
Csvtk
Introduction
Csvtk
is a cross-platform, efficient and practical CSV/TSV toolkit.
Versions
0.23.0
0.25.0
Commands
csvtk
Module
You can load the modules by:
module load biocontainers
module load csvtk
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Csvtk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=csvtk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers csvtk
cat data.csv \
| csvtk summary --ignore-non-digits --fields f4:sum,f5:sum --groups f1,f2 \
| csvtk pretty
Cufflinks
Introduction
Cufflinks
assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.
Versions
2.2.1
Commands
cuffcompare
cuffdiff
cufflinks
cuffmerge
cuffnorm
cuffquant
gffread
gtf_to_sam
Module
You can load the modules by:
module load biocontainers
module load cufflinks
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cufflinks on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=cufflinks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cufflinks
cufflinks -p 8 -G transcript.gtf --library-type fr-unstranded -o cufflinks_output tophat_out/accepted_hits.bam
Cutadapt
Introduction
Cutadapt
finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
Versions
3.4
3.7
Commands
cutadapt
Module
You can load the modules by:
module load biocontainers
module load cutadapt
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cutadapt on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cutadapt
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cutadapt
cutadapt -a AACCGGTT -o output.fastq input.fastq
Cuttlefish
Introduction
Cuttlefish is a fast, parallel, and memory-efficient tool for constructing the compacted de Bruijn graph from sequencing reads or reference sequences. It is highly scalable in terms of the size of the input data.
Versions
2.1.1
Commands
cuttlefish
Module
You can load the modules by:
module load biocontainers
module load cuttlefish
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cuttlefish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cuttlefish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cuttlefish
Cyvcf2
Introduction
Cyvcf2
is a Cython wrapper around htslib built for fast parsing of Variant Call Format (VCF) files.
Versions
0.30.14
Commands
cyvcf2
python
python3
Module
You can load the modules by:
module load biocontainers
module load cyvcf2
Interactive job
To run Cyvcf2 interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n1 -t1:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cyvcf2
(base) UserID@bell-a008:~ $ python
Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cyvcf2 import VCF
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cyvcf2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cyvcf2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cyvcf2
cyvcf2 --help
# Usage: cyvcf2 [OPTIONS] <vcf_file>
Das_tool
Introduction
DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.
Versions
1.1.6
Commands
DAS_Tool
Contigs2Bin_to_Fasta.sh
Fasta_to_Contig2Bin.sh
get_species_taxids.sh
Module
You can load the modules by:
module load biocontainers
module load das_tool
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run das_tool on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=das_tool
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers das_tool
DAS_Tool -i sample.human.gut_concoct_contigs2bin.tsv,\
sample.human.gut_maxbin2_contigs2bin.tsv,\
sample.human.gut_metabat_contigs2bin.tsv,\
sample.human.gut_tetraESOM_contigs2bin.tsv \
-l concoct,maxbin,metabat,tetraESOM \
-c sample.human.gut_contigs.fa \
-o DASToolRun2 \
--proteins DASToolRun1_proteins.faa \
--write_bin_evals \
--threads 4 \
--score_threshold 0.6
Dbg2olc
Introduction
Dbg2olc
is used for efficient assembly of large genomes from long, error-prone reads produced by third-generation sequencing technologies.
Versions
20180222
20200723
Commands
AssemblyStatistics
DBG2OLC
RunSparcConsensus.txt
SelectLongestReads
SeqIO.py
Sparc
SparseAssembler
split_and_run_sparc.sh
split_and_run_sparc.sh.bak
split_reads_by_backbone.py
Module
You can load the modules by:
module load biocontainers
module load dbg2olc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Dbg2olc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dbg2olc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dbg2olc
SelectLongestReads sum 600000000 longest 0 o TEST.fq f SRR1976948.abundtrim.subset.pe.fq
Debreak
Introduction
Debreak is a structural variant (SV) caller for long-read single-molecule sequencing data.
Versions
1.3
Commands
debreak
Module
You can load the modules by:
module load biocontainers
module load debreak
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run debreak on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=debreak
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers debreak
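The job script above ends after the module is loaded. A basic call might look like the sketch below, which assumes the --bam and --outpath options described in the upstream DeBreak README. The BAM file name and output directory are hypothetical.

```shell
BAM="merged.sort.bam"   # hypothetical sorted long-read alignment
OUTDIR="debreak_out"    # hypothetical output directory

# Run only when the command and the input file are present.
if command -v debreak >/dev/null 2>&1 && [ -f "$BAM" ]; then
    debreak --bam "$BAM" --outpath "$OUTDIR/"
else
    echo "debreak or $BAM not found; nothing was run" >&2
fi
```

Running debreak --help on the cluster is the safest way to confirm which options the installed version accepts.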
Deconseq
Introduction
DeconSeq: DECONtamination of SEQuence data using a modified version of BWA-SW. The DeconSeq tool can be used to automatically detect and efficiently remove sequence contamination from genomic and metagenomic datasets. It is easily configurable and provides a user-friendly interface.
Versions
0.4.3
Commands
bwa64
deconseq.pl
splitFasta.pl
Module
You can load the modules by:
module load biocontainers
module load deconseq
Helper command
Note
Use DeconSeqConfig.pm
to specify the database information. In addition, the current deconseq
module in biocontainers requires you to copy the executables bwa64
, deconseq.pl
, and splitFasta.pl
to your current directory. This step only needs to be done once.
A helper command copy_DeconSeqConfig
is provided to copy the configuration file DeconSeqConfig.pm
and executables to your current directory. You just need to run the command copy_DeconSeqConfig
and modify DeconSeqConfig.pm
as needed:
copy_DeconSeqConfig
nano DeconSeqConfig.pm # modify database information as needed
For details on how to configure DeconSeqConfig.pm
, see the online manual (https://sourceforge.net/projects/deconseq/files/).
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run deconseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deconseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deconseq
bwa64 index -p hg38_db -a bwtsw Homo_sapiens.GRCh38.dna.fa
bwa64 index -p m39_db -a bwtsw GRCm38.p4.genome.fa
deconseq.pl -f input.fastq -dbs hg38_db -dbs_retain m39_db
Deepbgc
Introduction
Deepbgc
is a tool for biosynthetic gene cluster (BGC) detection and classification using deep learning.
Versions
0.1.26
0.1.30
Commands
deepbgc
Module
You can load the modules by:
module load biocontainers
module load deepbgc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Deepbgc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deepbgc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deepbgc
export DEEPBGC_DOWNLOADS_DIR=$PWD
deepbgc download
deepbgc pipeline genome.fa -o output
Deepconsensus
Introduction
DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.
Versions
0.2.0
Commands
deepconsensus
ccs
actc
Module
You can load the modules by:
module load biocontainers
module load deepconsensus
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run deepconsensus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deepconsensus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deepconsensus
deepconsensus run \
--subreads_to_ccs=subreads_to_ccs.bam \
--ccs_fasta=ccs.fasta \
--checkpoint=checkpoint-50 \
--output=output.fastq \
--batch_zmws=100
Deepsignal2
Introduction
Deepsignal2
is a deep-learning method for detecting DNA methylation state from Oxford Nanopore sequencing reads.
Versions
0.1.2
Commands
deepsignal2
call_modification_frequency.py
combine_call_mods_freq_files.py
combine_two_strands_frequency.py
concat_two_files.py
evaluate_mods_call.py
filter_samples_by_label.py
filter_samples_by_positions.py
gff_reader.py
randsel_file_rows.py
shuffle_a_big_file.py
split_freq_file_by_5mC_motif.py
txt_formater.py
Module
You can load the modules by:
module load biocontainers
module load deepsignal2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Deepsignal2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deepsignal2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deepsignal2
DeepTools
Introduction
DeepTools
is a collection of user-friendly tools for normalization and visualization of deep-sequencing data.
Versions
3.5.1-py
Commands
alignmentSieve
bamCompare
bamCoverage
bamPEFragmentSize
bigwigCompare
computeGCBias
computeMatrix
computeMatrixOperations
correctGCBias
deeptools
estimateReadFiltering
estimateScaleFactor
multiBamSummary
multiBigwigSummary
plotCorrelation
plotCoverage
plotEnrichment
plotFingerprint
plotHeatmap
plotPCA
plotProfile
Module
You can load the modules by:
module load biocontainers
module load deeptools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run DeepTools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=deeptools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deeptools
bamCoverage --normalizeUsing CPM -p 4 \
--effectiveGenomeSize 11000000 \
-b WT_coord_sorted.bam \
-o WT_coord_sorted.bw
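The thread count passed to -p should not exceed the number of tasks requested with #SBATCH -n. One way to keep the two in sync, sketched here under the assumption that Slurm exports SLURM_NTASKS inside the job, is to read the value from the environment instead of hard-coding it. The THREADS variable name is chosen for this example only.

```shell
# SLURM_NTASKS is set by Slurm for jobs submitted with -n; fall back to 1 elsewhere.
THREADS="${SLURM_NTASKS:-1}"
echo "bamCoverage will use ${THREADS} thread(s)"

# Run only when the command and the input BAM file are present.
if command -v bamCoverage >/dev/null 2>&1 && [ -f WT_coord_sorted.bam ]; then
    bamCoverage --normalizeUsing CPM -p "${THREADS}" \
        --effectiveGenomeSize 11000000 \
        -b WT_coord_sorted.bam \
        -o WT_coord_sorted.bw
fi
```

With this pattern, changing the -n value in the job header is enough to change the number of threads the tool uses.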
Deepvariant
Introduction
DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file.
Versions
1.0.0
1.1.0
Commands
call_variants
get-pip.py
make_examples
model_eval
model_train
postprocess_variants
run-prereq.sh
run_deepvariant
run_deepvariant.py
settings.sh
show_examples
vcf_stats_report
Module
You can load the modules by:
module load biocontainers
module load deepvariant
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run deepvariant on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=deepvariant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deepvariant
INPUT_DIR="${PWD}/quickstart-testdata"
DATA_HTTP_DIR="https://storage.googleapis.com/deepvariant/quickstart-testdata"
mkdir -p ${INPUT_DIR}
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/NA12878_S1.chr20.10_10p1mb.bam
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/NA12878_S1.chr20.10_10p1mb.bam.bai
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/test_nist.b37_chr20_100kbp_at_10mb.bed
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/test_nist.b37_chr20_100kbp_at_10mb.vcf.gz
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/test_nist.b37_chr20_100kbp_at_10mb.vcf.gz.tbi
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.fai
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.gz
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.gz.fai
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.gz.gzi
run_deepvariant --model_type=WGS \
--ref="${INPUT_DIR}"/ucsc.hg19.chr20.unittest.fasta \
--reads="${INPUT_DIR}"/NA12878_S1.chr20.10_10p1mb.bam \
--regions "chr20:10,000,000-10,010,000" \
--output_vcf="output/output.vcf.gz" \
--output_gvcf="output/output.g.vcf.gz" \
--intermediate_results_dir "output/intermediate_results_dir" \
--num_shards=4
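The ten wget calls in the job script differ only in the file name, so they can be expressed as a loop. The sketch below uses the same directory and URL as the script. DRY_RUN is a hypothetical switch added for this example: with its default value of 1 the commands are only printed, and with DRY_RUN=0 the files are actually downloaded.

```shell
INPUT_DIR="${PWD}/quickstart-testdata"
DATA_HTTP_DIR="https://storage.googleapis.com/deepvariant/quickstart-testdata"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = download
COUNT=0

for f in \
    NA12878_S1.chr20.10_10p1mb.bam \
    NA12878_S1.chr20.10_10p1mb.bam.bai \
    test_nist.b37_chr20_100kbp_at_10mb.bed \
    test_nist.b37_chr20_100kbp_at_10mb.vcf.gz \
    test_nist.b37_chr20_100kbp_at_10mb.vcf.gz.tbi \
    ucsc.hg19.chr20.unittest.fasta \
    ucsc.hg19.chr20.unittest.fasta.fai \
    ucsc.hg19.chr20.unittest.fasta.gz \
    ucsc.hg19.chr20.unittest.fasta.gz.fai \
    ucsc.hg19.chr20.unittest.fasta.gz.gzi
do
    if [ "$DRY_RUN" = "1" ]; then
        # Show what would be executed without touching the network.
        echo wget -P "${INPUT_DIR}" "${DATA_HTTP_DIR}/${f}"
    else
        mkdir -p "${INPUT_DIR}"
        wget -P "${INPUT_DIR}" "${DATA_HTTP_DIR}/${f}"
    fi
    COUNT=$((COUNT + 1))
done
echo "${COUNT} files processed"
```

Downloading on a front-end node before submitting the job may be preferable when compute nodes have restricted network access.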
Delly
Introduction
Delly
is an integrated structural variant (SV) prediction method that can discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data.
Versions
0.9.1
1.0.3
1.1.3
1.1.5
1.1.6
Commands
delly
Module
You can load the modules by:
module load biocontainers
module load delly
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Delly on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=delly
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers delly
delly call -x hg19.excl -o t1.bcf -g hg19.fa tumor1.bam control1.bam
delly filter -f somatic -o t1.pre.bcf -s samples.tsv t1.bcf
Dendropy
Introduction
DendroPy is a Python library for phylogenetic computing. It provides classes and functions for the simulation, processing, and manipulation of phylogenetic trees and character matrices, and supports the reading and writing of phylogenetic data in a range of formats, such as NEXUS, NEWICK, NeXML, Phylip, FASTA, etc. Application scripts for performing some useful phylogenetic operations, such as data conversion and tree posterior distribution summarization, are also distributed and installed as part of the library. DendroPy can thus function as a stand-alone library for phylogenetics, a component of more complex multi-library phyloinformatic pipelines, or as a scripting “glue” that assembles and drives such pipelines.
Versions
4.5.2
Commands
python
python3
sumtrees.py
Module
You can load the modules by:
module load biocontainers
module load dendropy
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run dendropy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dendropy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dendropy
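The job script above only loads the module. A quick sanity check that the library is usable could follow it, as in this sketch. It assumes that the python command provided by the module can import dendropy. The Newick string is a toy example, and DENDROPY_OK is a variable name chosen for illustration.

```shell
# Check whether the DendroPy library can be imported by the python on PATH.
if python -c "import dendropy" >/dev/null 2>&1; then
    DENDROPY_OK=yes
else
    DENDROPY_OK=no
fi

if [ "$DENDROPY_OK" = "yes" ]; then
    # Parse a tiny Newick tree and report the library version and the leaf count.
    python -c 'import dendropy; t = dendropy.Tree.get(data="((A,B),(C,D));", schema="newick"); print("DendroPy", dendropy.__version__); print(len(t.leaf_nodes()), "leaves")'
else
    echo "dendropy is not importable; load the dendropy module first" >&2
fi
```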
Diamond
Introduction
Diamond
is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data. The key features are:
Pairwise alignment of proteins and translated DNA at 100x-10,000x the speed of BLAST.
Frameshift alignments for long read analysis.
Low resource requirements and suitable for running on standard desktops or laptops.
Various output formats, including BLAST pairwise, tabular and XML, as well as taxonomic classification.
Details about its usage can be found here: https://github.com/bbuchfink/diamond
Versions
2.0.13
2.0.14
2.0.15
2.1.6
Commands
diamond makedb
diamond prepdb
diamond blastp
diamond blastx
diamond view
diamond version
diamond dbinfo
diamond help
diamond test
Module
You can load the modules by:
module load biocontainers
module load diamond/2.0.14
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run diamond on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=diamond
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers diamond/2.0.14
diamond makedb --in uniprot_sprot.fasta -d uniprot_sprot
diamond blastp -p 24 -q test.faa -d uniprot_sprot --very-sensitive -o blastp_output.txt
Dnaapler
Introduction
dnaapler is a simple python program that takes a single nucleotide input sequence (in FASTA format), finds the desired start gene using blastx against an amino acid sequence database, checks that the start codon of this gene is found, and if so, then reorients the chromosome to begin with this gene on the forward strand.
Versions
0.1.0
Commands
dnaapler
Module
You can load the modules by:
module load biocontainers
module load dnaapler
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run dnaapler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dnaapler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dnaapler
Dnaio
Introduction
Dnaio
is a Python 3.7+ library for very efficient parsing and writing of FASTQ and also FASTA files.
Versions
0.8.1
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load dnaio
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Dnaio on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dnaio
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dnaio
python dnaio_test.py
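The job script calls dnaio_test.py, which is not shown in this guide. As a hypothetical illustration, a script of that name could be created before submitting the job as sketched below. It counts reads and bases using dnaio.open, and the default input name reads.fastq is a placeholder.

```shell
# Write a hypothetical dnaio_test.py next to the job script.
cat > dnaio_test.py <<'EOF'
"""Count reads and bases in a FASTQ/FASTA file with dnaio."""
import sys

import dnaio

# The path comes from the first argument; "reads.fastq" is a placeholder default.
path = sys.argv[1] if len(sys.argv) > 1 else "reads.fastq"

n_reads = 0
n_bases = 0
with dnaio.open(path) as reader:
    for record in reader:
        n_reads += 1
        n_bases += len(record.sequence)

print(f"{path}: {n_reads} reads, {n_bases} bases")
EOF
```

Passing a file name, for example python dnaio_test.py sample.fastq.gz, overrides the placeholder default.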
Dragonflye
Introduction
Dragonflye is a pipeline that aims to make assembling Oxford Nanopore reads quick and easy.
Versions
1.0.13
1.0.14
Commands
dragonflye
Module
You can load the modules by:
module load biocontainers
module load dragonflye
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run dragonflye on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=dragonflye
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dragonflye
dragonflye --cpus 8 \
--outdir output \
--reads SRR18498195.fastq
Drep
Introduction
Drep
is a python program for rapidly comparing large numbers of genomes.
Versions
3.2.2
Commands
dRep
Module
You can load the modules by:
module load biocontainers
module load drep
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Drep on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=drep
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers drep
dRep compare compare_out -g tests/genomes/*
dRep dereplicate dereplicate_out -g tests/genomes/*
Dropest
Introduction
Dropest
is a pipeline for initial analysis of droplet-based single-cell RNA-seq data.
Versions
0.8.6
Commands
dropest
droptag
dropReport.Rsc
R
Rscript
Module
You can load the modules by:
module load biocontainers
module load dropest
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Dropest on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dropest
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dropest
dropest -f -c 10x.xml -C 1200 neurons_900_possorted_genome_bam.bam
Drop-seq
Introduction
Drop-seq is a set of Java tools for analyzing Drop-seq data.
Versions
2.5.2
Commands
AssignCellsToSamples
BamTagHistogram
BamTagOfTagCounts
BaseDistributionAtReadPosition
BipartiteRabiesVirusCollapse
CensusSeq
CollapseBarcodesInPlace
CollapseTagWithContext
CompareDropSeqAlignments
ComputeUMISharing
ConvertTagToReadGroup
ConvertToRefFlat
CountUnmatchedSampleIndices
CreateIntervalsFiles
CreateMetaCells
CreateSnpIntervalFromVcf
CsiAnalysis
DetectBeadSubstitutionErrors
DetectBeadSynthesisErrors
DetectDoublets
DigitalExpression
DownsampleBamByTag
DownsampleTranscriptsAndQuantiles
Drop-seq_Alignment_Cookbook.pdf
Drop-seq_alignment.sh
FilterBam
FilterBamByGeneFunction
FilterBamByTag
FilterDge
FilterGtf
FilterValidRabiesBarcodes
GatherGeneGCLength
GatherMolecularBarcodeDistributionByGene
GatherReadQualityMetrics
GenotypeSperm
MaskReferenceSequence
MergeDgeSparse
PolyATrimmer
ReduceGtf
RollCall
SelectCellsByNumTranscripts
SignTest
SingleCellRnaSeqMetricsCollector
SpermSeqMarkDuplicates
SplitBamByCell
TagBam
TagBamWithReadSequenceExtended
TagReadWithGeneExonFunction
TagReadWithGeneFunction
TagReadWithInterval
TagReadWithRabiesBarcodes
TrimStartingSequence
ValidateAlignedSam
ValidateReference
create_Drop-seq_reference_metadata.sh
Module
You can load the modules by:
module load biocontainers
module load drop-seq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run drop-seq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=drop-seq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers drop-seq
Dsuite
Introduction
Dsuite
is a fast C++ implementation, allowing genome-scale calculations of the D and f4-ratio statistics across all combinations of tens or hundreds of populations or species directly from a variant call format (VCF) file.
Versions
0.4.r43
0.5.r44
Commands
Dsuite
dtools.py
DtriosParallel
Module
You can load the modules by:
module load biocontainers
module load dsuite
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Dsuite on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=dsuite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dsuite
Dsuite Dtrios -c -n no_geneflow -t simulated_tree_no_geneflow.nwk chr1_no_geneflow.vcf.gz species_sets.txt
easySFS
Introduction
easySFS
is a tool for the effective selection of population size projection for construction of the site frequency spectrum.
Versions
1.0
Commands
easySFS.py
Module
You can load the modules by:
module load biocontainers
module load easysfs
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run easySFS on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=easysfs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers easysfs
easySFS.py -i example_files/wcs_1200.vcf -p example_files/wcs_pops.txt --preview -a
easySFS.py -i example_files/wcs_1200.vcf -p example_files/wcs_pops.txt -a --proj=7,7
Edta
Introduction
Edta
is developed for automated whole-genome de novo transposable element (TE) annotation and for benchmarking the annotation performance of TE libraries.
- Note: To run EDTA, call the command like this:
EDTA.pl [OPTIONS]
Do NOT call it as ‘perl EDTA.pl’.
Versions
1.9.6
2.0.0
Commands
EDTA.pl
EDTA_processI.pl
EDTA_raw.pl
FET.pl
bdf2gdfont.pl
buildRMLibFromEMBL.pl
buildSummary.pl
calcDivergenceFromAlign.pl
cd-hit-2d-para.pl
cd-hit-clstr_2_blm8.pl
cd-hit-div.pl
cd-hit-para.pl
check_result.pl
clstr2tree.pl
clstr2txt.pl
clstr2xml.pl
clstr_cut.pl
clstr_list.pl
clstr_list_sort.pl
clstr_merge.pl
clstr_merge_noorder.pl
clstr_quality_eval.pl
clstr_quality_eval_by_link.pl
clstr_reduce.pl
clstr_renumber.pl
clstr_rep.pl
clstr_reps_faa_rev.pl
clstr_rev.pl
clstr_select.pl
clstr_select_rep.pl
clstr_size_histogram.pl
clstr_size_stat.pl
clstr_sort_by.pl
clstr_sort_prot_by.pl
clstr_sql_tbl.pl
clstr_sql_tbl_sort.pl
convert_MGEScan3.0.pl
convert_ltr_struc.pl
convert_ltrdetector.pl
createRepeatLandscape.pl
down_tRNA.pl
dupliconToSVG.pl
filter_rt.pl
genome_plot.pl
genome_plot2.pl
genome_plot_svg.pl
getRepeatMaskerBatch.pl
legacy_blast.pl
lib-test.pl
make_multi_seq.pl
maskFile.pl
plot_2d.pl
plot_len1.pl
rmOut2Fasta.pl
rmOutToGFF3.pl
rmToUCSCTables.pl
update_blastdb.pl
viewMSA.pl
wublastToCrossmatch.pl
Module
You can load the modules by:
module load biocontainers
module load edta
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Edta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 10
#SBATCH --job-name=edta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers edta
EDTA.pl --genome genome.fa --cds genome.cds.fa --curatedlib EDTA/database/rice6.9.5.liban --exclude genome.exclude.bed --overwrite 1 --sensitive 1 --anno 1 --evaluate 1 --threads 10
Eggnog-mapper
Introduction
Eggnog-mapper
is a tool for fast functional annotation of novel sequences.
Versions
2.1.7
Commands
create_dbs.py
download_eggnog_data.py
emapper.py
hmm_mapper.py
hmm_server.py
hmm_worker.py
vba_extract.py
Module
You can load the modules by:
module load biocontainers
module load eggnog-mapper
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Eggnog-mapper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=eggnog-mapper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers eggnog-mapper
emapper.py -i proteins.faa --cpu 24 -o protein.out
emapper.py -m diamond --itype CDS -i cDNA.fasta -o cdna.out --cpu 24
Emboss
Introduction
Emboss
is “The European Molecular Biology Open Software Suite”.
Versions
6.6.0
Commands
aaindexextract
abiview
acdc
acdgalaxy
acdlog
acdpretty
acdtable
acdtrace
acdvalid
aligncopy
aligncopypair
antigenic
assemblyget
backtranambig
backtranseq
banana
biosed
btwisted
cachedas
cachedbfetch
cacheebeyesearch
cacheensembl
cai
chaos
charge
checktrans
chips
cirdna
codcmp
codcopy
coderet
compseq
cons
consambig
cpgplot
cpgreport
cusp
cutgextract
cutseq
dan
dbiblast
dbifasta
dbiflat
dbigcg
dbtell
dbxcompress
dbxedam
dbxfasta
dbxflat
dbxgcg
dbxobo
dbxreport
dbxresource
dbxstat
dbxtax
dbxuncompress
degapseq
density
descseq
diffseq
distmat
dotmatcher
dotpath
dottup
dreg
drfinddata
drfindformat
drfindid
drfindresource
drget
drtext
edamdef
edamhasinput
edamhasoutput
edamisformat
edamisid
edamname
edialign
einverted
embossdata
embossupdate
embossversion
emma
emowse
entret
epestfind
eprimer3
eprimer32
equicktandem
est2genome
etandem
extractalign
extractfeat
extractseq
featcopy
featmerge
featreport
feattext
findkm
freak
fuzznuc
fuzzpro
fuzztran
garnier
geecee
getorf
godef
goname
helixturnhelix
hmoment
iep
infoalign
infoassembly
infobase
inforesidue
infoseq
isochore
jaspextract
jaspscan
jembossctl
lindna
listor
makenucseq
makeprotseq
marscan
maskambignuc
maskambigprot
maskfeat
maskseq
matcher
megamerger
merger
msbar
mwcontam
mwfilter
needle
needleall
newcpgreport
newcpgseek
newseq
nohtml
noreturn
nospace
notab
notseq
nthseq
nthseqset
octanol
oddcomp
ontocount
ontoget
ontogetcommon
ontogetdown
ontogetobsolete
ontogetroot
ontogetsibs
ontogetup
ontoisobsolete
ontotext
palindrome
pasteseq
patmatdb
patmatmotifs
pepcoil
pepdigest
pepinfo
pepnet
pepstats
pepwheel
pepwindow
pepwindowall
plotcon
plotorf
polydot
preg
prettyplot
prettyseq
primersearch
printsextract
profit
prophecy
prophet
prosextract
pscan
psiphi
rebaseextract
recoder
redata
refseqget
remap
restover
restrict
revseq
runJemboss.sh
seealso
seqcount
seqmatchall
seqret
seqretsetall
seqretsplit
seqxref
seqxrefget
servertell
showalign
showdb
showfeat
showorf
showpep
showseq
showserver
shuffleseq
sigcleave
silent
sirna
sixpack
sizeseq
skipredundant
skipseq
splitsource
splitter
stretcher
stssearch
supermatcher
syco
taxget
taxgetdown
taxgetrank
taxgetspecies
taxgetup
tcode
textget
textsearch
tfextract
tfm
tfscan
tmap
tranalign
transeq
trimest
trimseq
trimspace
twofeat
union
urlget
variationget
vectorstrip
water
whichdb
wobble
wordcount
wordfinder
wordmatch
wossdata
wossinput
wossname
wossoperation
wossoutput
wossparam
wosstopic
xmlget
xmltext
yank
Module
You can load the modules by:
module load biocontainers
module load emboss
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Emboss on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=emboss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers emboss
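The job script above ends after loading the module. As a minimal sketch, lines like the following could be appended to it to run a global pairwise alignment with needle (one of the commands listed above). The file names seqA.fasta and seqB.fasta are hypothetical placeholders, and the gap penalties are only illustrative values.

```shell
# Hypothetical input files; replace with your own sequences.
seq_a=seqA.fasta
seq_b=seqB.fasta
# Name the output after both inputs, e.g. seqA_vs_seqB.needle
out="${seq_a%.fasta}_vs_${seq_b%.fasta}.needle"

# Run only when the emboss module has made needle available.
if command -v needle >/dev/null 2>&1; then
    # Needleman-Wunsch global alignment with explicit gap penalties.
    needle -asequence "$seq_a" -bsequence "$seq_b" \
           -gapopen 10 -gapextend 0.5 -outfile "$out"
else
    echo "needle not found; load the emboss module first" >&2
fi
```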
Ensembl-vep
Introduction
Ensembl-vep (Ensembl Variant Effect Predictor) predicts the functional effects of genomic variants.
Versions
106.1
107.0
108.2
Commands
vep
haplo
variant_recoder
Module
You can load the modules by:
module load biocontainers
module load ensembl-vep
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ensembl-vep on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ensembl-vep
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ensembl-vep
haplo -i bos_taurus_UMD3.1.vcf -o out.txt
Epic2
Introduction
Epic2
is an ultraperformant ChIP-Seq broad domain finder based on SICER.
Versions
0.0.51
0.0.52
Commands
epic2
epic2-bw
epic2-df
Module
You can load the modules by:
module load biocontainers
module load epic2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Epic2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=epic2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers epic2
epic2 -t /examples/test.bed.gz \
-c /examples/control.bed.gz \
> deleteme.txt
Evidencemodeler
Introduction
Evidencemodeler
is a software tool that combines ab initio gene predictions with protein and transcript alignments into weighted consensus gene structures.
Versions
1.1.1
Commands
evidence_modeler.pl
BPbtab.pl
EVMLite.pl
EVM_to_GFF3.pl
convert_EVM_outputs_to_GFF3.pl
create_weights_file.pl
execute_EVM_commands.pl
extract_complete_proteins.pl
gff3_file_to_proteins.pl
gff3_gene_prediction_file_validator.pl
gff_range_retriever.pl
partition_EVM_inputs.pl
recombine_EVM_partial_outputs.pl
summarize_btab_tophits.pl
write_EVM_commands.pl
Module
You can load the modules by:
module load biocontainers
module load evidencemodeler
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Evidencemodeler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=evidencemodeler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers evidencemodeler
evidence_modeler.pl --genome genome.fasta \
--weights weights.txt \
--gene_predictions gene_predictions.gff3 \
--protein_alignments protein_alignments.gff3 \
--transcript_alignments transcript_alignments.gff3 \
> evm.out
Exonerate
Introduction
Exonerate
is a generic tool for pairwise sequence comparison/alignment.
Versions
2.4.0
Commands
exonerate
Module
You can load the modules by:
module load biocontainers
module load exonerate
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Exonerate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=exonerate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers exonerate
exonerate -m genome2genome cms.fasta cmm.fasta > cm_vs_cs.out
Expansionhunter
Introduction
Expansion Hunter: a tool for estimating repeat sizes.
Versions
4.0.2
Commands
ExpansionHunter
Module
You can load the modules by:
module load biocontainers
module load expansionhunter
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run expansionhunter on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=expansionhunter
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers expansionhunter
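The job script above ends after loading the module. A sketch of a command that could follow it is shown below. All file names are hypothetical placeholders; the variant catalog is a JSON file describing the repeat loci to genotype, which you need to supply yourself.

```shell
# Hypothetical inputs; replace with your own files.
reads=sample.bam                 # aligned reads (indexed)
reference=reference.fa           # FASTA the reads were aligned to
catalog=variant_catalog.json     # repeat loci to genotype
prefix="${reads%.bam}_repeats"   # prefix for the output files

# Run only when the expansionhunter module has been loaded.
if command -v ExpansionHunter >/dev/null 2>&1; then
    ExpansionHunter --reads "$reads" \
                    --reference "$reference" \
                    --variant-catalog "$catalog" \
                    --output-prefix "$prefix"
else
    echo "ExpansionHunter not found; load the expansionhunter module first" >&2
fi
```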
Fasta3
Introduction
Fasta3
is a suite of programs for searching nucleotide or protein databases with a query sequence.
Versions
36.3.8
Commands
fasta36
fastf36
fastm36
fasts36
fastx36
fasty36
ggsearch36
glsearch36
lalign36
ssearch36
tfastf36
tfastm36
tfasts36
tfastx36
tfasty36
Module
You can load the modules by:
module load biocontainers
module load fasta3
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fasta3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fasta3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fasta3
fasta36 input.fasta genome.fasta
FastANI
Introduction
FastANI
is developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI).
Versions
1.32
1.33
Commands
fastANI
Module
You can load the modules by:
module load biocontainers
module load fastani
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run FastANI on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastani
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastani
fastANI -q cmm.fasta -r cms.fasta -o cm_cs_out
fastANI -q cmm.fasta -r cms.fasta --visualize -o cm_cs_visualize_out
Fastp
Introduction
Fastp
is an ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging, etc).
Versions
0.20.1
0.23.2
Commands
fastp
Module
You can load the modules by:
module load biocontainers
module load fastp
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fastp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastp
fastp -i input_1.fastq -I input_2.fastq -o out.R1.fq.gz -O out.R2.fq.gz
FastQC
Introduction
FastQC
aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.
Versions
0.11.9
Commands
fastqc
Module
You can load the modules by:
module load biocontainers
module load fastqc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run FastQC on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=fastqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastqc
mkdir -p fastqc_out  # fastqc does not create the output directory itself
fastqc -o fastqc_out -t 4 FASTQ1 FASTQ2
Fastq_pair
Introduction
Fastq_pair
is used to match up paired end fastq files quickly and efficiently.
Versions
1.0
Commands
fastq_pair
Module
You can load the modules by:
module load biocontainers
module load fastq_pair
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fastq_pair on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastq_pair
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastq_pair
fastq_pair seq_1.fastq seq_2.fastq
Fastq-scan
Introduction
Fastq-scan reads a FASTQ from STDIN and outputs summary statistics (read lengths, per-read qualities, per-base qualities) in JSON format.
Versions
1.0.0
Commands
fastq-scan
Module
You can load the modules by:
module load biocontainers
module load fastq-scan
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fastq-scan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastq-scan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastq-scan
cat example-q33.fq | fastq-scan -g 150000
Fastspar
Introduction
Fastspar
is a tool for rapid and scalable correlation estimation for compositional data.
Versions
1.0.0
Commands
fastspar
fastspar_bootstrap
fastspar_pvalues
fastspar_reduce
Module
You can load the modules by:
module load biocontainers
module load fastspar
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fastspar on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastspar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastspar
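The job script above ends after loading the module. As a hedged sketch, a basic correlation run could be appended as follows. The OTU table name is a hypothetical placeholder, and the output names are simply derived from it.

```shell
# Hypothetical tab-separated OTU count table; replace with your own.
otu_table=otu_table.tsv
# Derive output names from the input name.
corr="${otu_table%.tsv}_correlation.tsv"
cov="${otu_table%.tsv}_covariance.tsv"

# Run only when the fastspar module has been loaded.
if command -v fastspar >/dev/null 2>&1; then
    fastspar --otu_table "$otu_table" \
             --correlation "$corr" \
             --covariance "$cov"
else
    echo "fastspar not found; load the fastspar module first" >&2
fi
```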
fastStructure
Introduction
fastStructure
is an algorithm for inferring population structure from large SNP genotype data. It is based on a variational Bayesian framework for posterior inference and is written in Python2.x.
Note: the programs “structure.py”, “chooseK.py” and “distruct.py” are standalone executables and should be called directly by name (“structure.py”, etc.). Do NOT invoke them as “python structure.py” or as “python /usr/local/bin/structure.py”; this will not work!
Note: This container lacks X11 libraries, so GUI plots with ‘distruct.py’ do not work. Instead, tell the underlying Matplotlib to use a non-interactive plotting backend that writes to a file. The easiest and most flexible way is to set the MPLBACKEND environment variable: env MPLBACKEND="svg" distruct.py --output myplot.svg ...
- Available backends in this container:
Backend   Filetypes   Description
agg       png         raster graphics, high quality PNG output
ps        ps eps      vector graphics, Postscript output
pdf       pdf         vector graphics, Portable Document Format
svg       svg         vector graphics, Scalable Vector Graphics
Default MPLBACKEND="agg" (for PNG format output).
Versions
1.0-py27
Commands
structure.py
chooseK.py
distruct.py
Module
You can load the modules by:
module load biocontainers
module load faststructure
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fastStructure on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=faststructure
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers faststructure
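The job script above ends after loading the module. The sketch below shows lines that could follow it, calling the three programs by name as the notes above require and setting MPLBACKEND so that the plot is written to a file. The input prefix, the range of K, and the K chosen for plotting are hypothetical; adjust them to your data.

```shell
# Hypothetical PLINK prefix (genotypes.bed/.bim/.fam); replace with your own.
input=genotypes
output="${input}_fs"
plot_k=3
plot="${output}_K${plot_k}.svg"

# The container has no X11, so select a file-writing Matplotlib backend.
export MPLBACKEND=svg

# Run only when the faststructure module has been loaded.
if command -v structure.py >/dev/null 2>&1; then
    # Fit one model per K, then compare them and plot one of the fits.
    for k in 1 2 3 4 5; do
        structure.py -K "$k" --input="$input" --output="$output"
    done
    chooseK.py --input="$output"
    distruct.py -K "$plot_k" --input="$output" --output="$plot"
else
    echo "structure.py not found; load the faststructure module first" >&2
fi
```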
FastTree
Introduction
FastTree
infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory.
Detailed usage can be found here: http://www.microbesonline.org/fasttree/
Versions
2.1.10
2.1.11
Commands
fasttree
FastTree
FastTreeMP
Note
fasttree
and FastTree
are the same program, and they only support one CPU. If you want to use multiple CPUs, please use FastTreeMP
and also set the OMP_NUM_THREADS
to the number of cores you requested.
Module
You can load the modules by:
module load biocontainers
module load fasttree
Example job using single CPU
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run FastTree on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fasttree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fasttree
FastTree alignmentfile > treefile
Example job using multiple CPUs
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run FastTree on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=FastTreeMP
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fasttree
export OMP_NUM_THREADS=24
FastTreeMP alignmentfile > treefile
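In the script above the thread count is written twice (in -n and in OMP_NUM_THREADS), so the two can drift apart when one is edited. A possible variant, sketched here under the assumption that Slurm sets SLURM_NTASKS for a job submitted with -n, takes the value from the environment instead:

```shell
# Use the task count granted by Slurm (-n); fall back to 1 outside a job.
export OMP_NUM_THREADS="${SLURM_NTASKS:-1}"
echo "FastTreeMP will use ${OMP_NUM_THREADS} thread(s)"

# Run only when the fasttree module has been loaded.
if command -v FastTreeMP >/dev/null 2>&1; then
    FastTreeMP alignmentfile > treefile
else
    echo "FastTreeMP not found; load the fasttree module first" >&2
fi
```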
FASTX-Toolkit
Introduction
FASTX-Toolkit
is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
Versions
0.0.14
Commands
fasta_clipping_histogram.pl
fasta_formatter
fasta_nucleotide_changer
fastq_masker
fastq_quality_boxplot_graph.sh
fastq_quality_converter
fastq_quality_filter
fastq_quality_trimmer
fastq_to_fasta
fastx_artifacts_filter
fastx_barcode_splitter.pl
fastx_clipper
fastx_collapser
fastx_nucleotide_distribution_graph.sh
fastx_nucleotide_distribution_line_graph.sh
fastx_quality_stats
fastx_renamer
fastx_reverse_complement
fastx_trimmer
fastx_uncollapser
Module
You can load the modules by:
module load biocontainers
module load fastx_toolkit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run FASTX-Toolkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastx_toolkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastx_toolkit
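The job script above ends after loading the module. As a sketch, the following lines could be appended to quality-filter a FASTQ file and convert the result to FASTA, using two of the commands listed above. The file names and thresholds are hypothetical. The -Q33 option assumes Phred+33 quality encoding, which is what current Illumina data uses; drop it for older Phred+64 files.

```shell
# Hypothetical input; replace with your own reads.
in_fastq=reads.fastq
filtered="${in_fastq%.fastq}.q20.fastq"
fasta="${filtered%.fastq}.fasta"

# Run only when the fastx_toolkit module has been loaded.
if command -v fastq_quality_filter >/dev/null 2>&1; then
    # Keep reads in which at least 80% of the bases have quality >= 20.
    fastq_quality_filter -Q33 -q 20 -p 80 -i "$in_fastq" -o "$filtered"
    # Convert the filtered reads to FASTA.
    fastq_to_fasta -Q33 -i "$filtered" -o "$fasta"
else
    echo "fastq_quality_filter not found; load the fastx_toolkit module first" >&2
fi
```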
Filtlong
Introduction
Filtlong
is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter.
Versions
0.2.1
Commands
filtlong
Module
You can load the modules by:
module load biocontainers
module load filtlong
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Filtlong on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=filtlong
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers filtlong
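The job script above ends after loading the module. A sketch of a filtering step that could follow it is given below. The input name and both thresholds are hypothetical; filtlong writes the surviving reads to standard output, so they are compressed with gzip here.

```shell
# Hypothetical long-read file; replace with your own.
reads=input.fastq.gz
out="${reads%.fastq.gz}.filtered.fastq.gz"

# Run only when the filtlong module has been loaded.
if command -v filtlong >/dev/null 2>&1; then
    # Drop reads shorter than 1 kbp, then keep the best 90% of the bases.
    filtlong --min_length 1000 --keep_percent 90 "$reads" | gzip > "$out"
else
    echo "filtlong not found; load the filtlong module first" >&2
fi
```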
Flye
Introduction
Flye
is a fast and accurate de novo assembler for single molecule sequencing reads.
Versions
2.9
2.9.1
2.9.2
Commands
flye
Module
You can load the modules by:
module load biocontainers
module load flye
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Flye on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=flye
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers flye
flye --pacbio-raw E.coli_PacBio_40x.fasta --out-dir out_pacbio --threads 12
flye --nano-raw Loman_E.coli_MAP006-1_2D_50x.fasta --out-dir out_nano --threads 12
Fq
Introduction
Fq is a command line utility for manipulating Illumina-generated FastQ files.
Versions
0.10.0
Commands
fq
Module
You can load the modules by:
module load biocontainers
module load fq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fq
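The job script above ends after loading the module. As a minimal sketch, a validation of a paired-end FASTQ pair with the lint subcommand could be appended like this. The sample name and file naming scheme are hypothetical.

```shell
# Hypothetical paired-end files; replace with your own.
sample=sample
r1="${sample}_R1.fastq.gz"
r2="${sample}_R2.fastq.gz"

# Run only when the fq module has been loaded.
if command -v fq >/dev/null 2>&1; then
    # Validate both files and check that the read pairs match up.
    fq lint "$r1" "$r2"
else
    echo "fq not found; load the fq module first" >&2
fi
```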
Fraggenescan
Introduction
Fraggenescan
is an application for finding (fragmented) genes in short reads. It can also be applied to predict prokaryotic genes in incomplete assemblies or complete genomes.
Versions
1.31
Commands
FragGeneScan
run_FragGeneScan.pl
Module
You can load the modules by:
module load biocontainers
module load fraggenescan
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fraggenescan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fraggenescan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fraggenescan
run_FragGeneScan.pl -genome=example/NC_000913-454.fna -out=example/NC_000913-454 -complete=0 -train=454_10
Fraggenescanrs
Introduction
FragGeneScanRs is a better and faster Rust implementation of the FragGeneScan gene prediction model for short and error-prone reads. Its command line interface is backward compatible and adds extra features for more flexible usage. Compared to the original C implementation, shotgun metagenomic reads are processed up to 22 times faster using a single thread, with better scaling for multithreaded execution.
Versions
1.1.0
Commands
FragGeneScanRs
Module
You can load the modules by:
module load biocontainers
module load fraggenescanrs
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fraggenescanrs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fraggenescanrs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fraggenescanrs
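The job script above ends after loading the module. A sketch of a command that could follow it is shown below: FragGeneScanRs reads nucleotide sequences from standard input and writes predicted proteins to standard output, with -t selecting the training model. The input path is a placeholder modelled on the upstream example data and is not guaranteed to exist on our clusters.

```shell
# Placeholder input (454 reads) and the matching training model.
train=454_10
in_fna=example/NC_000913-454.fna
out_faa="${in_fna%.fna}.faa"

# Run only when the fraggenescanrs module has been loaded.
if command -v FragGeneScanRs >/dev/null 2>&1; then
    FragGeneScanRs -t "$train" < "$in_fna" > "$out_faa"
else
    echo "FragGeneScanRs not found; load the fraggenescanrs module first" >&2
fi
```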
Freebayes
Introduction
Freebayes
is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.
Versions
1.3.5
1.3.6
Commands
freebayes
freebayes-parallel
Module
You can load the modules by:
module load biocontainers
module load freebayes
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Freebayes on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=freebayes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers freebayes
freebayes -f ref.fa aln.cram >var.vcf
Freyja
Introduction
Freyja is a tool to recover relative lineage abundances from mixed SARS-CoV-2 samples from a sequencing dataset (BAM aligned to the Hu-1 reference). The method uses lineage-determining mutational “barcodes” derived from the UShER global phylogenetic tree as a basis set to solve the constrained (unit sum, non-negative) de-mixing problem.
Versions
1.3.11
1.4.2
Commands
freyja
Module
You can load the modules by:
module load biocontainers
module load freyja
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run freyja on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=freyja
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers freyja
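The job script above ends after loading the module. As a hedged sketch, the usual two-step workflow (variant calling, then de-mixing) could be appended as follows. The sample and reference file names are hypothetical; the BAM must be aligned to the same reference that is passed with --ref.

```shell
# Hypothetical inputs; replace with your own files.
sample=sample
bam="${sample}.bam"
ref=reference.fasta
variants="${sample}_variants.tsv"
depths="${sample}_depths.tsv"

# Run only when the freyja module has been loaded.
if command -v freyja >/dev/null 2>&1; then
    # Step 1: call variants and record per-position depths.
    freyja variants "$bam" --variants "$variants" --depths "$depths" --ref "$ref"
    # Step 2: estimate relative lineage abundances.
    freyja demix "$variants" "$depths" --output "${sample}_demix.tsv"
else
    echo "freyja not found; load the freyja module first" >&2
fi
```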
Fseq
Introduction
Fseq
is a feature density estimator for high-throughput sequence tags.
Versions
2.0.3
Commands
fseq2
Module
You can load the modules by:
module load biocontainers
module load fseq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fseq
Ftp
Introduction
A File Transfer Protocol client (FTP client) is a software utility that establishes a connection between a host computer and a remote server, typically an FTP server.
Versions
0.17
Commands
ftp
Module
You can load the modules by:
module load biocontainers
module load ftp
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ftp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ftp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ftp
Funannotate
Introduction
Funannotate
is a genome prediction, annotation, and comparison software package.
Versions
1.8.10
1.8.13
Commands
funannotate
Module
You can load the modules by:
module load biocontainers
module load funannotate
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Funannotate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=funannotate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers funannotate
funannotate clean -i genome.fa -o genome_cleaned.fa
funannotate sort -i genome_cleaned.fa -o genome_cleaned_sorted.fa
funannotate predict -i genome_cleaned_sorted.fa -o predict_out --species "arabidopsis" --rna_bam RNAseq.bam --cpus 12
Fwdpy11
Introduction
Fwdpy11 is a Python package for forward-time population genetic simulation.
Versions
0.18.1
Commands
python3
python
Module
You can load the modules by:
module load biocontainers
module load fwdpy11
Interactive job
To run fwdpy11 interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers fwdpy11
(base) UserID@bell-a008:~ $ python
Python 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fwdpy11
>>> pop = fwdpy11.DiploidPopulation(100, 1000.0)
>>> print(f"N = {pop.N}, L = {pop.tables.genome_length}")
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fwdpy11 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fwdpy11
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fwdpy11
python script.py
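The batch script above expects a script.py in the working directory. As an illustration only, the lines below create one from the same statements used in the interactive session and then run it; the population size and genome length are the values from that session, not recommendations.

```shell
# Write a small script.py reusing the statements from the interactive session.
cat > script.py <<'EOF'
import fwdpy11

# 100 diploid individuals, genome length 1000.0
pop = fwdpy11.DiploidPopulation(100, 1000.0)
print(f"N = {pop.N}, L = {pop.tables.genome_length}")
EOF

# Run it only when python can import fwdpy11 (i.e. the module is loaded).
if command -v python >/dev/null 2>&1 && python -c 'import fwdpy11' 2>/dev/null; then
    python script.py
else
    echo "fwdpy11 not available; load the fwdpy11 module first" >&2
fi
```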
Gadma
Introduction
GADMA is a command-line tool. Its basic pipeline runs a series of launches of the genetic algorithm followed by local search optimization, and infers demographic history from the Allele Frequency Spectrum of multiple populations (up to three).
Versions
2.0.0rc21
Commands
gadma
python
python3
Module
You can load the modules by:
module load biocontainers
module load gadma
Interactive job
To run GADMA interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers gadma
(base) UserID@bell-a008:~ $ python
Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10)
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from gadma import *
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gadma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gadma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gadma
gadma -p params_file
Gambit
Introduction
GAMBIT (Genomic Approximation Method for Bacterial Identification and Tracking) is a tool for rapid taxonomic identification of microbial pathogens.
Versions
0.5.0
Commands
gambit
Module
You can load the modules by:
module load biocontainers
module load gambit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gambit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gambit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gambit
gambit -d database query -o results.csv *.fasta
Gamma
Introduction
GAMMA (Gene Allele Mutation Microbial Assessment) is a command line tool that finds gene matches in microbial genomic data using protein coding (rather than nucleotide) identity, and then translates and annotates the match by providing the type (i.e., mutant, truncation, etc.) and a translated description (i.e., Y190S mutant, truncation at residue 110, etc.). Because microbial gene families often have multiple alleles and existing databases are rarely exhaustive, GAMMA is helpful in both identifying and explaining how unique alleles differ from their closest known matches.
Versions
1.4
2.2
Commands
GAMMA-S.py
GAMMA.py
Module
You can load the modules by:
module load biocontainers
module load gamma
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gamma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gamma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gamma
GAMMA.py DHQP1701672_complete_genome.fasta ResFinderDB_Combined_05-06-20.fsa GAMMA_Test
Gangstr
Introduction
GangSTR is a tool for genome-wide profiling of tandem repeats from short reads. A key advantage of GangSTR over existing genome-wide TR tools (e.g. lobSTR or hipSTR) is that it can handle repeats that are longer than the read length. GangSTR takes aligned reads (BAM) and a set of repeats in the reference genome as input and outputs a VCF file containing genotypes for each locus.
Versions
2.5.0
Commands
GangSTR
Module
You can load the modules by:
module load biocontainers
module load gangstr
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gangstr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gangstr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gangstr
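The job script above ends after loading the module. A sketch of a genotyping command that could follow it is shown below, matching the inputs described in the introduction (aligned reads plus a set of reference repeats). All file names are hypothetical placeholders.

```shell
# Hypothetical inputs; replace with your own files.
bam=sample.bam            # aligned short reads
ref=reference.fa          # reference genome FASTA
regions=repeats.bed       # tandem repeat loci in the reference
prefix="${bam%.bam}_gangstr"

# Run only when the gangstr module has been loaded.
if command -v GangSTR >/dev/null 2>&1; then
    # Writes the genotypes as a VCF file named after the prefix.
    GangSTR --bam "$bam" --ref "$ref" --regions "$regions" --out "$prefix"
else
    echo "GangSTR not found; load the gangstr module first" >&2
fi
```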
Gapfiller
Introduction
GapFiller is a seed-and-extend local assembler to fill the gap within paired reads. It can be used for both DNA and RNA and it has been tested on Illumina data. GapFiller can be used whenever a sequence is to be assembled starting from reads lying on its ends, provided a loose estimate of sequence length.
Versions
2.1.2
Commands
GapFiller
Module
You can load the modules by:
module load biocontainers
module load gapfiller
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run gapfiller on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gapfiller
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gapfiller
Gapit
Introduction
GAPIT is a Genome Association and Prediction Integrated Tool.
Versions
3.3
Commands
R
Rscript
Module
You can load the modules by:
module load biocontainers
module load gapit
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run gapit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gapit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gapit
GATK
Introduction
GATK (Genome Analysis Toolkit) is a collection of command-line tools for analyzing high-throughput sequencing data with a primary focus on variant discovery.
Versions
3.8
Commands
gatk3
Module
You can load the modules by:
module load biocontainers
module load gatk
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run GATK on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=gatk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gatk
gatk3 -T HaplotypeCaller \
-nct 24 -R hg38.fa \
-I 19P0126636WES.sorted.bam \
-o 19P0126636WES.HC.vcf
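The example above writes the number 24 twice, once in #SBATCH -n 24 and once in -nct 24, so the two can drift apart when the script is edited. A minimal sketch of one way to keep them in sync is to read the task count that Slurm exports as SLURM_NTASKS inside the job. The helper name job_threads is hypothetical, and the fallback of 1 is an assumption for runs outside a job.

```shell
# Report how many threads the job may use: the task count granted by Slurm
# (SLURM_NTASKS), or 1 when that variable is unset, e.g. on a login node.
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# In the job script the value replaces the hard-coded 24:
# gatk3 -T HaplotypeCaller -nct "$(job_threads)" -R hg38.fa \
#     -I 19P0126636WES.sorted.bam -o 19P0126636WES.HC.vcf
```

The same helper can feed the thread option of any other tool in a job script.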
GATK4
Introduction
GATK (Genome Analysis Toolkit) is a collection of command-line tools for analyzing high-throughput sequencing data with a primary focus on variant discovery. Detailed usage can be found here: https://www.broadinstitute.org/gatk/.
Versions
4.2.0
4.2.5.0
4.2.6.1
4.3.0.0
Commands
gatk
Module
You can load the modules by:
module load biocontainers
module load gatk4/4.2.5.0
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run gatk4 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=gatk4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gatk4/4.2.5.0
gatk --java-options "-Xmx12G -XX:ParallelGCThreads=24" HaplotypeCaller -R hg38.fa -I 19P0126636WES.sorted.bam -O 19P0126636WES.HC.vcf --sample-name 19P0126636
Gemma
Introduction
Gemma is a software toolkit for fast application of linear mixed models (LMMs) and related models to genome-wide association studies (GWAS) and other large-scale data sets.
Versions
0.98.3
Commands
gemma
Module
You can load the modules by:
module load biocontainers
module load gemma
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Gemma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gemma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gemma
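# Step 1: compute the centered relatedness matrix; gemma writes it to ./output/mouse_hs1940.cXX.txt
gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt \
-gk -o mouse_hs1940
# Step 2: fit the linear mixed model using the relatedness matrix from step 1
gemma -g ./example/mouse_hs1940.geno.txt.gz \
-p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt \
-k ./output/mouse_hs1940.cXX.txt -lmm -o mouse_hs1940_CD8_lmm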
Gemoma
Introduction
Gene Model Mapper (GeMoMa) is a homology-based gene prediction program. GeMoMa uses the annotation of protein-coding genes in a reference genome to infer the annotation of protein-coding genes in a target genome. To do so, GeMoMa utilizes amino acid sequence and intron position conservation. In addition, GeMoMa can incorporate RNA-seq evidence for splice site prediction.
Versions
1.7.1
Commands
GeMoMa
Module
You can load the modules by:
module load biocontainers
module load gemoma
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run gemoma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gemoma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gemoma
GeneMark-ES/ET/EP
Introduction
GeneMark-ES/ET/EP contains the GeneMark-ES, GeneMark-ET and GeneMark-EP+ algorithms.
Versions
4.68
4.69
Commands
bed_to_gff.pl
bp_seq_select.pl
build_mod.pl
calc_introns_from_gtf.pl
change_path_in_perl_scripts.pl
compare_intervals_exact.pl
gc_distr.pl
get_below_gc.pl
get_sequence_from_GTF.pl
gmes_petap.pl
hc_exons2hints.pl
histogram.pl
make_nt_freq_mat.pl
parse_ET.pl
parse_by_introns.pl
parse_gibbs.pl
parse_set.pl
predict_genes.pl
reformat_gff.pl
rescale_gff.pl
rnaseq_introns_to_gff.pl
run_es.pl
run_hmm_pbs.pl
scan_for_bp.pl
star_to_gff.pl
verify_evidence_gmhmm.pl
Academic license
To use GeneMark, you need to download the license file yourself.
Go to the GeneMark web site: http://exon.gatech.edu/GeneMark/license_download.cgi. Check the boxes for GeneMark-ES/ET/EP ver 4.69_lic and LINUX 64 next to it, fill out the form, then click “I agree”. On the next page, right-click and copy the link address for the 64 bit license. Paste the link address into the commands below:
cd $HOME
wget "replace with license URL"
zcat gm_key_64.gz > .gm_key
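After these commands, the license key should sit in your home directory as .gm_key. As a minimal sketch, the helper below checks that the file exists and is not empty before a job is submitted. The helper name check_gm_key is hypothetical, and the check only tests for a non-empty file; it does not validate the license itself.

```shell
# Check that a non-empty GeneMark key file is in place.
# Usage: check_gm_key [directory]   (the directory defaults to $HOME)
check_gm_key() {
    local key="${1:-$HOME}/.gm_key"
    if [ -s "$key" ]; then
        echo "GeneMark key found: $key"
    else
        echo "GeneMark key missing or empty: $key" >&2
        return 1
    fi
}

# In a job script, stop early if the key is absent:
# check_gm_key || exit 1
```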
Module
You can load the modules by:
module load biocontainers
module load genemark/4.68
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run GeneMark on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=genemark
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genemark/4.68
gmes_petap.pl --ES --cores 24 --sequence scaffolds.fasta
Genemarks-2
Introduction
GeneMarkS-2 combines GeneMark.hmm (prokaryotic) and GeneMark (prokaryotic) with a self-training procedure that determines parameters of the models of both GeneMark.hmm and GeneMark.
You need to download your own license key from the GeneMark website and copy the key file “gm_key” into your home directory: cp gm_key ~/.gm_key
Home page: http://opal.biology.gatech.edu/GeneMark/
Versions
1.14_1.25
Commands
gms2.pl
biogem
compp
gmhmmp2
Module
You can load the modules by:
module load biocontainers
module load genemarks-2
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run genemarks-2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genemarks-2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genemarks-2
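The job script above does not call any GeneMarkS-2 command. As a rough sketch, assuming the --seq, --genome-type, --output and --format options listed in the gms2.pl usage message, a gene prediction run could be wrapped as below. The helper name run_gms2 and the file names are hypothetical; please confirm the options with gms2.pl run without arguments on the deployed version.

```shell
# Hypothetical helper: predict genes in a prokaryotic genome with GeneMarkS-2.
# Usage: run_gms2 <genome.fasta> <genome_type> <output.gff>
run_gms2() {
    gms2.pl --seq "$1" --genome-type "$2" --output "$3" --format gff
}

# In the job script, after "ml biocontainers genemarks-2" (placeholder names):
# run_gms2 genome.fasta bacteria genes.gff
```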
Genmap
Introduction
GenMap: Ultra-fast Computation of Genome Mappability.
Versions
1.3.0
Commands
genmap
Module
You can load the modules by:
module load biocontainers
module load genmap
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run genmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genmap
export TMPDIR=$PWD/tmp
# Make sure the temporary directory exists before genmap uses it
mkdir -p "$TMPDIR"
genmap index -F ~/.local/share/genomes/hg38/hg38.fa -I hg38_index
genmap map -K 64 -E 2 -I hg38_index -O map_output_hg38 -t -w -bg
Genomedata
Introduction
Genomedata is a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint.
Versions
1.5.0
Commands
python
python3
genomeCoverageBed
genomedata-close-data
genomedata-erase-data
genomedata-hardmask
genomedata-histogram
genomedata-info
genomedata-load
genomedata-load-assembly
genomedata-load-data
genomedata-load-seq
genomedata-open-data
genomedata-query
genomedata-report
Module
You can load the modules by:
module load biocontainers
module load genomedata
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run genomedata on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomedata
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genomedata
Genomepy
Introduction
Genomepy is designed to provide a simple and straightforward way to download and use genomic data.
Versions
0.12.0
0.14.0
Commands
genomepy
Module
You can load the modules by:
module load biocontainers
module load genomepy
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Genomepy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomepy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genomepy
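The job script above does not yet download anything. As a minimal sketch, assuming the install subcommand with its --provider and --genomes_dir options as described in the genomepy documentation, a genome download could look like this. The helper name install_genome, the genome name and the target directory are hypothetical placeholders.

```shell
# Hypothetical helper: download one genome into a chosen directory.
# Usage: install_genome <genome_name> <provider> <genomes_dir>
install_genome() {
    genomepy install "$1" --provider "$2" --genomes_dir "$3"
}

# In the job script, after "ml biocontainers genomepy" (placeholder values):
# install_genome sacCer3 UCSC "$PWD/genomes"
```

Running genomepy search with a keyword first is a quick way to confirm the exact genome name and the providers that offer it.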
Genomescope2
Introduction
Genomescope2: Reference-free profiling of polyploid genomes.
Versions
2.0
Commands
genomescope2
Module
You can load the modules by:
module load biocontainers
module load genomescope2
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Genomescope2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomescope2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genomescope2
wget https://raw.githubusercontent.com/schatzlab/genomescope/master/analysis/real_data/ara_F1_21.hist
genomescope2 -i ara_F1_21.hist -o output -k 21
Genomicconsensus
Introduction
Genomicconsensus is the current PacBio consensus and variant calling suite.
Versions
2.3.3
Commands
quiver
arrow
variantCaller
Module
You can load the modules by:
module load biocontainers
module load genomicconsensus
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Genomicconsensus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=genomicconsensus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genomicconsensus
quiver -j12 out.aligned_subreads.bam \
-r All4mer.V2.01_Insert-changed.fa \
-o consensus.fasta -o consensus.fastq
Genrich
Introduction
Genrich is a peak-caller for genomic enrichment assays (e.g. ChIP-seq, ATAC-seq). It analyzes alignment files generated following the assay and produces a file detailing peaks of significant enrichment.
Versions
0.6.1
Commands
Genrich
Module
You can load the modules by:
module load biocontainers
module load genrich
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Genrich on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genrich
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genrich
Genrich -t sample.bam -o sample.narrowPeak -v
Getorganelle
Introduction
GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes.
Versions
1.7.7.0
Commands
get_organelle_config.py
get_organelle_from_assembly.py
get_organelle_from_reads.py
slim_graph.py
summary_get_organelle_output.py
Module
You can load the modules by:
module load biocontainers
module load getorganelle
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run getorganelle on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=getorganelle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers getorganelle
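The job script above stops after loading the module. As a minimal sketch, the helper below assembles a plant plastome from paired-end reads with the option values shown in the GetOrganelle README (-R 15, -k 21,45,65,85,105, -F embplant_pt). The helper name assemble_plastome and the file names are hypothetical. Whether the seed and label databases are already available inside the container is not stated here, so the one-time get_organelle_config.py step is shown only as a commented assumption.

```shell
# One-time setup, if the databases are not yet present (assumption):
# get_organelle_config.py --add embplant_pt,embplant_mt

# Hypothetical helper: assemble a plant plastome from paired-end reads.
# Usage: assemble_plastome <forward.fq> <reverse.fq> <output_dir>
assemble_plastome() {
    get_organelle_from_reads.py -1 "$1" -2 "$2" -o "$3" \
        -R 15 -k 21,45,65,85,105 -F embplant_pt
}

# In the job script, after "ml biocontainers getorganelle" (placeholder names):
# assemble_plastome forward.fq reverse.fq plastome_output
```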
Gfaffix
Introduction
GFAffix identifies walk-preserving shared affixes in variation graphs and collapses them into a non-redundant graph structure.
Versions
0.1.4
Commands
gfaffix
Module
You can load the modules by:
module load biocontainers
module load gfaffix
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run gfaffix on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gfaffix
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gfaffix
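The job script above does not call gfaffix. As a rough sketch, assuming the -o (refined graph) and -t (transformation table) options shown in the GFAffix README, a run on one graph could be wrapped as below. The helper name collapse_affixes and the file names are hypothetical; please confirm the options with gfaffix --help.

```shell
# Hypothetical helper: collapse shared affixes in one variation graph.
# Usage: collapse_affixes <input.gfa> <output.gfa> <transformations.txt>
collapse_affixes() {
    gfaffix "$1" -o "$2" -t "$3"
}

# In the job script, after "ml biocontainers gfaffix" (placeholder names);
# the list of shared affixes printed to standard output is saved to a file:
# collapse_affixes graph.gfa graph.gfaffixed.gfa graph.transform.txt > graph.shared_affixes.txt
```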
Gfastats
Introduction
gfastats is a single fast and exhaustive tool for summary statistics and simultaneous *fa* (fasta, fastq, gfa [.gz]) genome assembly file manipulation. gfastats also allows seamless fasta<>fastq<>gfa[.gz] conversion. It has been tested on genomes larger than 100 Gbp.
Versions
1.2.3
1.3.6
Commands
gfastats
Module
You can load the modules by:
module load biocontainers
module load gfastats
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run gfastats on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gfastats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gfastats
gfastats input.fasta -o gfa
Gfatools
Introduction
gfatools is a set of tools for manipulating sequence graphs in the GFA or the rGFA format. It implements parsing, subgraph extraction, and conversion to FASTA/BED.
Versions
0.5
Commands
gfatools
Module
You can load the modules by:
module load biocontainers
module load gfatools
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run gfatools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gfatools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gfatools
# Extract a subgraph
gfatools view -l MTh4502 -r 1 test/MT.gfa > sub.gfa
# Convert GFA to segment FASTA
gfatools gfa2fa test/MT.gfa > MT-seg.fa
# Convert rGFA to stable FASTA or BED
gfatools gfa2fa -s test/MT.gfa > MT.fa
gfatools gfa2bed -m test/MT.gfa > MT.bed
Gffcompare
Introduction
Gffcompare is used to compare, merge, annotate and estimate accuracy of one or more GFF files.
Versions
0.11.2
Commands
gffcompare
Module
You can load the modules by:
module load biocontainers
module load gffcompare
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Gffcompare on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gffcompare
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gffcompare
gffcompare -r annotation.gff transcripts.gtf
Gffread
Introduction
Gffread is used to validate, filter, convert and perform various other operations on GFF files.
Versions
0.12.7
Commands
gffread
Module
You can load the modules by:
module load biocontainers
module load gffread
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Gffread on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gffread
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gffread
gffread -E annotation.gff -o ann_simple.gff
gffread annotation.gff -T -o annotation.gtf
gffread -w transcripts.fa -g genome.fa annotation.gff
Gffutils
Introduction
gffutils is a Python package for working with and manipulating the GFF and GTF format files typically used for genomic annotations.
Versions
0.11.1
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load gffutils
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run gffutils on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gffutils
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gffutils
Gimmemotifs
Introduction
GimmeMotifs is a suite of motif tools, including a motif prediction pipeline for ChIP-seq experiments.
Versions
0.17.1
Commands
gimme
Module
You can load the modules by:
module load biocontainers
module load gimmemotifs
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run gimmemotifs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gimmemotifs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gimmemotifs
gimme motifs ENCFF407IVS.bed ENCFF407IVS_motifs \
-g ~/.local/share/genomes/hg38/hg38.fa --denovo
Glimmer
Introduction
Glimmer is a system for finding genes in microbial DNA, especially the genomes of bacteria, archaea, and viruses.
Versions
3.02
Commands
anomaly
build-fixed
build-icm
entropy-profile
entropy-score
extract
g3-from-scratch.csh
g3-from-training.csh
g3-iterated.csh
get-motif-counts.awk
glim-diff.awk
glimmer3
long-orfs
match-list-col.awk
multi-extract
not-acgt.awk
score-fixed
start-codon-distrib
test
uncovered
upstream-coords.awk
window-acgt
Module
You can load the modules by:
module load biocontainers
module load glimmer
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Glimmer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=glimmer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers glimmer
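# Find long, non-overlapping ORFs to use as a training set
long-orfs -n -t 1.15 scaffolds.fasta run1.longorfs
# Extract the training sequences from the genome
extract -t scaffolds.fasta run1.longorfs > run1.train
# Build the interpolated context model (ICM) from the training sequences
build-icm -r run1.icm < run1.train
# Predict genes with the ICM; the output files are named with the tag "cm"
glimmer3 scaffolds.fasta run1.icm cm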
Glimmerhmm
Introduction
Glimmerhmm is a new gene finder based on a Generalized Hidden Markov Model (GHMM).
Versions
3.0.4
Commands
glimmerhmm
glimmhmm.pl
trainGlimmerHMM
Module
You can load the modules by:
module load biocontainers
module load glimmerhmm
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Glimmerhmm on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=glimmerhmm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers glimmerhmm
trainGlimmerHMM Asperg.fasta Asperg.cds -d Asperg
glimmerhmm Asperg.fasta -d Asperg -o Asperg_glimmerhmm_out
Glnexus
Introduction
Glnexus: Scalable gVCF merging and joint variant calling for population sequencing projects.
Versions
1.4.1
Commands
glnexus_cli
Module
You can load the modules by:
module load biocontainers
module load glnexus
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run glnexus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=glnexus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers glnexus
glnexus_cli --config DeepVariant \
--bed ALDH2.bed \
dv_1000G_ALDH2_gvcf/*.g.vcf.gz \
> dv_1000G_ALDH2.bcf
Gmap
Introduction
Gmap is a genomic mapping and alignment program for mRNA and EST sequences.
Versions
2021.05.27
2021.08.25
Commands
atoiindex
cmetindex
cpuid
dbsnp_iit
ensembl_genes
fa_coords
get-genome
gff3_genes
gff3_introns
gff3_splicesites
gmap
gmap.avx2
gmap_build
gmap_cat
gmapindex
gmapl
gmapl.avx2
gmapl.nosimd
gmap.nosimd
gmap_process
gsnap
gsnap.avx2
gsnapl
gsnapl.avx2
gsnapl.nosimd
gsnap.nosimd
gtf_genes
gtf_introns
gtf_splicesites
gtf_transcript_splicesites
gvf_iit
iit_dump
iit_get
iit_store
indexdb_cat
md_coords
psl_genes
psl_introns
psl_splicesites
sam_sort
snpindex
trindex
vcf_iit
Module
You can load the modules by:
module load biocontainers
module load gmap
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Gmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=gmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gmap
gmap_build -d Cmm -D Cmm genome.fasta
gmap -d Cmm -t 4 -D ./Cmm cdna.fasta > gmap_out.txt
gmap_build -d GRCh38 -D GRCh38 Homo_sapiens.GRCh38.dna.primary_assembly.fa
gsnap -d GRCh38 -D ./GRCh38 --nthreads=4 SRR16956239_1.fastq SRR16956239_2.fastq > gsnap_out.txt
goatools
Introduction
Goatools is a python library for gene ontology analyses. Detailed information about its usage can be found here: https://github.com/tanghaibao/goatools
Versions
1.1.12
1.2.3
Commands
python
python3
compare_gos.py
fetch_associations.py
find_enrichment.py
go_plot.py
map_to_slim.py
ncbi_gene_results_to_python.py
plot_go_term.py
prt_terms.py
runxlrd.py
vba_extract.py
wr_hier.py
wr_sections.py
Module
You can load the modules by:
module load biocontainers
module load goatools/1.1.12
Interactive job
To run goatools interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers goatools/1.1.12
(base) UserID@bell-a008:~ $ python
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from goatools.base import download_go_basic_obo
>>> obo_fname = download_go_basic_obo()
Batch job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To submit an sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=goatools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers goatools/1.1.12
python script.py
find_enrichment.py --pval=0.05 --indent data/study data/population data/association
go_plot.py --go_file=tests/data/go_plot/go_heartjogging6.txt -r -o heartjogging6_r1.png
Graphlan
Introduction
Graphlan is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees.
Versions
1.1.3
Commands
graphlan.py
graphlan_annotate.py
Module
You can load the modules by:
module load biocontainers
module load graphlan
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Graphlan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=graphlan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers graphlan
graphlan_annotate.py hmptree.xml hmptree.annot.xml --annot annot.txt
graphlan.py hmptree.annot.xml hmptree.png --dpi 150 --size 14
Graphmap
Introduction
Graphmap is a novel mapper targeted at aligning long, error-prone third-generation sequencing data.
Versions
0.6.3
Commands
graphmap2
Module
You can load the modules by:
module load biocontainers
module load graphmap
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Graphmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=graphmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers graphmap
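The job script above loads the module but does not run the mapper. As a minimal sketch, the helper below calls the graphmap2 command listed under Commands with its align subcommand and the -r (reference), -d (reads) and -o (SAM output) options from the GraphMap2 README. The helper name map_long_reads and the file names are hypothetical placeholders.

```shell
# Hypothetical helper: align long reads to a reference and write a SAM file.
# Usage: map_long_reads <reference.fa> <reads.fastq> <alignments.sam>
map_long_reads() {
    graphmap2 align -r "$1" -d "$2" -o "$3"
}

# In the job script, after "ml biocontainers graphmap" (placeholder names):
# map_long_reads reference.fa reads.fastq alignments.sam
```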
Gridss
Introduction
Gridss is a modular software suite containing tools useful for the detection of genomic rearrangements.
Versions
2.13.2
Commands
R
Rscript
gridss
gridss_annotate_vcf_kraken2
gridss_annotate_vcf_repeatmasker
gridss_extract_overlapping_fragments
gridss_somatic_filter
gridsstools
virusbreakend
virusbreakend-build
Module
You can load the modules by:
module load biocontainers
module load gridss
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Gridss on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gridss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gridss
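The job script above does not call gridss. As a rough sketch, assuming the -r (reference), -o (output VCF) and -a (assembly BAM) options described in the GRIDSS documentation, a single-sample structural variant call could be wrapped as below. The helper name call_sv and the file names are hypothetical; please confirm the options with gridss --help for the deployed version.

```shell
# Hypothetical helper: call structural variants for one aligned sample.
# Usage: call_sv <reference.fa> <output.vcf.gz> <assembly.bam> <input.bam>
call_sv() {
    gridss -r "$1" -o "$2" -a "$3" "$4"
}

# In the job script, after "ml biocontainers gridss" (placeholder names):
# call_sv hg38.fa sample.gridss.vcf.gz sample.assembly.bam sample.sorted.bam
```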
Gseapy
Introduction
Gseapy is a python wrapper for GSEA and Enrichr.
Versions
0.10.8
Commands
gseapy
python
python3
Module
You can load the modules by:
module load biocontainers
module load gseapy
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Gseapy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=gseapy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gseapy
gseapy ssgsea -d ./data/testSet_rand1200.gct \
-g data/temp.gmt \
-o test/ssgsea_report2 \
-p 4 --no-plot --no-scale
gseapy replot -i data -o test/replot_test
GTDB-Tk
Introduction
GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB). It is designed to work with recent advances that allow hundreds or thousands of metagenome-assembled genomes (MAGs) to be obtained directly from environmental samples. It can also be applied to isolate and single-cell genomes.
GTDB-Tk reference data (R202) has been downloaded for users.
Versions
1.7.0
2.1.0
Commands
gtdbtk
Module
You can load the modules by:
module load biocontainers
module load gtdbtk/1.7.0
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run GTDB-Tk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=gtdbtk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gtdbtk/1.7.0
gtdbtk identify --genome_dir genomes --out_dir identify --extension gz --cpus 8
gtdbtk align --identify_dir identify --out_dir align --cpus 8
gtdbtk classify --genome_dir genomes --align_dir align --out_dir classify --extension gz --cpus 8
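The three commands above run the identify, align and classify steps one by one. GTDB-Tk also provides a classify_wf subcommand that chains these steps in a single call. A minimal sketch, in which the helper name gtdbtk_classify_wf is hypothetical and the default of 8 CPUs simply mirrors the example above:

```shell
# Hypothetical helper: run identify, align and classify in one workflow call.
# Usage: gtdbtk_classify_wf <genome_dir> <out_dir> [cpus]   (cpus defaults to 8)
gtdbtk_classify_wf() {
    gtdbtk classify_wf --genome_dir "$1" --out_dir "$2" \
        --extension gz --cpus "${3:-8}"
}

# In the job script, after "ml biocontainers gtdbtk/1.7.0":
# gtdbtk_classify_wf genomes classify_wf_out 8
```

Keeping the steps separate, as in the original example, makes it easier to rerun only the step that failed.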
Gubbins
Introduction
Gubbins is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions.
Versions
3.2.0
3.3
Commands
extract_gubbins_clade.py
generate_ska_alignment.py
gubbins_alignment_checker.py
mask_gubbins_aln.py
run_gubbins.py
sumlabels.py
sumtrees.py
Module
You can load the modules by:
module load biocontainers
module load gubbins
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Gubbins on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gubbins
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gubbins
run_gubbins.py --prefix ST239 ST239.aln
Guppy
Introduction
Guppy is a data processing toolkit that contains the Oxford Nanopore Technologies’ basecalling algorithms, and several bioinformatic post-processing features.
Versions
6.0.1
6.5.7
Commands
guppy_aligner
guppy_barcoder
guppy_basecall_server
guppy_basecaller
guppy_basecaller_duplex
guppy_basecaller_supervisor
guppy_basecall_client
Module
You can load the modules by:
module load biocontainers
module load guppy
Example job
Warning
Using #!/bin/sh -l as the shebang in a Slurm job script will cause some biocontainer modules to fail. Please use #!/bin/bash instead.
To run Guppy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=guppy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers guppy
guppy_basecaller --compress_fastq -i data/fast5_tiny/ \
-s basecall_tiny/ --cpu_threads_per_caller 12 \
--num_callers 1 -c dna_r9.4.1_450bps_hac.cfg
Hail
Introduction
Hail is an open-source, general-purpose, Python-based data analysis tool with additional data types and methods for working with genomic data.
Versions
0.2.94
0.2.98
Commands
python3
Module
You can load the modules by:
module load biocontainers
module load hail
Interactive job
To run Hail interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers hail
(base) UserID@bell-a008:~ $ python3
Python 3.7.13 (default, Apr 24 2022, 01:05:22)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hail as hl
>>> print(hl.citation())
Hail Team. Hail 0.2.94-f0b38d6c436f. https://github.com/hail-is/hail/commit/f0b38d6c436f.
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run hail on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hail
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hail
python3 script.py
Hap.py
Introduction
Hap.py is a tool to compare diploid genotypes at haplotype level.
Versions
0.3.9
Commands
bamstats.py
cnx.py
ftx.py
guess-ploidy.py
hap.py
ovc.py
plot-roh.py
pre.py
qfy.py
som.py
varfilter.py
Module
You can load the modules by:
module load biocontainers
module load hap.py
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run hap.py on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hap.py
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hap.py
hap.py \
example/happy/PG_NA12878_chr21.vcf.gz \
example/happy/NA12878_chr21.vcf.gz \
-f example/happy/PG_Conf_chr21.bed.gz \
-r example/chr21.fa \
-o test
Helen
Introduction
HELEN is a multi-task RNN polisher which operates on images produced by MarginPolish.
Versions
1.0
Commands
helen
Module
You can load the modules by:
module load biocontainers
module load helen
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run helen on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 32
#SBATCH --job-name=helen
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers helen
helen polish \
--image_dir mp_output \
--model_path "helen_models/HELEN_r941_guppy344_microbial.pkl" \
--threads 32 \
--output_dir "helen_output/" \
--output_prefix Staph_Aur_draft_helen
Hicexplorer
Introduction
Hicexplorer
is a set of tools to process, normalize and visualize Hi-C data.
Versions
3.7.2
Commands
chicAggregateStatistic
chicDifferentialTest
chicExportData
chicPlotViewpoint
chicQualityControl
chicSignificantInteractions
chicViewpoint
chicViewpointBackgroundModel
hicAdjustMatrix
hicAggregateContacts
hicAverageRegions
hicBuildMatrix
hicCompareMatrices
hicCompartmentalization
hicConvertFormat
hicCorrectMatrix
hicCorrelate
hicCreateThresholdFile
hicDetectLoops
hicDifferentialTAD
hicexplorer
hicFindEnrichedContacts
hicFindRestSite
hicFindTADs
hicHyperoptDetectLoops
hicHyperoptDetectLoopsHiCCUPS
hicInfo
hicInterIntraTAD
hicMergeDomains
hicMergeLoops
hicMergeMatrixBins
hicMergeTADbins
hicNormalize
hicPCA
hicPlotAverageRegions
hicPlotDistVsCounts
hicPlotMatrix
hicPlotSVL
hicPlotTADs
hicPlotViewpoint
hicQC
hicQuickQC
hicSumMatrices
hicTADClassifier
hicTrainTADClassifier
hicTransform
hicValidateLocations
Module
You can load the modules by:
module load biocontainers
module load hicexplorer
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Hicexplorer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hicexplorer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hicexplorer
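The job script above stops after loading the module. As a minimal sketch, a first step could build a contact matrix with hicBuildMatrix. The input file names, the GATC restriction and dangling sequences, the cut-site file, and the 10 kb bin size are placeholder assumptions; check `hicBuildMatrix --help` for the options of the installed version.

```shell
# Placeholder inputs (assumptions): one BAM file per mate, aligned separately.
R1_BAM="mate_R1.bam"
R2_BAM="mate_R2.bam"
CUT_SITES="cut_sites.bed"        # e.g. produced beforehand with hicFindRestSite
THREADS="${SLURM_NTASKS:-1}"     # falls back to 1 outside a Slurm job

if command -v hicBuildMatrix >/dev/null 2>&1 && [ -f "$R1_BAM" ] && [ -f "$R2_BAM" ]; then
    hicBuildMatrix --samFiles "$R1_BAM" "$R2_BAM" \
        --binSize 10000 \
        --restrictionSequence GATC \
        --danglingSequence GATC \
        --restrictionCutFile "$CUT_SITES" \
        --outFileName hic_matrix.h5 \
        --QCfolder hicQC \
        --threads "$THREADS"
else
    echo "hicBuildMatrix or the input BAM files were not found; nothing was run" >&2
fi
```

To use more than one thread, raise `#SBATCH -n` accordingly.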
Hic-pro
Introduction
HiC-Pro is an optimized and flexible pipeline for Hi-C data processing.
Versions
3.0.0
3.1.0
Commands
HiC-Pro
digest_genome.py
extract_snps.py
hicpro2fithic.py
hicpro2higlass.sh
hicpro2juicebox.sh
make_viewpoints.py
sparseToDense.py
split_reads.py
split_sparse.py
Module
You can load the modules by:
module load biocontainers
module load hic-pro
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run hic-pro on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hic-pro
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hic-pro
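The job script above only loads the module. A minimal sketch of the pipeline call could look like the following. The directory and file names are assumptions: HiC-Pro expects an input folder with one sub-folder of FASTQ files per sample and an edited copy of its configuration file.

```shell
INPUT_DIR="rawdata"            # assumption: one sub-folder of FASTQ files per sample
OUTPUT_DIR="hicpro_out"        # assumption: output folder name
CONFIG="config-hicpro.txt"     # assumption: edited copy of the HiC-Pro config template

if command -v HiC-Pro >/dev/null 2>&1 && [ -d "$INPUT_DIR" ] && [ -f "$CONFIG" ]; then
    HiC-Pro -i "$INPUT_DIR" -o "$OUTPUT_DIR" -c "$CONFIG"
else
    echo "HiC-Pro, $INPUT_DIR or $CONFIG was not found; nothing was run" >&2
fi
```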
Hifiasm
Introduction
Hifiasm
is a fast haplotype-resolved de novo assembler for PacBio HiFi reads.
Versions
0.16.0
0.18.5
Commands
hifiasm
Module
You can load the modules by:
module load biocontainers
module load hifiasm
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Hifiasm on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hifiasm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hifiasm
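The job script above does not include an assembly command. The sketch below assembles a HiFi read set and converts the primary contig graph to FASTA with awk. The read file name, the output prefix, and the `asm.bp.p_ctg.gfa` output name are assumptions and may differ between hifiasm versions.

```shell
# Convert the segment (S) lines of a GFA file to FASTA records.
gfa_to_fasta() {
    awk '/^S/ {print ">" $2; print $3}' "$1"
}

READS="HiFi-reads.fq.gz"       # placeholder input (assumption)
PREFIX="asm"                   # placeholder output prefix (assumption)
THREADS="${SLURM_NTASKS:-1}"   # falls back to 1 outside a Slurm job

if command -v hifiasm >/dev/null 2>&1 && [ -f "$READS" ]; then
    hifiasm -o "$PREFIX" -t "$THREADS" "$READS"
    gfa_to_fasta "$PREFIX.bp.p_ctg.gfa" > "$PREFIX.p_ctg.fa"
else
    echo "hifiasm or $READS was not found; nothing was run" >&2
fi
```

Assembly is usually multi-threaded, so `#SBATCH -n` would normally be raised above 1.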
HISAT2
Introduction
HISAT2
is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.
Versions
2.2.1
Commands
extract_exons.py
extract_splice_sites.py
hisat2
hisat2-align-l
hisat2-align-s
hisat2-build
hisat2-build-l
hisat2-build-s
hisat2-inspect
hisat2-inspect-l
hisat2-inspect-s
hisat2_extract_exons.py
hisat2_extract_snps_haplotypes_UCSC.py
hisat2_extract_snps_haplotypes_VCF.py
hisat2_extract_splice_sites.py
hisat2_read_statistics.py
hisat2_simulate_reads.py
Module
You can load the modules by:
module load biocontainers
module load hisat2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run HISAT2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hisat2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hisat2
hisat2-build genome.fa genome
# for single-end FASTA reads DNA alignment
hisat2 -f -x genome -U reads.fa -S output.sam --no-spliced-alignment
# for paired-end FASTQ reads alignment
hisat2 -x genome -1 reads_1.fq -2 reads_2.fq -S output.sam
Hmmer
Introduction
Hmmer
is used for searching sequence databases for sequence homologs, and for making sequence alignments.
Versions
3.3.2
Commands
alimask
easel
esl-afetch
esl-alimanip
esl-alimap
esl-alimask
esl-alimerge
esl-alipid
esl-alirev
esl-alistat
esl-compalign
esl-compstruct
esl-construct
esl-histplot
esl-mask
esl-mixdchlet
esl-reformat
esl-selectn
esl-seqrange
esl-seqstat
esl-sfetch
esl-shuffle
esl-ssdraw
esl-translate
esl-weight
hmmalign
hmmbuild
hmmconvert
hmmemit
hmmfetch
hmmlogo
hmmpgmd
hmmpgmd_shard
hmmpress
hmmscan
hmmsearch
hmmsim
hmmstat
jackhmmer
makehmmerdb
nhmmer
nhmmscan
phmmer
Module
You can load the modules by:
module load biocontainers
module load hmmer
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Hmmer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hmmer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hmmer
hmmsearch Nramp.hmm protein.fa > out
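The search above writes the full human-readable report to `out`. As a hedged sketch, the same search can also write a tabular summary with `--tblout` and use the cores granted by Slurm via `--cpu`. The helper `list_hits` and the output file names are illustrative assumptions, not part of HMMER.

```shell
# Print target name and full-sequence E-value from an hmmsearch --tblout table,
# skipping the comment lines that start with "#".
list_hits() {
    awk '!/^#/ {print $1 "\t" $5}' "$1"
}

HMM="Nramp.hmm"
SEQS="protein.fa"
THREADS="${SLURM_NTASKS:-1}"   # falls back to 1 outside a Slurm job

if command -v hmmsearch >/dev/null 2>&1 && [ -f "$HMM" ] && [ -f "$SEQS" ]; then
    hmmsearch --cpu "$THREADS" --tblout hits.tbl "$HMM" "$SEQS" > hmmsearch.out
    list_hits hits.tbl
else
    echo "hmmsearch, $HMM or $SEQS was not found; nothing was run" >&2
fi
```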
HOMER
Introduction
HOMER
(Hypergeometric Optimization of Motif EnRichment) is a suite of tools for motif discovery and next-generation sequencing analysis. Details about its usage can be found on the HOMER website. Note that the module on our clusters is named hommer.
Versions
4.11
Commands
addDataHeader.pl
addData.pl
addGeneAnnotation.pl
addInternalData.pl
addOligos.pl
adjustPeakFile.pl
adjustRedunGroupFile.pl
analyzeChIP-Seq.pl
analyzeRepeats.pl
analyzeRNA.pl
annotateInteractions.pl
annotatePeaks.pl
annotateRelativePosition.pl
annotateTranscripts.pl
assignGeneWeights.pl
assignTSStoGene.pl
batchAnnotatePeaksHistogram.pl
batchFindMotifsGenome.pl
batchFindMotifs.pl
batchMakeHiCMatrix.pl
batchMakeMultiWigHub.pl
batchMakeTagDirectory.pl
batchParallel.pl
bed2DtoUCSCbed.pl
bed2pos.pl
bed2tag.pl
blat2gtf.pl
bridgeResult2Cytoscape.pl
changeNewLine.pl
checkPeakFile.pl
checkTagBias.pl
chopify.pl
chopUpBackground.pl
chopUpPeakFile.pl
cleanUpPeakFile.pl
cleanUpSequences.pl
cluster2bedgraph.pl
cluster2bed.pl
combineGO.pl
combineHubs.pl
compareMotifs.pl
condenseBedGraph.pl
cons2fasta.pl
conservationAverage.pl
conservationPerLocus.pl
convertCoordinates.pl
convertIDs.pl
convertOrganismID.pl
duplicateCol.pl
eland2tags.pl
fasta2tab.pl
fastq2fasta.pl
filterListBy.pl
filterTADsAndCPs.pl
filterTADsAndLoops.pl
findcsRNATSS.pl
findGO.pl
findGOtxt.pl
findHiCCompartments.pl
findHiCDomains.pl
findHiCInteractionsByChr.pl
findKnownMotifs.pl
findMotifsGenome.pl
findMotifs.pl
findRedundantBLAT.pl
findTADsAndLoops.pl
findTopMotifs.pl
flipPC1toMatch.pl
freq2group.pl
genericConvertIDs.pl
GenomeOntology.pl
getChrLengths.pl
getConservedRegions.pl
getDifferentialBedGraph.pl
getDifferentialPeaksReplicates.pl
getDiffExpression.pl
getDistalPeaks.pl
getFocalPeaks.pl
getGenesInCategory.pl
getGWASoverlap.pl
getHiCcorrDiff.pl
getHomerQCstats.pl
getLikelyAdapters.pl
getMappingStats.pl
getPartOfPromoter.pl
getPos.pl
getRandomReads.pl
getSiteConservation.pl
getTopPeaks.pl
gff2pos.pl
go2cytoscape.pl
groupSequences.pl
joinFiles.pl
loadGenome.pl
loadPromoters.pl
makeBigBedMotifTrack.pl
makeBigWig.pl
makeBinaryFile.pl
makeHiCWashUfile.pl
makeMetaGeneProfile.pl
makeMultiWigHub.pl
map-fastq.pl
merge2Dbed.pl
mergeData.pl
motif2Jaspar.pl
motif2Logo.pl
parseGTF.pl
pos2bed.pl
preparseGenome.pl
prepForR.pl
profile2seq.pl
qseq2fastq.pl
randomizeGroupFile.pl
randomizeMotifs.pl
randRemoveBackground.pl
removeAccVersion.pl
removeBadSeq.pl
removeOutOfBoundsReads.pl
removePoorSeq.pl
removeRedundantPeaks.pl
renamePeaks.pl
resizePosFile.pl
revoppMotif.pl
rotateHiCmatrix.pl
runHiCpca.pl
sam2spliceJunc.pl
scanMotifGenomeWide.pl
scrambleFasta.pl
selectRepeatBg.pl
seq2profile.pl
SIMA.pl
subtractBedGraphsDirectory.pl
subtractBedGraphs.pl
tab2fasta.pl
tag2bed.pl
tag2pos.pl
tagDir2bed.pl
tagDir2hicFile.pl
tagDir2HiCsummary.pl
zipHomerResults.pl
Database
Selected databases have been downloaded for users.
ORGANISMS: yeast, worm, mouse, arabidopsis, zebrafish, rat, human and fly
PROMOTERS: yeast, worm, mouse, arabidopsis, zebrafish, rat, human and fly
GENOMES: hg19, hg38, mm10, ce11, dm6, rn6, danRer11, tair10, and sacCer3
Module
You can load the modules by:
module load biocontainers
module load hommer/4.11
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run HOMER on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=hommer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hommer/4.11
configureHomer.pl -list ## Check the installed database.
findMotifs.pl mouse_geneid.txt mouse motif_out_mouse
findMotifs.pl geneid.txt human motif_out
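The job script above requests 24 tasks, but the example commands do not pass a CPU count to HOMER. A minimal sketch of a genome-based motif search that uses the allocated cores through the `-p` option is shown below. The peak file and output directory names are placeholder assumptions; hg38 is one of the genomes listed in the Database section.

```shell
PEAKS="peaks.bed"              # placeholder peak/BED file (assumption)
GENOME="hg38"                  # one of the preinstalled genomes listed above
THREADS="${SLURM_NTASKS:-1}"   # falls back to 1 outside a Slurm job

if command -v findMotifsGenome.pl >/dev/null 2>&1 && [ -f "$PEAKS" ]; then
    # -size 200: analyse 200 bp regions centred on each peak
    findMotifsGenome.pl "$PEAKS" "$GENOME" motif_out_peaks -size 200 -p "$THREADS"
else
    echo "findMotifsGenome.pl or $PEAKS was not found; nothing was run" >&2
fi
```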
Homopolish
Introduction
Homopolish is a genome polisher originally developed for Nanopore and subsequently extended for PacBio CLR. It generates a high-quality genome (>Q50) for viruses, bacteria, and fungi. Nanopore/PacBio systematic errors are corrected by retrieving homologs from closely related genomes and polishing with an SVM. When paired with Racon and Medaka, the genome quality can reach Q50-90 (>99.999%) on Nanopore R9.4/10.3 flowcells (Guppy >3.4). For PacBio CLR, Homopolish also improves the majority of Flye-assembled genomes to Q90.
Versions
0.4.1
Commands
homopolish
Module
You can load the modules by:
module load biocontainers
module load homopolish
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run homopolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=homopolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers homopolish
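The job script above does not include a polishing command. The following is a sketch only: the draft assembly, the Mash sketch file, and the output directory are placeholder assumptions, and the sketch file has to be downloaded by the user beforehand. `R9.4.pkl` is the model name used in the Homopolish README for R9.4 flowcells.

```shell
DRAFT="draft_assembly.fasta"   # placeholder draft genome (assumption)
SKETCH="bacteria.msh"          # placeholder Mash sketch, downloaded beforehand (assumption)
MODEL="R9.4.pkl"               # model name for Nanopore R9.4 data
OUTDIR="homopolish_out"        # placeholder output directory (assumption)

if command -v homopolish >/dev/null 2>&1 && [ -f "$DRAFT" ] && [ -f "$SKETCH" ]; then
    homopolish polish -a "$DRAFT" -s "$SKETCH" -m "$MODEL" -o "$OUTDIR"
else
    echo "homopolish, $DRAFT or $SKETCH was not found; nothing was run" >&2
fi
```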
How_are_we_stranded_here
Introduction
How_are_we_stranded_here
is a python package for testing strandedness of RNA-Seq fastq files.
Versions
1.0.1
Commands
check_strandedness
Module
You can load the modules by:
module load biocontainers
module load how_are_we_stranded_here
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run How_are_we_stranded_here on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=how_are_we_stranded_here
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers how_are_we_stranded_here
check_strandedness --gtf Homo_sapiens.GRCh38.105.gtf \
--transcripts Homo_sapiens.GRCh38.cds.all.fa \
--reads_1 seq_1.fastq --reads_2 seq_2.fastq
HTSeq
Introduction
HTSeq
is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.
Versions
0.13.5
1.99.2
2.0.1
2.0.2
2.0.2-py310
Commands
htseq-count
htseq-count-barcodes
htseq-qa
python
python3
Module
You can load the modules by:
module load biocontainers
module load htseq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run HTSeq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=htseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers htseq
python -m HTSeq.scripts.count \
-f bam input.bam ref.gtf \
> test.out
Htslib
Introduction
Htslib
is a C library for high-throughput sequencing data formats.
Versions
1.14
1.15
1.16
1.17
Commands
bgzip
htsfile
tabix
Module
You can load the modules by:
module load biocontainers
module load htslib
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Htslib on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=htslib
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers htslib
tabix sorted.gff.gz chr1:10,000,000-20,000,000
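The tabix query above assumes that `sorted.gff.gz` is already position-sorted, bgzip-compressed, and indexed. A minimal sketch of that preparation is shown below. The input name `annotation.gff` and the helper `sort_gff` are illustrative assumptions.

```shell
# Sort a GFF file by sequence name, then numerically by start coordinate,
# keeping header/comment lines (starting with "#") at the top.
sort_gff() {
    grep '^#' "$1" || true
    grep -v '^#' "$1" | sort -t "$(printf '\t')" -k1,1 -k4,4n
}

GFF="annotation.gff"   # placeholder input (assumption)

if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1 && [ -f "$GFF" ]; then
    sort_gff "$GFF" | bgzip > sorted.gff.gz   # block-compress the sorted file
    tabix -p gff sorted.gff.gz                # write the index sorted.gff.gz.tbi
    tabix sorted.gff.gz chr1:10,000,000-20,000,000
else
    echo "bgzip/tabix or $GFF was not found; nothing was run" >&2
fi
```

Plain gzip output cannot be indexed by tabix, which is why bgzip is used for the compression step.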
Htstream
Introduction
Htstream
is a quality control and processing pipeline for High Throughput Sequencing data.
Versions
1.3.3
Commands
hts_AdapterTrimmer
hts_CutTrim
hts_LengthFilter
hts_NTrimmer
hts_Overlapper
hts_PolyATTrim
hts_Primers
hts_QWindowTrim
hts_SeqScreener
hts_Stats
hts_SuperDeduper
Module
You can load the modules by:
module load biocontainers
module load htstream
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Htstream on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=htstream
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers htstream
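The job script above only loads the module. HTStream tools are designed to be chained with pipes, each one appending to a shared statistics file. The sketch below shows a two-step chain under that assumption. The read file names and output prefix are placeholders, and the option letters should be checked against `hts_Stats --help` for the installed version.

```shell
R1="sample_R1.fastq.gz"        # placeholder paired-end reads (assumption)
R2="sample_R2.fastq.gz"
STATS="sample_stats.json"      # placeholder statistics file (assumption)
OUT_PREFIX="sample_dedup"      # placeholder output prefix (assumption)

if command -v hts_Stats >/dev/null 2>&1 && [ -f "$R1" ] && [ -f "$R2" ]; then
    # -L starts a new stats file, -A appends to it, -f writes FASTQ output
    hts_Stats -1 "$R1" -2 "$R2" -L "$STATS" \
        | hts_SuperDeduper -A "$STATS" -f "$OUT_PREFIX"
else
    echo "hts_Stats or the input reads were not found; nothing was run" >&2
fi
```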
HUMAnN 3
Introduction
HUMAnN 3.0
is the next iteration of HUMAnN, the HMP Unified Metabolic Analysis Network. HUMAnN is a method for efficiently and accurately profiling the abundance of microbial metabolic pathways and other molecular functions from metagenomic or metatranscriptomic sequencing data.
Versions
3.0.0
3.6
Commands
humann
humann3
humann3_databases
humann_barplot
humann_benchmark
humann_build_custom_database
humann_config
humann_databases
humann_genefamilies_genus_level
humann_infer_taxonomy
humann_join_tables
humann_reduce_table
humann_regroup_table
humann_rename_table
humann_renorm_table
humann_split_stratified_table
humann_split_table
humann_test
humann_unpack_pathways
Database
Full ChocoPhlAn, UniRef90, EC-filtered UniRef90, UniRef50, EC-filtered UniRef50, and utility_mapping databases have been downloaded for users.
Module
You can load the modules by:
module load biocontainers
module load humann/3.0.0
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run HUMAnN3 on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=humann
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers humann/3.0.0
# Check the database and config by:
humann_config --print
humann --threads 24 --input examples/demo.fastq --output demo_output --metaphlan-options "--bowtie2db /depot/itap/datasets/metaphlan"
Hyphy
Introduction
Hyphy
is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning.
Versions
2.5.36
Commands
hyphy
Module
You can load the modules by:
module load biocontainers
module load hyphy
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Hyphy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hyphy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hyphy
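The job script above does not run an analysis. As a hedged sketch, a standard HyPhy method such as FEL can be invoked non-interactively by naming the method and passing the alignment and tree as keyword arguments. The file names are placeholder assumptions.

```shell
ALIGNMENT="codon_alignment.fasta"   # placeholder in-frame codon alignment (assumption)
TREE="tree.nwk"                     # placeholder Newick tree (assumption)

if command -v hyphy >/dev/null 2>&1 && [ -f "$ALIGNMENT" ] && [ -f "$TREE" ]; then
    hyphy fel --alignment "$ALIGNMENT" --tree "$TREE" > fel.log
else
    echo "hyphy, $ALIGNMENT or $TREE was not found; nothing was run" >&2
fi
```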
Hypo
Introduction
HyPo, a Hybrid Polisher, utilises short as well as long reads within a single run to polish a long-read assembly of small and large genomes. It exploits unique genomic k-mers to selectively polish segments of contigs using partial order alignment of selective read segments. As demonstrated on human genome assemblies, HyPo generates a significantly more accurate polished assembly in about one-third of the time and with about half the memory requirements of contemporary widely used polishers like Racon.
Versions
1.0.3
Commands
hypo
Module
You can load the modules by:
module load biocontainers
module load hypo
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run hypo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hypo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hypo
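The job script above does not include a polishing command. The sketch below follows the usage pattern from the HyPo README. All file names, the genome size, and the coverage value are placeholder assumptions, and the BAM files are expected to be sorted alignments of the reads against the draft.

```shell
DRAFT="draft.fa"               # placeholder long-read assembly (assumption)
SHORT_READS="@il_names.txt"    # placeholder: text file listing short-read files (assumption)
SHORT_BAM="il_sorted.bam"      # placeholder: sorted short-read alignments (assumption)
LONG_BAM="pb_sorted.bam"       # placeholder: sorted long-read alignments (assumption)
THREADS="${SLURM_NTASKS:-1}"   # falls back to 1 outside a Slurm job

if command -v hypo >/dev/null 2>&1 && [ -f "$DRAFT" ] && [ -f "$SHORT_BAM" ]; then
    # -s: approximate genome size, -c: approximate short-read coverage
    hypo -d "$DRAFT" -r "$SHORT_READS" -s 3m -c 55 \
        -b "$SHORT_BAM" -B "$LONG_BAM" \
        -t "$THREADS" -o polished.fa
else
    echo "hypo, $DRAFT or $SHORT_BAM was not found; nothing was run" >&2
fi
```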
Idba
Introduction
Idba
is a practical iterative De Bruijn Graph De Novo Assembler for sequence assembly in bioinformatics.
Versions
1.1.3
Commands
fa2fq
filter_blat
filter_contigs
filterfa
fq2fa
idba
idba_hybrid
idba_tran
idba_tran_test
idba_ud
parallel_blat
parallel_rna_blat
print_graph
raw_n50
run-unittest.py
sample_reads
scaffold
scan.py
shuffle_reads
sim_reads
sim_reads_tran
sort_psl
sort_reads
split_fa
split_fq
split_scaffold
test
validate_blat
validate_blat_parallel
validate_component
validate_contigs_blat
validate_contigs_mummer
validate_reads_blat
validate_rna
Module
You can load the modules by:
module load biocontainers
module load idba
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Idba on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=idba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers idba
fq2fa --paired --filter SRR1977249.abundtrim.subset.pe.fq SRR1977249.abundtrim.subset.pe.fa
idba_ud -r SRR1977249.abundtrim.subset.pe.fa -o output
IGV
Introduction
IGV
(Integrative Genomics Viewer) is a high-performance, easy-to-use, interactive tool for the visual exploration of genomic data.
Versions
2.11.9
2.12.3
Commands
igv_hidpi.sh
igv.sh
Module
You can load the modules by:
module load biocontainers
module load igv
Interactive job
Since IGV requires a GUI, it is recommended to run it within ThinLinc:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module --force purge
(base) UserID@bell-a008:~ $ ml biocontainers igv
(base) UserID@bell-a008:~ $ igv.sh
Impute2
Introduction
Impute2
is a genotype imputation and haplotype phasing program.
Versions
2.3.2
Commands
impute2
Module
You can load the modules by:
module load biocontainers
module load impute2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Impute2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=impute2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers impute2
impute2 \
-m Example/example.chr22.map \
-h Example/example.chr22.1kG.haps \
-l Example/example.chr22.1kG.legend \
-g Example/example.chr22.study.gens \
-strand_g Example/example.chr22.study.strand \
-int 20.4e6 20.5e6 \
-Ne 20000 \
-o example.chr22.one.phased.impute2
Infernal
Introduction
Infernal (“INFERence of RNA ALignment”) is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs). A CM is like a sequence profile, but it scores a combination of sequence consensus and RNA secondary structure consensus, so in many cases, it is more capable of identifying RNA homologs that conserve their secondary structure more than their primary sequence. For more information, please check:
BioContainers: https://biocontainers.pro/tools/infernal
Home page: http://eddylab.org/infernal/
Versions
1.1.4
Commands
cmalign
cmbuild
cmcalibrate
cmconvert
cmemit
cmfetch
cmpress
cmscan
cmsearch
cmstat
Module
You can load the modules by:
module load biocontainers
module load infernal
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run infernal on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=infernal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers infernal
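The job script above only loads the module. A minimal sketch of a covariance-model search with cmsearch is given below. The model and sequence file names are placeholder assumptions; `--tblout` writes a tabular hit summary in addition to the full report.

```shell
CM="model.cm"                  # placeholder covariance model (assumption)
SEQS="genome.fa"               # placeholder sequence database (assumption)
THREADS="${SLURM_NTASKS:-1}"   # falls back to 1 outside a Slurm job

if command -v cmsearch >/dev/null 2>&1 && [ -f "$CM" ] && [ -f "$SEQS" ]; then
    cmsearch --cpu "$THREADS" --tblout cmsearch_hits.tbl "$CM" "$SEQS" > cmsearch.out
else
    echo "cmsearch, $CM or $SEQS was not found; nothing was run" >&2
fi
```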
Instrain
Introduction
Instrain
is a Python program for analysis of co-occurring genome populations from metagenomes that allows highly accurate genome comparisons, analysis of coverage, microdiversity, and linkage, and sensitive SNP detection with gene localization and synonymous/non-synonymous identification.
Versions
1.5.7
1.6.3
Commands
inStrain
Module
You can load the modules by:
module load biocontainers
module load instrain
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Instrain on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=instrain
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers instrain
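The job script above does not include an analysis command. As a sketch, the `profile` subcommand takes a sorted BAM file of reads mapped to a genome FASTA and writes an output folder. The file names and the output folder name are placeholder assumptions.

```shell
BAM="mapping.sorted.bam"       # placeholder: reads mapped to the genomes below (assumption)
FASTA="genomes.fasta"          # placeholder: FASTA the reads were mapped to (assumption)
THREADS="${SLURM_NTASKS:-1}"   # falls back to 1 outside a Slurm job

if command -v inStrain >/dev/null 2>&1 && [ -f "$BAM" ] && [ -f "$FASTA" ]; then
    inStrain profile "$BAM" "$FASTA" -o sample.IS -p "$THREADS"
else
    echo "inStrain, $BAM or $FASTA was not found; nothing was run" >&2
fi
```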
Intarna
Introduction
Intarna
is a general and fast approach to the prediction of RNA-RNA interactions incorporating both the accessibility of interacting sites as well as the existence of a user-definable seed interaction.
Versions
3.3.1
Commands
IntaRNA
Module
You can load the modules by:
module load biocontainers
module load intarna
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Intarna on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=intarna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers intarna
IntaRNA -t CCCCCCCCGGGGGGGGGGGGGG -q AAAACCCCCCCUUUU
InterProScan
Introduction
InterPro
is a database which integrates together predictive information about proteins’ function from a number of partner resources, giving an overview of the families that a protein belongs to and the domains and sites it contains.
Users who have novel nucleotide or protein sequences that they wish to functionally characterise can use the software package InterProScan
to run the scanning algorithms from the InterPro database in an integrated way. Sequences are submitted in FASTA format. Matches are then calculated against the signatures of all required member databases, and the results are output in a variety of formats.
Versions
5.54_87.0
5.61-93.0
Commands
interproscan.sh
Database
The latest version of the database has been downloaded and set up in /depot/itap/datasets/interproscan-5.54-87.0/data.
Module
You can load the modules by:
module load biocontainers
module load interproscan/5.54_87.0
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run InterProScan on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=interproscan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers interproscan/5.54_87.0
interproscan.sh -cpu 24 -i test_proteins.fasta
interproscan.sh -cpu 24 -t n -i test_nt_seqs.fasta
IQ-TREE
Introduction
IQ-TREE
is efficient software for phylogenomic inference by maximum likelihood.
Versions
1.6.12
2.1.2
2.2.0_beta
2.2.2.2
Commands
iqtree
Module
You can load the modules by:
module load biocontainers
module load iqtree
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run IQ-TREE on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=iqtree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers iqtree
iqtree -s input.phy -m GTR+I+G > test.out
Iqtree2
Introduction
IQ-TREE is efficient software for phylogenomic inference by maximum likelihood.
Versions
2.2.2.6
Commands
iqtree2
Module
You can load the modules by:
module load biocontainers
module load iqtree2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run iqtree2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=iqtree2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers iqtree2
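The job script above only loads the module. A minimal sketch of a tree search with iqtree2 follows. The alignment name, the GTR+I+G model, and the output prefix are placeholder assumptions; `-T` sets the thread count and `-B 1000` requests 1000 ultrafast bootstrap replicates.

```shell
ALIGNMENT="input.phy"          # placeholder alignment (assumption)
THREADS="${SLURM_NTASKS:-1}"   # falls back to 1 outside a Slurm job

if command -v iqtree2 >/dev/null 2>&1 && [ -f "$ALIGNMENT" ]; then
    iqtree2 -s "$ALIGNMENT" -m GTR+I+G -B 1000 -T "$THREADS" --prefix iqtree2_run
else
    echo "iqtree2 or $ALIGNMENT was not found; nothing was run" >&2
fi
```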
Ismapper
Introduction
ISMapper searches for IS positions in sequence data using paired end Illumina short reads, an IS query/queries of interest and a reference genome. ISMapper reports the IS positions it has found in each isolate, relative to the provided reference genome.
Versions
2.0.2
Commands
ismap
Module
You can load the modules by:
module load biocontainers
module load ismapper
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ismapper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ismapper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ismapper
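The job script above does not include a mapping command. The sketch below follows the basic usage from the ISMapper README: paired-end reads, one or more IS query sequences, and a reference genome. All file names are placeholder assumptions.

```shell
READS_1="isolateA_1.fastq.gz"      # placeholder paired-end reads (assumption)
READS_2="isolateA_2.fastq.gz"
QUERY="IS_query.fasta"             # placeholder IS query sequence(s) (assumption)
REFERENCE="reference_genome.gbk"   # placeholder reference in GenBank format (assumption)

if command -v ismap >/dev/null 2>&1 && [ -f "$READS_1" ] && [ -f "$READS_2" ]; then
    ismap --reads "$READS_1" "$READS_2" --queries "$QUERY" --reference "$REFERENCE"
else
    echo "ismap or the input reads were not found; nothing was run" >&2
fi
```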
Isoquant
Introduction
IsoQuant is a tool for the genome-based analysis of long RNA reads, such as PacBio or Oxford Nanopore reads. IsoQuant can reconstruct and quantify transcript models with high precision and decent recall. If a reference annotation is given, IsoQuant also assigns reads to the annotated isoforms based on their intron and exon structure. IsoQuant further performs annotated gene, isoform, exon and intron quantification. If reads are grouped (e.g. according to cell type), counts are reported according to the provided grouping.
Versions
3.1.2
Commands
isoquant.py
Module
You can load the modules by:
module load biocontainers
module load isoquant
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run isoquant on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=isoquant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers isoquant
isoquant.py --reference chr9.4M.fa.gz \
--genedb chr9.4M.gtf.gz \
--fastq chr9.4M.ont.sim.fq.gz \
--data_type nanopore -o test_ont
Isoseq3
Introduction
Isoseq3
is a tool for scalable de novo isoform discovery.
Versions
3.4.0
3.7.0
3.8.2
Commands
isoseq3
Module
You can load the modules by:
module load biocontainers
module load isoseq3
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Isoseq3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=isoseq3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers isoseq3
isoseq3 --version
isoseq3 refine --require-polya \
alz.demult.5p--3p.bam \
primers.fasta alz.flnc.bam
isoseq3 cluster alz.flnc.bam \
alz.polished.bam --verbose --use-qvs
Ivar
Introduction
Ivar is a computational package that contains functions broadly useful for viral amplicon-based sequencing.
Versions
1.3.1
1.4.2
Commands
ivar
Module
You can load the modules by:
module load biocontainers
module load ivar
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ivar on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ivar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ivar
Jcvi
Introduction
Jcvi is a collection of Python libraries to parse bioinformatics files, or perform computation related to assembly, annotation, and comparative genomics.
Versions
1.2.7
1.3.1
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load jcvi
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run jcvi on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=jcvi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers jcvi
python -m jcvi.formats.fasta format Vvinifera_145_Genoscope.12X.cds.fa.gz grape.cds
python -m jcvi.formats.fasta format Ppersica_298_v2.1.cds.fa.gz peach.cds
python -m jcvi.formats.gff bed --type=mRNA --key=Name --primary_only Vvinifera_145_Genoscope.12X.gene.gff3.gz -o grape.bed
python -m jcvi.compara.catalog ortholog grape peach --no_strip_names
python -m jcvi.graphics.dotplot grape.peach.anchors
rm grape.peach.last.filtered
python -m jcvi.compara.catalog ortholog grape peach --cscore=.99 --no_strip_names
python -m jcvi.graphics.dotplot grape.peach.anchors
python -m jcvi.compara.synteny depth --histogram grape.peach.anchors
python -m jcvi.graphics.grabseeds seeds test-data/test.JPG
Kaiju
Introduction
Kaiju
is a tool for fast taxonomic classification of metagenomic sequencing reads using a protein reference database.
Versions
1.8.2
Commands
kaiju
kaiju-addTaxonNames
kaiju-convertMAR.py
kaiju-convertNR
kaiju-excluded-accessions.txt
kaiju-gbk2faa.pl
kaiju-makedb
kaiju-mergeOutputs
kaiju-mkbwt
kaiju-mkfmi
kaiju-multi
kaiju-taxonlistEuk.tsv
kaiju2krona
kaiju2table
kaijup
kaijux
Module
You can load the modules by:
module load biocontainers
module load kaiju
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Kaiju on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kaiju
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kaiju
kaiju -t kaijudb/nodes.dmp \
-f kaijudb/refseq/kaiju_db_refseq.fmi \
-i input_1.fastq -j input_2.fastq \
-z 24
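When several paired-end samples are classified in one job, the name of the second mate file can be derived from the first rather than typed twice. This is a minimal sketch, assuming the mates follow the _1.fastq / _2.fastq naming used in the example above; the helper name mate_of and the sample name sampleB are hypothetical.

```shell
# mate_of FILE_1.fastq: print the matching FILE_2.fastq name.
mate_of() {
    printf '%s\n' "${1%_1.fastq}_2.fastq"
}

# Dry run: print the -i/-j arguments for each first-mate file name.
for r1 in input_1.fastq sampleB_1.fastq; do
    printf '%s\n' "-i ${r1} -j $(mate_of "${r1}")"
done
```

The loop only prints the arguments; in a real job each pair would be passed to kaiju in place of the fixed file names.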
Kakscalculator2
Introduction
kakscalculator2 is a toolkit for calculating nonsynonymous (Ka) and synonymous (Ks) substitution rates that incorporates gamma-series methods and sliding-window strategies.
Versions
2.0.1
Commands
KaKs_Calculator
Module
You can load the modules by:
module load biocontainers
module load kakscalculator2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kakscalculator2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kakscalculator2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kakscalculator2
KaKs_Calculator -i example.axt -o example.axt.kaks -m YN
Kallisto
Introduction
Kallisto
is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment.
Detailed usage can be found here: https://github.com/pachterlab/kallisto
Versions
0.46.2
0.48.0
Commands
kallisto
Module
You can load the modules by:
module load biocontainers
module load kallisto/0.48.0
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kallisto on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kallisto
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kallisto/0.48.0
kallisto index -i transcripts.idx Homo_sapiens.GRCh38.cds.all.fa.gz
kallisto quant -t 24 -i transcripts.idx -o output -b 100 SRR11614709_1.fastq SRR11614709_2.fastq
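The thread count given to -t is meant to match the -n value in the job header. One way to keep the two in step is to read the count from the environment. This sketch assumes Slurm exports SLURM_NTASKS inside the job, which it does when -n is given; the helper name job_threads is hypothetical.

```shell
# job_threads: print the task count Slurm granted, or 1 outside a job.
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

threads=$(job_threads)
echo "kallisto quant would run with -t ${threads}"
```

In the job script above, -t 24 could then be written as -t "$(job_threads)", so changing -n in the header is enough.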
Kentutils
Introduction
Kentutils: UCSC command line bioinformatic utilities.
Versions
302.1.0
Commands
addCols
ameme
autoDtd
autoSql
autoXml
ave
aveCols
axtChain
axtSort
axtSwap
axtToMaf
axtToPsl
bedClip
bedCommonRegions
bedCoverage
bedExtendRanges
bedGeneParts
bedGraphPack
bedGraphToBigWig
bedIntersect
bedItemOverlapCount
bedPileUps
bedRemoveOverlap
bedRestrictToPositions
bedSort
bedToBigBed
bedToExons
bedToGenePred
bedToPsl
bedWeedOverlapping
bigBedInfo
bigBedNamedItems
bigBedSummary
bigBedToBed
bigWigAverageOverBed
bigWigCat
bigWigCorrelate
bigWigInfo
bigWigMerge
bigWigSummary
bigWigToBedGraph
bigWigToWig
blastToPsl
blastXmlToPsl
calc
catDir
catUncomment
chainAntiRepeat
chainFilter
chainMergeSort
chainNet
chainPreNet
chainSort
chainSplit
chainStitchId
chainSwap
chainToAxt
chainToPsl
checkAgpAndFa
checkCoverageGaps
checkHgFindSpec
checkTableCoords
chopFaLines
chromGraphFromBin
chromGraphToBin
colTransform
countChars
crTreeIndexBed
crTreeSearchBed
dbSnoop
dbTrash
estOrient
faCmp
faCount
faFilter
faFilterN
faFrag
faNoise
faOneRecord
faPolyASizes
faRandomize
faRc
faSize
faSomeRecords
faSplit
faToFastq
faToTab
faToTwoBit
faTrans
fastqToFa
featureBits
fetchChromSizes
findMotif
gapToLift
genePredCheck
genePredHisto
genePredSingleCover
genePredToBed
genePredToFakePsl
genePredToGtf
genePredToMafFrames
gfClient
gfServer
gff3ToGenePred
gff3ToPsl
gmtime
gtfToGenePred
headRest
hgFindSpec
hgGcPercent
hgLoadBed
hgLoadOut
hgLoadWiggle
hgTrackDb
hgWiggle
hgsql
hgsqldump
htmlCheck
hubCheck
ixIxx
lavToAxt
lavToPsl
ldHgGene
liftOver
liftOverMerge
liftUp
linesToRa
linux.x86_64
localtime
mafAddIRows
mafAddQRows
mafCoverage
mafFetch
mafFilter
mafFrag
mafFrags
mafGene
mafMeFirst
mafOrder
mafRanges
mafSpeciesList
mafSpeciesSubset
mafSplit
mafSplitPos
mafToAxt
mafToPsl
mafsInRegion
makeTableList
maskOutFa
mktime
mrnaToGene
netChainSubset
netClass
netFilter
netSplit
netSyntenic
netToAxt
netToBed
newProg
nibFrag
nibSize
oligoMatch
overlapSelect
paraFetch
paraSync
positionalTblCheck
pslCDnaFilter
pslCat
pslCheck
pslDropOverlap
pslFilter
pslHisto
pslLiftSubrangeBlat
pslMap
pslMrnaCover
pslPairs
pslPartition
pslPretty
pslRecalcMatch
pslReps
pslSelect
pslSort
pslStats
pslSwap
pslToBed
pslToChain
pslToPslx
pslxToFa
qaToQac
qacAgpLift
qacToQa
qacToWig
raSqlQuery
raToLines
raToTab
randomLines
rmFaDups
rowsToCols
sizeof
spacedToTab
splitFile
splitFileByColumn
sqlToXml
stringify
subChar
subColumn
tailLines
tdbQuery
textHistogram
tickToDate
toLower
toUpper
trfBig
twoBitDup
twoBitInfo
twoBitMask
twoBitToFa
validateFiles
validateManifest
wigCorrelate
wigEncode
wigToBigWig
wordLine
xmlCat
xmlToSql
Module
You can load the modules by:
module load biocontainers
module load kentutils
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kentutils on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kentutils
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kentutils
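The template above loads the module but does not run any of the listed commands. As a first step, a job might confirm that the commands it needs are visible to the shell. The FAQ notes that which cannot locate biocontainer commands because they wrap singularity exec. Assuming they are exposed as shell functions or aliases, command -v should still see them. The helper name require_cmd is hypothetical.

```shell
# require_cmd NAME...: report any command the current shell cannot resolve.
# `command -v` also recognizes shell functions and aliases, unlike `which`.
require_cmd() {
    missing=0
    for name in "$@"; do
        if ! command -v "$name" >/dev/null 2>&1; then
            echo "not found: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# After `ml biocontainers kentutils`, one might check a few listed commands:
#   require_cmd faSize bedToBigBed liftOver || exit 1
# Demonstration with a command present on any system:
require_cmd sh && echo "sh is available"
```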
Khmer
Introduction
Khmer
is a tool for k-mer counting, filtering, and graph traversal FTW!
Versions
3.0.0a3
Commands
abundance-dist.py
abundance-dist-single.py
annotate-partitions.py
count-median.py
cygdb
cython
cythonize
do-partition.py
extract-long-sequences.py
extract-paired-reads.py
extract-partitions.py
fastq-to-fasta.py
filter-abund.py
filter-abund-single.py
filter-stoptags.py
find-knots.py
interleave-reads.py
load-graph.py
load-into-counting.py
make-initial-stoptags.py
merge-partitions.py
normalize-by-median.py
partition-graph.py
readstats.py
sample-reads-randomly.py
screed
split-paired-reads.py
trim-low-abund.py
unique-kmers.py
Module
You can load the modules by:
module load biocontainers
module load khmer
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Khmer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=khmer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers khmer
Kissde
Introduction
kissDE is an R package, similar to DESeq, that works on pairs of variants and tests whether a variant is enriched in one condition. It was developed to work easily with KisSplice output, but it can also work with a simple table of counts obtained by any other means. It requires at least two replicates per condition and at least two conditions.
Versions
1.15.3
Commands
R
Rscript
kissDE.R
Module
You can load the modules by:
module load biocontainers
module load kissde
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kissde on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kissde
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kissde
Kissplice
Introduction
KisSplice is software for analysing RNA-seq data with or without a reference genome. It is an exact local transcriptome assembler that identifies SNPs, indels and alternative splicing events. It can deal with an arbitrary number of biological conditions, and will quantify each variant in each condition. It has been tested on Illumina datasets of up to 1G reads. Its memory consumption is around 5 GB for 100M reads.
Versions
2.6.2
Commands
kissplice
Module
You can load the modules by:
module load biocontainers
module load kissplice
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kissplice on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kissplice
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kissplice
Kissplice2refgenome
Introduction
KisSplice can also be used when a reference (annotated) genome is available, in order to annotate the variants found and help prioritize cases to validate experimentally. In this case, the results of KisSplice are mapped to the reference genome, using for instance STAR, and the mapping results are analysed using KisSplice2RefGenome.
Versions
2.0.8
Commands
kissplice2refgenome
Module
You can load the modules by:
module load biocontainers
module load kissplice2refgenome
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kissplice2refgenome on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kissplice2refgenome
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kissplice2refgenome
Kma
Introduction
KMA is a mapping method designed to map raw reads directly against redundant databases, in an ultra-fast manner using seed and extend.
Versions
1.4.3
Commands
kma
kma_index
kma_shm
kma_update
Module
You can load the modules by:
module load biocontainers
module load kma
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kma
Kmc
Introduction
Kmc
is a tool for efficient k-mer counting and filtering of reads based on k-mer content.
Versions
3.2.1
Commands
kmc
kmc_dump
kmc_tools
Module
You can load the modules by:
module load biocontainers
module load kmc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Kmc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kmc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kmc
kmc -k27 seq.fastq 27mers .
Kmergenie
Introduction
KmerGenie estimates the best k-mer length for genome de novo assembly.
Versions
1.7051
Commands
kmergenie
Module
You can load the modules by:
module load biocontainers
module load kmergenie
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kmergenie on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kmergenie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kmergenie
Jellyfish
Introduction
Jellyfish
is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence.
Versions
2.3.0
Commands
jellyfish
Module
You can load the modules by:
module load biocontainers
module load kmer-jellyfish
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Jellyfish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=kmer-jellyfish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kmer-jellyfish
jellyfish count -m 16 -s 100M -t 12 \
-o mer_counts -c 7 input.fastq
KneadData
Introduction
KneadData
is a tool designed to perform quality control on metagenomic and metatranscriptomic sequencing data, especially data from microbiome experiments. In these experiments, samples are typically taken from a host in hopes of learning something about the microbial community on the host.
Detailed usage can be found here: https://huttenhower.sph.harvard.edu/kneaddata/
Versions
0.10.0
Commands
kneaddata
kneaddata_bowtie2_discordant_pairs
kneaddata_build_database
kneaddata_database
kneaddata_read_count_table
kneaddata_test
kneaddata_trf_parallel
Module
You can load the modules by:
module load biocontainers
module load kneaddata
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kneaddata on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kneaddata
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kneaddata
kneaddata --input examples/demo.fastq --reference-db examples/demo_db --output kneaddata_demo_output --threads 24 --processes 24
Kover
Introduction
Kover is an out-of-core implementation of rule-based machine learning algorithms that has been tailored for genomic biomarker discovery.
Versions
2.0.6
Commands
kover
Module
You can load the modules by:
module load biocontainers
module load kover
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kover on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kover
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kover
Kraken2
Introduction
Kraken2
is the newest version of Kraken, a taxonomic classification system using exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer.
Detailed usage can be found here: https://ccb.jhu.edu/software/kraken2/
Versions
2.1.2_fixftp
2.1.2
2.1.3
Commands
kraken2
kraken2-build
kraken2-inspect
Module
You can load the modules by:
module load biocontainers
module load kraken2/2.1.2
Download database
Note
There is a known bug in rsync_from_ncbi.pl
(https://github.com/DerrickWood/kraken2/issues/292). When users download and build databases with kraken2-build --download-library
, the command fails with the error rsync_from_ncbi.pl: unexpected FTP path(new server?)
. We modified rsync_from_ncbi.pl
to fix the bug, and created a new module ending with the suffix _fixftp
. Please use this corrected module to download the library.
To download a library, use the commands below:
module load biocontainers
module load kraken2/2.1.2_fixftp
kraken2-build --download-library archaea --db archaea
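Downloading one library does not by itself produce a searchable database. kraken2-build also has a taxonomy download step and a final build step. As a hedged sketch of that sequence, the helper below only prints the commands so they can be reviewed first. The helper name build_steps is hypothetical, and whether every step behaves as expected with the _fixftp module on a given cluster is an assumption to verify.

```shell
# build_steps LIBRARY DB: print the kraken2-build steps in order
# (dry run; nothing is downloaded or built).
build_steps() {
    library=$1
    db=$2
    for step in "--download-taxonomy" "--download-library ${library}" "--build"; do
        echo "kraken2-build ${step} --db ${db}"
    done
}

# Library and database names follow the example above.
build_steps archaea archaea
```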
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kraken2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kraken2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kraken2/2.1.2
kraken2 --threads 24 --report kraken2.report --db minikraken2_v2_8GB_201904_UPDATE --paired --classified-out cseqs#.fq SRR5043021_1.fastq SRR5043021_2.fastq
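With --paired, Kraken2 replaces the # in the --classified-out name with _1 and _2, giving one file per mate. The sketch below does not run Kraken2. It reproduces that naming with sed so the expected output files can be checked after the job; the helper name classified_names is hypothetical.

```shell
# classified_names PATTERN: print the two file names derived from a
# --classified-out pattern that contains '#'.
classified_names() {
    for mate in 1 2; do
        echo "$1" | sed "s/#/_${mate}/"
    done
}

classified_names 'cseqs#.fq'
```

For the job above, this prints cseqs_1.fq and cseqs_2.fq, the files to look for in the working directory.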
KrakenTools
Introduction
KrakenTools
provides individual scripts to analyze Kraken/Kraken2/Bracken/KrakenUniq output files.
Detailed usage can be found here: https://github.com/jenniferlu717/KrakenTools
Versions
1.2
Commands
alpha_diversity.py
beta_diversity.py
combine_kreports.py
combine_mpa.py
extract_kraken_reads.py
filter_bracken.out.py
fix_unmapped.py
kreport2krona.py
kreport2mpa.py
make_kreport.py
make_ktaxonomy.py
Module
You can load the modules by:
module load biocontainers
module load krakentools/1.2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run krakentools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=krakentools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers krakentools/1.2
extract_kraken_reads.py -k myfile.kraken -t 2 -s1 SRR5043021_1.fastq -s2 SRR5043021_2.fastq -o extracted1.fq -o2 extracted2.fq
Lambda
Introduction
Lambda
is a local aligner optimized for many query sequences and searches in protein space.
Versions
2.0.0
Commands
lambda2
Module
You can load the modules by:
module load biocontainers
module load lambda
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lambda on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=lambda
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lambda
lambda2 mkindexp -d uniprot_sprot.fasta
lambda2 searchp \
-q proteins.fasta \
-i uniprot_sprot.fasta.lambda
Last
Introduction
Last
is used to find & align related regions of sequences.
Versions
1268
1356
1411
1418
Commands
last-dotplot
last-map-probs
last-merge-batches
last-pair-probs
last-postmask
last-split
last-split5
last-train
lastal
lastal5
lastdb
lastdb5
Module
You can load the modules by:
module load biocontainers
module load last
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Last on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=last
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers last
lastdb humdb humanMito.fa
lastal humdb fuguMito.fa > myalns.maf
Lastz
Introduction
LASTZ - pairwise DNA sequence aligner
Versions
1.04.15
Commands
lastz
lastz_32
lastz_D
Module
You can load the modules by:
module load biocontainers
module load lastz
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run lastz on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=lastz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lastz
lastz cmc_CFBP8216.fasta cmp_LPPA982.fasta \
--notransition --step=20 --nogapped \
--format=maf > cmc_vs_cmp.maf
Ldhat
Introduction
LDhat is a package written in the C and C++ languages for the analysis of recombination rates from population genetic data.
Versions
2.2a
Commands
convert
pairwise
interval
rhomap
fin
Module
You can load the modules by:
module load biocontainers
module load ldhat
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ldhat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ldhat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ldhat
Ldjump
Introduction
LDJump is an R package to estimate variable recombination rates from population genetic data.
Versions
0.3.1
Commands
R
Rscript
Module
You can load the modules by:
module load biocontainers
module load ldjump
Note
The full path to the Phi file of PhiPack must be provided as pathPhi = "/opt/PhiPack/Phi"
. To use LDhat to quickly calculate some of the summary statistics, set pathLDhat = "/opt/LDhat/"
.
Interactive job
To run interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers ldjump
(base) UserID@bell-a008:~ $ R
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(LDJump)
> LDJump(seqFullPath, alpha = 0.05, segLength = 1000, pathLDhat = "/opt/LDhat/", pathPhi = "/opt/PhiPack/Phi", format = "fasta", refName = NULL,
start = NULL, constant = F, status = T, cores = 1, accept = F, demography = F, out = "")
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ldjump on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ldjump
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ldjump
Rscript script.R
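The batch job above calls script.R without showing its contents. This minimal sketch writes such a script from the shell, reusing the LDJump call from the interactive session. The input path /path/to/sequences.fasta is a hypothetical placeholder, and the argument list is copied from that session rather than checked against the LDJump manual.

```shell
# Write script.R with a quoted heredoc so the shell leaves the R code untouched.
cat > script.R <<'EOF'
library(LDJump)
# Hypothetical input; replace with the full path to your own FASTA file.
seqFullPath = "/path/to/sequences.fasta"
LDJump(seqFullPath, alpha = 0.05, segLength = 1000,
       pathLDhat = "/opt/LDhat/", pathPhi = "/opt/PhiPack/Phi",
       format = "fasta", refName = NULL, start = NULL,
       constant = F, status = T, cores = 1, accept = F,
       demography = F, out = "")
EOF

echo "wrote script.R"
```

The two container paths are the ones given in the note for this module; only the input file needs to change.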
Ldsc
Introduction
ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics.
Versions
1.0.1
Commands
ldsc.py
munge_sumstats.py
Module
You can load the modules by:
module load biocontainers
module load ldsc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ldsc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ldsc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ldsc
Liftoff
Introduction
Liftoff
is an accurate GFF3/GTF lift over pipeline.
Versions
1.6.3
Commands
liftoff
python
python3
Module
You can load the modules by:
module load biocontainers
module load liftoff
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Liftoff on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=liftoff
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers liftoff
liftoff -g reference.gff3 -o target.gff3 \
-chroms chr_pairs.txt target.fasta reference.fa
Liftofftools
Introduction
LiftoffTools is a toolkit to compare genes lifted between genome assemblies. Specifically it is designed to compare genes lifted over using Liftoff although it is also compatible with other lift-over tools such as UCSC liftOver as long as the feature IDs are the same. LiftoffTools provides 3 different modules. The first identifies variants in protein-coding genes and their effects on the gene. The second compares the gene synteny, and the third clusters genes into groups of paralogs to evaluate gene copy number gain and loss. The input for all modules is the reference genome assembly (FASTA), target genome assembly (FASTA), reference annotation (GFF/GTF), and target annotation (GFF/GTF).
Versions
0.4.4
Commands
liftofftools
Module
You can load the modules by:
module load biocontainers
module load liftofftools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run liftofftools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=liftofftools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers liftofftools
Lima
Introduction
Lima
is the standard tool to identify barcode and primer sequences in PacBio single-molecule sequencing data.
Versions
2.2.0
Commands
lima
Module
You can load the modules by:
module load biocontainers
module load lima
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lima on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=lima
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lima
lima --version
lima --isoseq --dump-clips \
--peek-guess -j 12 \
alz.ccs.bam primers.fasta \
alz.demult.bam
Links
Introduction
LINKS is a genomics application for scaffolding genome assemblies with long reads, such as those produced by Oxford Nanopore Technologies Ltd. It can be used to scaffold high-quality draft genome assemblies with any long sequences (e.g. ONT reads, PacBio reads, other draft genomes, etc.). It is also used to scaffold contig pairs linked by ARCS/ARKS.
Versions
2.0.1
Commands
LINKS
Module
You can load the modules by:
module load biocontainers
module load links
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run links on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=links
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers links
Lofreq
Introduction
Lofreq
is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data.
Versions
2.1.5
Commands
lofreq
Module
You can load the modules by:
module load biocontainers
module load lofreq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lofreq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=lofreq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lofreq
lofreq call -f ref.fa -o vars.vcf out_sorted.bam
lofreq call-parallel --pp-threads 8 \
-f ref.fa -o vars_parallel.vcf out_sorted.bam
Longphase
Introduction
LongPhase is an ultra-fast program for simultaneously co-phasing SNPs and SVs by using Nanopore and PacBio long reads. It is capable of producing nearly chromosome-scale haplotype blocks by using Nanopore ultra-long reads without the need for additional trios, chromosome conformation, and strand-seq data. On an 8-core machine, LongPhase can finish phasing a human genome in 10-20 minutes.
Versions
1.4
Commands
longphase
Module
You can load the modules by:
module load biocontainers
module load longphase
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run longphase on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=longphase
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers longphase
longphase phase \
-s SNP.vcf \
--sv-file SV.vcf \
-b alignment.bam \
-r reference.fasta \
-t 8 \
-o phased_prefix \
--ont # or --pb for PacBio HiFi
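The platform option in the last line of the command is the only part that differs between Nanopore and PacBio runs. This sketch selects it from a variable so one script can serve both; the helper name platform_flag and the short platform labels ont and pb are hypothetical, while the two flags are the ones named in the job script above.

```shell
# platform_flag NAME: map a platform label to the longphase option
# used in the job script (--ont or --pb).
platform_flag() {
    case "$1" in
        ont) printf '%s\n' "--ont" ;;
        pb)  printf '%s\n' "--pb" ;;
        *)   echo "unknown platform: $1" >&2; return 1 ;;
    esac
}

platform=ont
echo "longphase phase ... $(platform_flag "$platform")"
```

An unknown label makes the helper return a non-zero status, so a typo can stop the job before longphase starts.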
Longqc
Introduction
LongQC is a tool for quality control of PacBio and ONT long reads.
Versions
1.2.0c
Commands
longQC.py
Module
You can load the modules by:
module load biocontainers
module load longqc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run longqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=longqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers longqc
longQC.py sampleqc -x pb-rs2 -o out_dir seq.fastq
Lra
Introduction
Lra
is a sequence alignment program that aligns long reads from single-molecule sequencing (SMS) instruments, or megabase-scale contigs from SMS assemblies.
Versions
1.3.2
Commands
lra
Module
You can load the modules by:
module load biocontainers
module load lra
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lra on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=lra
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lra
lra index genome.fasta
lra align genome.fasta input.fastq -t 12 -p s > output.sam
Ltr_finder
Introduction
LTR_Finder is an efficient program for finding full-length LTR retrotransposons in genome sequences.
Versions
1.07
Commands
ltr_finder
check_result.pl
down_tRNA.pl
filter_rt.pl
genome_plot.pl
genome_plot2.pl
genome_plot_svg.pl
Module
You can load the modules by:
module load biocontainers
module load ltr_finder
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ltr_finder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ltr_finder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ltr_finder
ltr_finder 3ds_72.fa -P 3ds_72 -w2 > test/3ds_72_result.txt \
| genome_plot.pl test/
Ltrpred
Introduction
LTRpred(ict): de novo annotation of young and intact retrotransposons.
Versions
1.1.0
Commands
R
Rscript
Module
You can load the modules by:
module load biocontainers
module load ltrpred
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ltrpred on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ltrpred
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ltrpred
Lumpy-sv
Introduction
Lumpy-sv
is a general probabilistic framework for structural variant discovery.
Versions
0.3.1
Commands
lumpy
lumpyexpress
Module
You can load the modules by:
module load biocontainers
module load lumpy-sv
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lumpy-sv on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=lumpy-sv
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lumpy-sv
lumpy -mw 4 -tt 0.0 -pe \
bam_file:AL87.discordant.sort.bam,histo_file:AL87.histo,mean:429,stdev:84,read_length:83,min_non_overlap:83,discordant_z:4,back_distance:1,weight:1,id:1,min_mapping_threshold:20 \
-sr bam_file:AL87.sr.sort.bam,back_distance:1,weight:1,id:2,min_mapping_threshold:20
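The `-pe` argument above is a single comma-separated list of `key:value` pairs, which is easy to mistype. A minimal sketch that assembles it from shell arguments is shown below. The function name `lumpy_pe_opts` is hypothetical, the fixed values (`discordant_z:4`, `back_distance:1`, `weight:1`, `min_mapping_threshold:20`) are copied from the example, and setting `min_non_overlap` equal to the read length is an assumption that follows the example.

```shell
#!/bin/bash
# lumpy_pe_opts (hypothetical helper): build the comma-separated option
# string for "lumpy -pe" from a few per-sample values.
# Arguments: BAM file, histogram file, mean, stdev, read length, sample id.
lumpy_pe_opts() {
    local bam="$1" histo="$2" mean="$3" stdev="$4" read_len="$5" id="$6"
    local opts=(
        "bam_file:${bam}"
        "histo_file:${histo}"
        "mean:${mean}"
        "stdev:${stdev}"
        "read_length:${read_len}"
        "min_non_overlap:${read_len}"   # same as read length, as in the example
        "discordant_z:4"
        "back_distance:1"
        "weight:1"
        "id:${id}"
        "min_mapping_threshold:20"
    )
    # Join the array elements with commas.
    local IFS=,
    echo "${opts[*]}"
}
```

Called as `lumpy_pe_opts AL87.discordant.sort.bam AL87.histo 429 84 83 1`, it reproduces the `-pe` string used in the job script, so it could be passed as `-pe "$(lumpy_pe_opts ...)"`.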
Lyveset
Introduction
Lyveset is a method of using hqSNPs to create a phylogeny, especially for outbreak investigations.
Versions
2.0.1
Commands
applyFstToTree.pl
cladeDistancesFromTree.pl
clusterPairwise.pl
convertAlignment.pl
downloadDataset.pl
errorProneRegions.pl
filterMatrix.pl
filterVcf.pl
genomeDist.pl
launch_bwa.pl
launch_set.pl
launch_smalt.pl
launch_snap.pl
launch_snpeff.pl
launch_varscan.pl
makeRegions.pl
matrixToAlignment.pl
pairwiseDistances.pl
pairwiseTo2d.pl
removeUninformativeSites.pl
removeUninformativeSitesFromMatrix.pl
run_assembly_isFastqPE.pl
run_assembly_metrics.pl
run_assembly_readMetrics.pl
run_assembly_removeDuplicateReads.pl
run_assembly_shuffleReads.pl
run_assembly_trimClean.pl
set_bayesHammer.pl
set_diagnose.pl
set_diagnose_msa.pl
set_downloadTestData.pl
set_findCliffs.pl
set_findPhages.pl
set_indexCase.pl
set_manage.pl
set_processPooledVcf.pl
set_samtools_depth.pl
set_test.pl
shuffleSplitReads.pl
snpDistribution.pl
vcfToAlignment.pl
vcfutils.pl
Module
You can load the modules by:
module load biocontainers
module load lyveset
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run lyveset on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=lyveset
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lyveset
set_test.pl lambda
set_manage.pl --create setTest
Macrel
Introduction
Macrel is a pipeline to mine antimicrobial peptides (AMPs) from (meta)genomes.
Versions
1.2.0
Commands
macrel
Module
You can load the modules by:
module load biocontainers
module load macrel
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run macrel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=macrel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers macrel
MACS2
Introduction
MACS2
is Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites.
Versions
2.2.7.1
Commands
macs2
Module
You can load the modules by:
module load biocontainers
module load macs2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MACS2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=macs2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers macs2
macs2 callpeak -t ChIP.bam -c Control.bam -f BAM -g hs -n test -B -q 0.01
Macs3
Introduction
MACS3
is Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites.
Versions
3.0.0a6
Commands
macs3
Module
You can load the modules by:
module load biocontainers
module load macs3
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Macs3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=macs3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers macs3
macs3 callpeak -t ChIP.bam -c Control.bam -f BAM -g hs -n test -B -q 0.01
MAFFT
Introduction
MAFFT
is a multiple alignment program for amino acid or nucleotide sequences.
Versions
7.475
7.490
Commands
einsi
fftns
fftnsi
ginsi
linsi
mafft
mafft-distance
mafft-einsi
mafft-fftns
mafft-fftnsi
mafft-ginsi
mafft-homologs.rb
mafft-linsi
mafft-nwns
mafft-nwnsi
mafft-profile
mafft-qinsi
mafft-sparsecore.rb
mafft-xinsi
nwns
nwnsi
Module
You can load the modules by:
module load biocontainers
module load mafft
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MAFFT on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mafft
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mafft
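The job script above loads the module but does not yet run an alignment. A minimal sketch of an alignment step follows. `--auto` and `--thread` are standard MAFFT options; the function name `align_auto` and the file names are hypothetical placeholders.

```shell
#!/bin/bash
# align_auto (hypothetical helper): align a FASTA file with MAFFT, letting
# --auto pick the alignment strategy.
# Arguments: input FASTA, output alignment file.
align_auto() {
    local input="$1" output="$2"
    # Match MAFFT's thread count to the Slurm task count (-n); 1 outside a job.
    local threads="${SLURM_NTASKS:-1}"
    mafft --auto --thread "$threads" "$input" > "$output"
}
```

In the job script, this could be called after the `ml biocontainers mafft` line, for example as `align_auto input.fasta aligned.fasta`.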
Mageck
Introduction
Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout (MAGeCK) is a computational tool to identify important genes from the recent genome-scale CRISPR-Cas9 knockout screens (or GeCKO) technology.
Versions
0.5.9.5
Commands
mageck
mageckGSEA
RRA
Module
You can load the modules by:
module load biocontainers
module load mageck
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mageck on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mageck
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mageck
mageck count -l library.txt -n demo \
--sample-label L1,CTRL \
--fastq test1.fastq test2.fastq
mageck test -k demo.count.txt \
-t L1 -c CTRL -n demo
Magicblast
Introduction
Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Each alignment optimizes a composite score, taking into account simultaneously the two reads of a pair, and in case of RNA-seq, locating the candidate introns and adding up the score of all exons. This is very different from other versions of BLAST, where each exon is scored as a separate hit and read-pairing is ignored.
Versions
1.5.0
Commands
magicblast
Module
You can load the modules by:
module load biocontainers
module load magicblast
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run magicblast on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=magicblast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers magicblast
MAKER
Introduction
MAKER
is a popular genome annotation pipeline for both prokaryotic and eukaryotic genomes. This guide describes best practices for running MAKER on RCAC clusters. For detailed information about MAKER, see its official website (http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018).
Versions
2.31.11
3.01.03
Commands
cegma2zff
chado2gff3
compare
cufflinks2gff3
evaluator
fasta_merge
fasta_tool
genemark_gtf2gff3
gff3_merge
iprscan2gff3
iprscan_wrap
ipr_update_gff
maker
maker2chado
maker2eval_gtf
maker2jbrowse
maker2wap
maker2zff
maker_functional
maker_functional_fasta
maker_functional_gff
maker_map_ids
map2assembly
map_data_ids
map_fasta_ids
map_gff_ids
tophat2gff3
Module
You can load the modules by:
module load biocontainers
module load maker/2.31.11 # OR maker/3.01.03
Note
Dfam release 3.5 (October 2021), downloaded from the Dfam website (https://www.dfam.org/home) and required by RepeatMasker, has been set up for users. The RepeatMasker library is stored in /depot/itap/datasets/Maker/RepeatMasker/Libraries.
Prerequisites
After loading a MAKER module, users can create the MAKER control files with the following command:
maker -CTL
This will generate three files:
maker_opts.ctl (required to be modified)
maker_exe.ctl (do not need to modify this file)
maker_bopts.ctl (optionally modify this file)
maker_opts.ctl:
- If not using RepeatMasker, modify model_org=all to model_org=
- If using RepeatMasker, modify model_org=all to an appropriate family/genus/species.
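The `model_org` edit described above can also be scripted instead of done in a text editor. The following is a minimal sketch under stated assumptions: the function name `set_model_org` is hypothetical, the value is a single word without `/` or `&`, and the control file has the `model_org=` entry at the start of a line, optionally followed by a `#` comment.

```shell
#!/bin/bash
# set_model_org (hypothetical helper): set the model_org value in a MAKER
# control file. Pass an empty string to leave "model_org=" without a value.
# Assumes a single-word value containing no "/" or "&" characters.
set_model_org() {
    local ctl="$1" value="$2"
    # Replace only the value and keep any trailing "#comment" on the line.
    sed "s/^model_org=[^ #]*/model_org=${value}/" "$ctl" > "${ctl}.tmp" \
        && mv "${ctl}.tmp" "$ctl"
}
```

For example, `set_model_org maker_opts.ctl ""` clears the value, and `set_model_org maker_opts.ctl drosophila` sets it to a (hypothetical) organism name.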
Example job non-mpi
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MAKER on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=MAKER
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers maker/2.31.11 # or maker/3.01.03
maker -c 24
Example job mpi
To use MAKER in MPI mode, we cannot use the maker modules. Instead, we have to use the Singularity image files stored in /apps/biocontainers/images:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 5:00:00
#SBATCH -N 2
#SBATCH -n 24
#SBATCH -c 8
#SBATCH --job-name=MAKER_mpi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --mail-user=UserID@purdue.edu
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
## MAKER2
mpirun -n 24 singularity exec /apps/biocontainers/images/maker_2.31.11.sif maker -c 8
## MAKER3
mpirun -n 24 singularity exec /apps/biocontainers/images/maker_3.01.03.sif maker -c 8
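The MPI example above requests 24 tasks (`-n 24`) with 8 CPUs each (`-c 8`) on 2 nodes (`-N 2`), which is 192 CPUs in total. A minimal sketch for checking how many CPUs that implies per node is given below. It assumes Slurm spreads the tasks evenly across nodes, and the function name `cpus_per_node` is hypothetical; compare the result with the core count of the nodes on your cluster before submitting.

```shell
#!/bin/bash
# cpus_per_node (hypothetical helper): CPUs needed on each node, assuming
# Slurm spreads the tasks evenly across the nodes.
# Arguments: tasks (-n), CPUs per task (-c), nodes (-N).
cpus_per_node() {
    local tasks="$1" cpus_per_task="$2" nodes="$3"
    # Integer ceiling of (tasks * cpus_per_task) / nodes.
    echo $(( (tasks * cpus_per_task + nodes - 1) / nodes ))
}

cpus_per_node 24 8 2   # prints 96
```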
Manta
Introduction
Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads.
Versions
1.6.0
Commands
configManta.py
python
Module
You can load the modules by:
module load biocontainers
module load manta
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run manta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=manta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers manta
configManta.py --normalBam=HCC1954.NORMAL.30x.compare.COST16011_region.bam \
--tumorBam=G15512.HCC1954.1.COST16011_region.bam \
--referenceFasta=Homo_sapiens_assembly19.COST16011_region.fa \
--region=8:107652000-107655000 \
--region=11:94974000-94989000 \
--exome --runDir="MantaDemoAnalysis"
python MantaDemoAnalysis/runWorkflow.py
Mapcaller
Introduction
Mapcaller
is an efficient and versatile approach for short-read mapping and variant identification using high-throughput sequenced data.
Versions
0.9.9.41
Commands
MapCaller
Module
You can load the modules by:
module load biocontainers
module load mapcaller
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mapcaller on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=mapcaller
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mapcaller
MapCaller index ref.fasta ref
MapCaller -t 12 -i ref -f input_1.fastq -f2 input_2.fastq -vcf out.vcf
Mapdamage2
Introduction
mapDamage2 is a computational framework written in Python and R, which tracks and quantifies DNA damage patterns among ancient DNA sequencing reads generated by Next-Generation Sequencing platforms.
Versions
2.2.1
Commands
mapDamage
Module
You can load the modules by:
module load biocontainers
module load mapdamage2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mapdamage2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mapdamage2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mapdamage2
Marginpolish
Introduction
MarginPolish is a graph-based assembly polisher. It iteratively finds multiple probable alignment paths for run-length-encoded reads and uses these to generate a refined sequence. It takes as input a FASTA assembly and an indexed BAM (ONT reads aligned to the assembly), and it produces a polished FASTA assembly.
Versions
0.1.3
Commands
marginpolish
Module
You can load the modules by:
module load biocontainers
module load marginpolish
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run marginpolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 32
#SBATCH --job-name=marginpolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers marginpolish
marginpolish \
Reads_to_assembly_StaphAur.bam \
Draft_assembly_StaphAur.fasta \
helen_modles/MP_r941_guppy344_microbial.json \
-t 32 \
-o mp_output/mp_images \
-f
Mash
Introduction
Mash
is a fast sequence distance estimator that uses MinHash.
Versions
2.3
Commands
mash
Module
You can load the modules by:
module load biocontainers
module load mash
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mash on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mash
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mash
mash dist genome1.fasta genome2.fasta
Mashmap
Introduction
Mashmap
is a fast approximate aligner for long DNA sequences.
Versions
2.0-pl5321
Commands
mashmap
Module
You can load the modules by:
module load biocontainers
module load mashmap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mashmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=mashmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mashmap
mashmap -r ref.fasta -t 12 -q input.fasta
Mashtree
Introduction
Mashtree
is a tool to create a tree using Mash distances.
Versions
1.2.0
Commands
mashtree
mashtree_bootstrap.pl
mashtree_cluster.pl
mashtree_init.pl
mashtree_jackknife.pl
mashtree_wrapper_deprecated.pl
Module
You can load the modules by:
module load biocontainers
module load mashtree
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mashtree on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mashtree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mashtree
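The job script above stops after loading the module. A minimal sketch of a tree-building step follows. `mashtree` writes the tree in Newick format to standard output and accepts `--numcpus`; the function name `build_mash_tree` and the file names are hypothetical placeholders.

```shell
#!/bin/bash
# build_mash_tree (hypothetical helper): build a tree from the given
# sequence files and save it to a file.
# Arguments: output tree file, then one or more input sequence files.
build_mash_tree() {
    local tree="$1"
    shift
    # Use the Slurm task count (-n) as the CPU count; 1 outside a job.
    mashtree --numcpus "${SLURM_NTASKS:-1}" "$@" > "$tree"
}
```

In the job script, this could be called as `build_mash_tree mashtree.dnd *.fasta` after the `ml biocontainers mashtree` line.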
Masurca
Introduction
The MaSuRCA (Maryland Super Read Cabog Assembler) genome assembly and analysis toolkit consists of the MaSuRCA genome assembler, the QuORUM error corrector for Illumina data, the POLCA genome polishing software, a chromosome scaffolder, the jellyfish mer counter, and the MUMmer aligner.
Versions
4.0.9
4.1.0
Commands
masurca
build_human_reference.sh
chromosome_scaffolder.sh
close_gaps.sh
close_scaffold_gaps.sh
correct_with_k_unitigs.sh
deduplicate_contigs.sh
deduplicate_unitigs.sh
eugene.sh
extract_chrM.sh
filter_library.sh
final_polish.sh
fix_unitigs.sh
fragScaff.sh
mega_reads_assemble_cluster.sh
mega_reads_assemble_cluster2.sh
mega_reads_assemble_polish.sh
mega_reads_assemble_ref.sh
parallel_delta-filter.sh
polca.sh
polish_with_illumina_assembly.sh
recompute_astat_superreads.sh
recompute_astat_superreads_CA8.sh
reconcile_alignments.sh
refine.sh
resolve_trio.sh
run_ECR.sh
samba.sh
splitScaffoldsAtNs.sh
Module
You can load the modules by:
module load biocontainers
module load masurca
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run masurca on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=masurca
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers masurca
Mauve
Introduction
Mauve
is a system for constructing multiple genome alignments in the presence of large-scale evolutionary events such as rearrangement and inversion.
Versions
2.4.0
Commands
mauveAligner
progressiveMauve
Module
You can load the modules by:
module load biocontainers
module load mauve
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mauve on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mauve
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mauve
mauveAligner seqs.fasta --output=mauveAligner_output
progressiveMauve --output=threeway.xmfa \
--output-guide-tree=threeway.tree \
--backbone-output=threeway.backbone genome1.gbk genome2.gbk genome3.gbk
Maxbin2
Introduction
Maxbin2 is software for binning assembled metagenomic sequences based on an Expectation-Maximization algorithm.
Versions
2.2.7
Commands
run_MaxBin.pl
run_FragGeneScan.pl
Module
You can load the modules by:
module load biocontainers
module load maxbin2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run maxbin2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=maxbin2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers maxbin2
run_MaxBin.pl -contig subset_assembly.fa \
-abund_list abundance.list -max_iteration 5 -out mbin
Maxquant
Introduction
Maxquant
is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. It is specifically aimed at high-resolution MS data.
Versions
2.1.0.0
2.1.3.0
2.1.4.0
2.3.1.0
Commands
MaxQuantGui.exe
MaxQuantCmd.exe
Module
You can load the modules by:
module load biocontainers
module load maxquant
GUI
To run Maxquant with GUI, it is recommended to run within ThinLinc:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers maxquant
(base) UserID@bell-a008:~ $ MaxQuantGui.exe

CMD job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Maxquant without GUI on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=maxquant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers maxquant
MaxQuantCmd.exe mqpar.xml
Mcl
Introduction
Mcl
is short for the Markov Cluster Algorithm, a fast and scalable unsupervised cluster algorithm for graphs.
Versions
14.137-pl5262
Commands
clm
clmformat
clxdo
mcl
mclblastline
mclcm
mclpipeline
mcx
mcxarray
mcxassemble
mcxdeblast
mcxdump
mcxi
mcxload
mcxmap
mcxrand
mcxsubs
Module
You can load the modules by:
module load biocontainers
module load mcl
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mcl on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mcl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mcl
Mcscanx

Introduction
The MCScanX package has two major components: a modified version of the MCScan algorithm that lets users run MCScan more conveniently and view multiple alignments of syntenic blocks more clearly, and a variety of downstream analysis tools for conducting different biological analyses based on the synteny data generated by the modified MCScan algorithm.
Versions
default
Commands
MCScanX
MCScanX_h
duplicate_gene_classifier
add_ka_and_ks_to_collinearity
add_kaks_to_synteny
detect_collinearity_within_gene_families
detect_synteny_within_gene_families
group_collinear_genes
group_syntenic_genes
origin_enrichment_analysis
Module
You can load the modules by:
module load biocontainers
module load mcscanx
Helper command
Note
To conduct downstream analyses, users need to copy the folder downstream_analyses
from the container into the host system.
A helper command copy_downstream_analyses
is provided to simplify the task. Follow the procedure below to copy downstream_analyses into the target directory:
$ copy_downstream_analyses $PWD # this will copy the downstream_analyses into the current directory.
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mcscanx on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mcscanx
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mcscanx
## Run MCScanX
MCScanX Result/merge
## Copy downstream_analyses
copy_downstream_analyses $PWD
## Downstream analyses (run from inside the copied folder, so that ../Result resolves)
cd downstream_analyses
java circle_plotter -g ../Result/merge.gff -s ../Result/merge.collinearity -c ../Result/merge_circ.ctl -o ../Result/merge_circle.png
java dot_plotter -g ../Result/merge.gff -s ../Result/merge.collinearity -c ../Result/merge_dot.ctl -o ../Result/merge_dot.png
java dual_synteny_plotter -g ../Result/merge.gff -s ../Result/merge.collinearity -c ../Result/merge_dot.ctl -o ../Result/merge_dual_synteny.png
Medaka
Introduction
Medaka
is a tool to create consensus sequences and variant calls from nanopore sequencing data.
Versions
1.6.0
Commands
medaka
medaka_consensus
medaka_counts
medaka_data_path
medaka_haploid_variant
medaka_version_report
Module
You can load the modules by:
module load biocontainers
module load medaka
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Medaka on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=medaka
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers medaka
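The job script above loads the module but does not run a command. A minimal sketch of a consensus step with `medaka_consensus` follows. The `-i`, `-d`, `-o`, and `-t` options are standard for this command; the function name `polish_with_medaka` and all file and directory names are hypothetical placeholders. No `-m` model is given here, so medaka's default model would be used.

```shell
#!/bin/bash
# polish_with_medaka (hypothetical helper): polish a draft assembly with
# medaka_consensus.
# Arguments: basecalled reads, draft assembly, output directory.
polish_with_medaka() {
    local reads="$1" draft="$2" outdir="$3"
    # Use the Slurm task count (-n) as the thread count; 1 outside a job.
    local threads="${SLURM_NTASKS:-1}"
    medaka_consensus -i "$reads" -d "$draft" -o "$outdir" -t "$threads"
}
```

In the job script, this could be called as `polish_with_medaka basecalls.fastq draft.fasta medaka_out` after the `ml biocontainers medaka` line.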
Megadepth
Introduction
Megadepth
is an efficient tool for extracting coverage related information from RNA and DNA-seq BAM and BigWig files.
Versions
1.2.0
Commands
megadepth
Module
You can load the modules by:
module load biocontainers
module load megadepth
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Megadepth on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=megadepth
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers megadepth
megadepth sorted.bam
Megahit
Introduction
Megahit
is an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph.
Versions
1.2.9
Commands
megahit
Module
You can load the modules by:
module load biocontainers
module load megahit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Megahit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=megahit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers megahit
megahit --12 SRR1976948.abundtrim.subset.pe.fq.gz,SRR1977249.abundtrim.subset.pe.fq.gz -o combined
Megan
Introduction
Megan
is a computer program that allows optimized analysis of large metagenomic datasets. Metagenomics is the analysis of the genomic sequences from a usually uncultured environmental sample.
Versions
6.21.7
Commands
MEGAN
blast2lca
blast2rma
daa2info
daa2rma
daa-meganizer
gc-assembler
rma2info
sam2rma
references-annotator
Module
You can load the modules by:
module load biocontainers
module load megan
GUI
To run MEGAN with GUI, it is recommended to run within ThinLinc:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers megan
(base) UserID@bell-a008:~ $ MEGAN

Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Megan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=megan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers megan
Meme
Introduction
Meme
is a collection of tools for the discovery and analysis of sequence motifs.
Versions
5.3.3
5.4.1
5.5.0
Commands
ame
centrimo
dreme
dust
fimo
glam2
glam2scan
gomo
mast
mcast
meme
meme-chip
momo
purge
spamo
tomtom
Module
You can load the modules by:
module load biocontainers
module load meme
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Meme on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=meme
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers meme
meme seq.fasta -dna -mod oops -pal
meme-chip Klf1.fna -o memechip_klf1_out
Memes
Introduction
memes is an R interface to the MEME Suite family of tools, which provides several utilities for performing motif analysis on DNA, RNA, and protein sequences. memes works by detecting a local install of the MEME suite, running the commands, then importing the results directly into R.
Versions
1.1.2
Commands
R
Module
You can load the modules by:
module load biocontainers
module load memes
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run memes on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=memes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers memes
Meraculous
Introduction
Meraculous is a whole genome assembler for Next Generation Sequencing data, geared for large genomes. Its hybrid k-mer/read-based approach capitalizes on the high accuracy of Illumina sequence by eschewing an explicit error correction step, which its authors argue is redundant with the assembly process. Meraculous achieves high performance with large datasets by utilizing lightweight data structures and multi-threaded parallelization, allowing it to assemble human-sized genomes on a high-cpu cluster in under a day. The process pipeline implements a highly transparent and portable model of job control and monitoring where different assembly stages can be executed and re-executed separately or in unison on a wide variety of architectures.
Versions
2.2.6
Commands
run_meraculous.sh
blastMapAnalyzer2.pl
bmaToLinks.pl
_bubbleFinder2.pl
bubblePopper.pl
bubbleScout.pl
contigBias.pl
divide_it.pl
fasta_splitter.pl
findDMin2.pl
gapDivider.pl
gapPlacer.pl
haplotyper.Naive.pl
haplotyper.pl
histogram2.pl
kmerHistAnalyzer.pl
loadBalanceMers.pl
meraculous4h.pl
meraculous.pl
N50.pl
_oNo4.pl
oNo7.pl
optimize2.pl
randomList2.pl
scaffold2contig.pl
scaffReportToFasta.pl
screen_list2.pl
spanner.pl
splinter.pl
splinter_scaffolds.pl
split_and_validate_reads.pl
test_dependencies.pl
unique.pl
Module
You can load the modules by:
module load biocontainers
module load meraculous
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run meraculous on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=meraculous
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers meraculous
Merqury
Introduction
Merqury is a tool to evaluate genome assemblies with k-mers and more.
Versions
1.3
Commands
merqury.sh
Module
You can load the modules by:
module load biocontainers
module load merqury
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run merqury on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=merqury
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers merqury
merqury.sh F1.k18.meryl col0.hapmer.meryl cvi0.hapmer.meryl \
athal_COL.fasta athal_CVI.fasta test
Meryl
Introduction
Meryl
is a genomic k-mer counter (and sequence utility) with nice features.
Versions
1.3
Commands
meryl
meryl-analyze
meryl-import
meryl-lookup
meryl-simple
Module
You can load the modules by:
module load biocontainers
module load meryl
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Meryl on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=meryl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers meryl
meryl count k=42 data/ec.fna.gz output ec.meryl
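Merqury, described in the previous section, takes meryl databases such as F1.k18.meryl as input. The following is a minimal sketch of building one database and passing it to merqury.sh. It assumes both modules are loaded (for example with `ml biocontainers meryl merqury`), and the file names reads.fastq.gz, assembly.fasta and merqury_out are placeholders. The value k=18 only mirrors the database name in the Merqury example; a suitable k depends on the genome size.

```shell
K=18
READS=reads.fastq.gz        # placeholder read set
ASM=assembly.fasta          # placeholder assembly to evaluate
DB="reads.k${K}.meryl"      # meryl writes its database as a directory

# Count k-mers only when meryl is on PATH and the reads exist.
if command -v meryl >/dev/null 2>&1 && [ -f "$READS" ]; then
    meryl count k="$K" "$READS" output "$DB"
fi

# Evaluate the assembly against the read k-mers.
if command -v merqury.sh >/dev/null 2>&1 && [ -d "$DB" ] && [ -f "$ASM" ]; then
    merqury.sh "$DB" "$ASM" merqury_out
fi
```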
Metabat
Introduction
Metabat
is a robust statistical framework for reconstructing genomes from metagenomic data.
Versions
2.15-5
Commands
aggregateBinDepths.pl
aggregateContigOverlapsByBin.pl
contigOverlaps
jgi_summarize_bam_contig_depths
merge_depths.pl
metabat
metabat1
metabat2
runMetaBat.sh
Module
You can load the modules by:
module load biocontainers
module load metabat
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Metabat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=metabat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metabat
metabat2 -m 10000 \
-t 24 \
-i contig.fasta \
-o metabat2_output \
-a depth.txt
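In the job above, `-t 24` has to be kept in step with `#SBATCH -n 24` by hand. The following is a minimal sketch that reads the task count from the `SLURM_NTASKS` environment variable, which Slurm sets when `-n` is given. `job_threads` is a hypothetical helper name, and the fallback of 1 outside a Slurm job is an assumption made for this sketch.

```shell
# job_threads is a hypothetical helper: it prints the task count Slurm
# granted to the job (SLURM_NTASKS), or 1 when run outside a Slurm job.
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

THREADS=$(job_threads)
echo "metabat2 would be started with -t $THREADS"

# Run only when the metabat module is loaded and the inputs from the
# example above are present.
if command -v metabat2 >/dev/null 2>&1 && [ -f contig.fasta ] && [ -f depth.txt ]; then
    metabat2 -m 10000 -t "$THREADS" -i contig.fasta -o metabat2_output -a depth.txt
fi
```

The same pattern applies to any command in this guide that takes a thread count.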
Metachip
Introduction
Metachip is a pipeline for horizontal gene transfer (HGT) identification.
Versions
1.10.12
Commands
MetaCHIP
Module
You can load the modules by:
module load biocontainers
module load metachip
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run metachip on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=metachip
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metachip
MetaPhlAn 3
Introduction
MetaPhlAn
(Metagenomic Phylogenetic Analysis) is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:
- up to 25,000 reads-per-second (on one CPU) analysis speed (orders of magnitude faster compared to existing methods);
- unambiguous taxonomic assignments as the MetaPhlAn markers are clade-specific;
- accurate estimation of organismal relative abundance (in terms of number of cells rather than fraction of reads);
- species-level resolution for bacteria, archaea, eukaryotes and viruses;
- extensive validation of the profiling accuracy on several synthetic datasets and on thousands of real metagenomes.
Versions
3.0.14
3.0.9
4.0.2
Commands
metaphlan
Database
The latest version of the database (mpa_v30) has been downloaded and built in /depot/itap/datasets/metaphlan/.
Module
You can load the modules by:
module load biocontainers
module load metaphlan/3.0.14
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MetaPhlAn on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=MetaPhlAn
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metaphlan/3.0.14
DATABASE=/depot/itap/datasets/metaphlan/
metaphlan SRR11234553_1.fastq,SRR11234553_2.fastq \
--input_type fastq --nproc 24 \
-o profiled_metagenome.txt \
--bowtie2db "$DATABASE" \
--bowtie2out metagenome.bowtie2.bz2
Metaseq
Introduction
Metaseq is a Python package for integrative genome-wide analysis, originally used to reveal relationships between chromatin insulators and associated nuclear mRNA.
Versions
0.5.6
Commands
python
python2
Module
You can load the modules by:
module load biocontainers
module load metaseq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run metaseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=metaseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metaseq
Methyldackel
Introduction
MethylDackel (formerly named PileOMeth, a temporary name derived from its use of a PILEup to extract METHylation metrics) processes a coordinate-sorted and indexed BAM or CRAM file containing some form of BS-seq alignments and extracts per-base methylation metrics from it. MethylDackel also requires an indexed FASTA file containing the reference genome.
Versions
0.6.1
Commands
MethylDackel
Module
You can load the modules by:
module load biocontainers
module load methyldackel
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run methyldackel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=methyldackel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers methyldackel
MethylDackel extract chgchh.fa chgchh_aln.bam
Metilene
Introduction
Metilene is a versatile tool to study the effect of epigenetic modifications in differentiation/development, tumorigenesis, and systems biology on a global, genome-wide level.
Versions
0.2.8
Commands
metilene
metilene_input.pl
metilene_output.pl
metilene_output.R
Module
You can load the modules by:
module load biocontainers
module load metilene
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run metilene on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=metilene
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metilene
metilene -a g1 -b g2 methylation-file
Mhm2
Introduction
MetaHipMer is a de novo metagenome short-read assembler. Version 2 (MHM2) is written entirely in UPC++ and runs efficiently on both single servers and on multinode supercomputers, where it can scale up to coassemble terabase-sized metagenomes.
Versions
2.0.0
Commands
mhm2.py
Module
You can load the modules by:
module load biocontainers
module load mhm2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mhm2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mhm2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mhm2
mhm2.py -r input_1.fastq,input_2.fastq
MicrobeDMM
Introduction
MicrobeDMM
is a suite of programs used for empirical Bayes fitting of DMM models.
Versions
1.0
Commands
DirichletMixtureGHPFit
Module
You can load the modules by:
module load biocontainers
module load microbedmm
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MicrobeDMM on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=microbedmm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers microbedmm
Minialign
Introduction
Minialign
is a little bit fast and moderately accurate nucleotide sequence alignment tool designed for PacBio and Nanopore long reads.
Versions
0.5.3
Commands
minialign
Module
You can load the modules by:
module load biocontainers
module load minialign
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Minialign on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=minialign
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers minialign
minialign -d index.mai genome.fasta
minialign -l index.mai input.fastq > out.sam
Miniasm
Introduction
Miniasm
is a very fast OLC-based de novo assembler for noisy long reads.
Versions
0.3_r179
Commands
miniasm
minidot
Module
You can load the modules by:
module load biocontainers
module load miniasm
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Miniasm on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=miniasm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers miniasm
miniasm -f Elysia_ont_test.fq Elysia_reads.paf.gz \
> Elysia_reads.gfa
Minimap2
Introduction
Minimap2
is a versatile pairwise aligner for genomic and spliced nucleotide sequences.
Versions
2.22
2.24
2.26
Commands
minimap2
paftools.js
k8
Module
You can load the modules by:
module load biocontainers
module load minimap2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Minimap2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=minimap2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers minimap2
minimap2 -ax sr Wuhan-Hu-1.fasta \
seq_1.fastq seq_2.fastq \
> aln.sam
Minipolish
Introduction
Minipolish is a tool for Racon polishing of miniasm assemblies.
Versions
0.1.3
Commands
minipolish
Module
You can load the modules by:
module load biocontainers
module load minipolish
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run minipolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=minipolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers minipolish
minipolish -t 8 long_reads.fastq.gz assembly.gfa > polished.gfa
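The command above assumes that assembly.gfa already exists and leaves the result in GFA format. The following is a minimal sketch of the surrounding steps, assuming that the minimap2, miniasm and minipolish modules are all loaded and that the reads are Nanopore reads (the `ava-pb` preset would replace `ava-ont` for PacBio reads). `gfa_to_fasta` is a hypothetical helper, and overlaps.paf.gz and polished.fasta are placeholder names.

```shell
# gfa_to_fasta is a hypothetical helper: every segment ("S") line of a GFA
# file becomes one FASTA record (name from column 2, sequence from column 3).
gfa_to_fasta() {
    awk -F '\t' '$1 == "S" { print ">" $2; print $3 }' "$1"
}

READS=long_reads.fastq.gz
THREADS="${SLURM_NTASKS:-1}"   # follows "#SBATCH -n"; 1 outside a Slurm job

# All three tools must be loaded, for example with
# "ml biocontainers minimap2 miniasm minipolish".
if command -v minimap2 >/dev/null 2>&1 && command -v miniasm >/dev/null 2>&1 \
   && command -v minipolish >/dev/null 2>&1 && [ -f "$READS" ]; then
    # All-vs-all read overlaps, then assembly, then polishing.
    minimap2 -x ava-ont -t "$THREADS" "$READS" "$READS" | gzip -1 > overlaps.paf.gz
    miniasm -f "$READS" overlaps.paf.gz > assembly.gfa
    minipolish -t "$THREADS" "$READS" assembly.gfa > polished.gfa
    # Extract the polished contigs for tools that expect FASTA.
    gfa_to_fasta polished.gfa > polished.fasta
fi
```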
Miniprot
Introduction
Miniprot aligns a protein sequence against a genome with affine gap penalty, splicing and frameshift. It is primarily intended for annotating protein-coding genes in a new species using known genes from other species. Miniprot is similar to GeneWise and Exonerate in functionality but it can map proteins to whole genomes and is much faster at the residue alignment step.
Versions
0.3
0.7
Commands
miniprot
Module
You can load the modules by:
module load biocontainers
module load miniprot
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run miniprot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=miniprot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers miniprot
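The job above loads the module but does not run an alignment. The following is a minimal sketch of the command that could follow the `ml` line, assuming placeholder input files genome.fna (the genome) and proteins.faa (the known proteins). By default miniprot writes its alignments in PAF format to standard output.

```shell
GENOME=genome.fna              # placeholder genome FASTA
PROTEINS=proteins.faa          # placeholder protein FASTA
THREADS="${SLURM_NTASKS:-1}"   # follows "#SBATCH -n"; 1 outside a Slurm job

if command -v miniprot >/dev/null 2>&1 && [ -f "$GENOME" ] && [ -f "$PROTEINS" ]; then
    miniprot -t "$THREADS" "$GENOME" "$PROTEINS" > aln.paf
else
    echo "miniprot or its input files were not found; nothing was run" >&2
fi
```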
miRDeep2
Introduction
miRDeep2
discovers active known or novel miRNAs from deep sequencing data (Solexa/Illumina, 454, …).
Versions
2.0.1.3
Commands
bwa_sam_converter.pl
clip_adapters.pl
collapse_reads_md.pl
convert_bowtie_output.pl
excise_precursors_iterative_final.pl
excise_precursors.pl
extract_miRNAs.pl
fastaparse.pl
fastaselect.pl
fastq2fasta.pl
find_read_count.pl
geo2fasta.pl
get_mirdeep2_precursors.pl
illumina_to_fasta.pl
make_html2.pl
make_html.pl
mapper.pl
mirdeep2bed.pl
miRDeep2_core_algorithm.pl
miRDeep2.pl
parse_mappings.pl
perform_controls.pl
permute_structure.pl
prepare_signature.pl
quantifier.pl
remove_white_space_in_id.pl
rna2dna.pl
samFLAGinfo.pl
sam_reads_collapse.pl
sanity_check_genome.pl
sanity_check_mapping_file.pl
sanity_check_mature_ref.pl
sanity_check_reads_ready_file.pl
select_for_randfold.pl
survey.pl
Module
You can load the modules by:
module load biocontainers
module load mirdeep2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run miRDeep2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mirdeep2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mirdeep2
Mirtop
Introduction
Mirtop is a command-line tool to annotate miRNAs and isomiRs with a standard naming.
Versions
0.4.25
Commands
mirtop
Module
You can load the modules by:
module load biocontainers
module load mirtop
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mirtop on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mirtop
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mirtop
mirtop gff --format prost --sps hsa \
--hairpin examples/annotate/hairpin.fa \
--gtf examples/annotate/hsa.gff3 \
-o test_out \
examples/prost/prost.example.txt
Mitofinder
Introduction
Mitofinder
is a pipeline to assemble mitochondrial genomes and annotate mitochondrial genes from trimmed read sequencing data.
Versions
1.4.1
Commands
mitofinder
Module
You can load the modules by:
module load biocontainers
module load mitofinder
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mitofinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mitofinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mitofinder
mitofinder -j Aphaenogaster_megommata_SRR1303315 \
-1 Aphaenogaster_megommata_SRR1303315_R1_cleaned.fastq.gz \
-2 Aphaenogaster_megommata_SRR1303315_R2_cleaned.fastq.gz \
-r reference.gb -o 5 -p 5 -m 10
Mlst
Introduction
Mlst is used to scan contig files against traditional PubMLST typing schemes.
Versions
2.22.0
2.23.0
Commands
mlst
Module
You can load the modules by:
module load biocontainers
module load mlst
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mlst on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mlst
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mlst
mlst contigs.fa
mlst genome.gbk.gz
Mmseqs2
Introduction
Mmseqs2
is a software suite to search and cluster huge protein and nucleotide sequence sets.
Versions
13.45111
14.7e284
Commands
mmseqs
Module
You can load the modules by:
module load biocontainers
module load mmseqs2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mmseqs2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mmseqs2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mmseqs2
mmseqs createdb examples/DB.fasta targetDB
mmseqs createtaxdb targetDB tmp
mmseqs createindex targetDB tmp
mmseqs easy-taxonomy examples/QUERY.fasta targetDB alnRes tmp
Mob_suite
Introduction
MOB-suite: Software tools for clustering, reconstruction and typing of plasmids from draft assemblies.
Versions
3.0.3
Commands
mob_cluster
mob_init
mob_recon
mob_typer
Module
You can load the modules by:
module load biocontainers
module load mob_suite
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mob_suite on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mob_suite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mob_suite
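The job above loads the module but does not run any of the listed commands. The following is a minimal sketch of a plasmid reconstruction with mob_recon, assuming a placeholder draft assembly assembly.fasta and a placeholder output directory mob_recon_out.

```shell
ASSEMBLY=assembly.fasta     # placeholder draft assembly
OUTDIR=mob_recon_out        # placeholder output directory

if command -v mob_recon >/dev/null 2>&1 && [ -f "$ASSEMBLY" ]; then
    mob_recon --infile "$ASSEMBLY" --outdir "$OUTDIR"
else
    echo "mob_recon or $ASSEMBLY not found; nothing was run" >&2
fi
```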
Modbam2bed
Introduction
Modbam2bed is a program to aggregate modified base counts stored in a modified-base BAM file to a bedMethyl file.
Versions
0.9.1
Commands
modbam2bed
Module
You can load the modules by:
module load biocontainers
module load modbam2bed
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run modbam2bed on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=modbam2bed
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers modbam2bed
Modeltest-ng
Introduction
ModelTest-NG is a tool for selecting the best-fit model of evolution for DNA and protein alignments. ModelTest-NG supersedes jModelTest and ProtTest in one single tool, with graphical and command console interfaces.
Versions
0.1.7
Commands
modeltest-ng
modeltest-ng-mpi
Module
You can load the modules by:
module load biocontainers
module load modeltest-ng
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run modeltest-ng on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=modeltest-ng
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers modeltest-ng
Momi
Introduction
momi (MOran Models for Inference) is a Python package that computes the expected sample frequency spectrum (SFS), a statistic commonly used in population genetics, and uses it to fit demographic history.
Versions
2.1.19
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load momi
Interactive job
To run momi interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers momi
(base) UserID@bell-a008:~ $ python
Python 3.9.7 (default, Sep 16 2021, 13:09:58)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import momi
>>> import logging
>>> logging.basicConfig(level=logging.INFO,
filename="tutorial.log")
>>> model = momi.DemographicModel(N_e=1.2e4, gen_time=29,
muts_per_gen=1.25e-8)
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run momi on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=momi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers momi
python python.py
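The batch job above runs a Python script that is not shown. The following is a minimal sketch that writes the statements from the interactive session into a script and runs it. The script name momi_model.py is a placeholder, and the model parameters are simply those of the interactive example.

```shell
# Write the statements from the interactive session into a script.
cat > momi_model.py <<'EOF'
import logging
import momi

logging.basicConfig(level=logging.INFO, filename="tutorial.log")
model = momi.DemographicModel(N_e=1.2e4, gen_time=29, muts_per_gen=1.25e-8)
EOF

# Run it only when a Python interpreter with momi is available,
# i.e. after "ml biocontainers momi".
if command -v python >/dev/null 2>&1 && python -c 'import momi' 2>/dev/null; then
    python momi_model.py
fi
```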
Mothur
Introduction
Mothur
is an open source software package for bioinformatics data processing. The package is frequently used in the analysis of DNA from uncultured microbes.
Detailed information about Mothur can be found here: https://mothur.org
Versions
1.46.0
1.47.0
1.48.0
Commands
mothur
Module
You can load the modules by:
module load biocontainers
module load mothur/1.47.0
Interactive job
To run mothur
interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers mothur/1.47.0
(base) UserID@bell-a008:~ $ mothur
Linux version
Using ReadLine,Boost,HDF5,GSL
mothur v.1.47.0
Last updated: 1/21/22
by
Patrick D. Schloss
Department of Microbiology & Immunology
University of Michigan
http://www.mothur.org
When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.
Distributed under the GNU General Public License
Type 'help()' for information on the commands that are available
For questions and analysis support, please visit our forum at https://forum.mothur.org
Type 'quit()' to exit program
[NOTE]: Setting random seed to 19760620.
Interactive Mode
mothur > align.seqs(help)
mothur > quit()
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit an sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=mothur
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mothur/1.47.0
mothur batch_file
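The `batch_file` in the job above is a plain-text file with one mothur command per line. The following is a minimal sketch that writes such a file and runs it. The names mothur.batch and final.fasta are placeholders, and the two commands are only examples of what a batch file may contain.

```shell
# Write a small batch file: one mothur command per line.
cat > mothur.batch <<'EOF'
summary.seqs(fasta=final.fasta)
unique.seqs(fasta=final.fasta)
EOF

# Run it only when mothur is on PATH and the placeholder input exists.
if command -v mothur >/dev/null 2>&1 && [ -f final.fasta ]; then
    mothur mothur.batch
fi
```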
Motus
Introduction
The mOTU profiler is a computational tool that estimates relative taxonomic abundance of known and currently unknown microbial community members using metagenomic shotgun sequencing data.
Versions
3.0.3
Commands
motus
Module
You can load the modules by:
module load biocontainers
module load motus
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run motus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=motus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers motus
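The job above loads the module but does not run a profile. The following is a minimal sketch of profiling one paired-end sample, assuming placeholder read files sample_1.fastq.gz and sample_2.fastq.gz and a placeholder output name sample.motus.

```shell
FWD=sample_1.fastq.gz          # placeholder forward reads
REV=sample_2.fastq.gz          # placeholder reverse reads
THREADS="${SLURM_NTASKS:-1}"   # follows "#SBATCH -n"; 1 outside a Slurm job

if command -v motus >/dev/null 2>&1 && [ -f "$FWD" ] && [ -f "$REV" ]; then
    motus profile -f "$FWD" -r "$REV" -t "$THREADS" -o sample.motus
else
    echo "motus or the read files were not found; nothing was run" >&2
fi
```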
MrBayes
Introduction
MrBayes
is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters.
MrBayes is available both in a serial version (‘mb’) and in a parallel version (‘mb-mpi’) that uses MPI instructions to distribute computations across several processors or processor cores. The serial version does not support multi-threading, which means that you will not be able to utilize more than one core on a multi-core machine for a single MrBayes analysis. If you want to utilize all cores, you need to run the MPI version of MrBayes.
Note: ‘mb-mpi’ in this version of the container does not run across multiple nodes (only within a node). This is a bug in the container (upstream).
Versions
3.2.7
Commands
mb
mb-mpi
mpirun
mpiexec
Module
You can load the modules by:
module load biocontainers
module load mrbayes
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MrBayes on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mrbayes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mrbayes
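The job above loads the module but does not start an analysis. The following is a minimal sketch of choosing between the serial and the MPI version inside a single node, as the note in the introduction requires. The file analysis.nex is a placeholder NEXUS file that is assumed to contain its own MrBayes block, and the task count is taken from `#SBATCH -n` through `SLURM_NTASKS`.

```shell
NEXUS=analysis.nex             # placeholder input with a "begin mrbayes;" block
NTASKS="${SLURM_NTASKS:-1}"    # set by Slurm from "#SBATCH -n"; 1 outside a job

if command -v mb-mpi >/dev/null 2>&1 && [ -f "$NEXUS" ]; then
    if [ "$NTASKS" -gt 1 ]; then
        # Keep "#SBATCH -N 1": mb-mpi in this container runs within one node only.
        mpirun -n "$NTASKS" mb-mpi "$NEXUS"
    else
        mb "$NEXUS"
    fi
else
    echo "mb-mpi or $NEXUS not found; nothing was run" >&2
fi
```

MrBayes distributes its MCMC chains across the MPI processes, so it is worth keeping `-n` at or below the total number of chains in the analysis.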
Multiqc
Introduction
Multiqc
is a reporting tool that parses summary statistics from results and log files generated by other bioinformatics tools.
Versions
1.11
Commands
multiqc
Module
You can load the modules by:
module load biocontainers
module load multiqc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Multiqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=multiqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers multiqc
multiqc fastqc_out -o multiqc_out
Mummer4
Introduction
Mummer4
is a versatile alignment tool for DNA and protein sequences.
Versions
4.0.0rc1-pl5262
Commands
annotate
combineMUMs
delta-filter
delta2vcf
dnadiff
exact-tandems
mummer
mummerplot
nucmer
promer
repeat-match
show-aligns
show-coords
show-diff
show-snps
show-tiling
Module
You can load the modules by:
module load biocontainers
module load mummer4
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mummer4 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mummer4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mummer4
mummer -mum -b -c H_pylori26695_Eslice.fasta H_pyloriJ99_Eslice.fasta > mummer.mums
Muscle
Introduction
Muscle
is a modified progressive alignment algorithm which has comparable accuracy to MAFFT, but faster performance.
Versions
3.8.1551
5.1
Commands
muscle
Module
You can load the modules by:
module load biocontainers
module load muscle
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Muscle on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=muscle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers muscle
muscle -align seqs2.fasta -output seqs.afa
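The two installed versions do not share the same options: the `-align` and `-output` options used above belong to MUSCLE 5, while MUSCLE 3.8 uses `-in` and `-out`. The following is a minimal sketch that prints the matching command line for a given version. `muscle_cmd` is a hypothetical helper, and it assumes any version that does not start with `3.` follows the MUSCLE 5 syntax.

```shell
# muscle_cmd is a hypothetical helper: given the module version, an input
# FASTA and an output name, it prints the matching muscle command line.
muscle_cmd() {
    case "$1" in
        3.*) echo "muscle -in $2 -out $3" ;;        # MUSCLE 3.x option names
        *)   echo "muscle -align $2 -output $3" ;;  # MUSCLE 5.x option names
    esac
}

muscle_cmd 3.8.1551 seqs2.fasta seqs.afa   # prints: muscle -in seqs2.fasta -out seqs.afa
muscle_cmd 5.1 seqs2.fasta seqs.afa        # prints: muscle -align seqs2.fasta -output seqs.afa
```

The printed line can be run as is once the matching version from the Versions list is loaded.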
Mutmap
Introduction
MutMap is a powerful and efficient method to identify agronomically important loci in crop plants.
Versions
2.3.3
Commands
mutmap
mutplot
Module
You can load the modules by:
module load biocontainers
module load mutmap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mutmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mutmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mutmap
Mykrobe
Introduction
Mykrobe analyses the whole genome of a bacterial sample, all within a couple of minutes, and predicts which drugs the infection is resistant to.
Versions
0.11.0
Commands
mykrobe
Module
You can load the modules by:
module load biocontainers
module load mykrobe
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mykrobe on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mykrobe
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mykrobe
N50
Introduction
N50 is a command-line tool to calculate assembly metrics.
Versions
1.5.6
Commands
n50
Module
You can load the modules by:
module load biocontainers
module load n50
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run n50 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=n50
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers n50
Nanofilt
Introduction
Nanofilt
is a tool for filtering and trimming of Oxford Nanopore Sequencing data.
Versions
2.8.0
Commands
NanoFilt
Module
You can load the modules by:
module load biocontainers
module load nanofilt
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nanofilt on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nanofilt
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nanofilt
NanoFilt -q 12 --headcrop 75 reads.fastq | gzip > trimmed-reads.fastq.gz
Nanolyse
Introduction
Nanolyse
is a tool to remove reads mapping to the lambda phage genome from a fastq file.
Versions
1.2.0
Commands
NanoLyse
Module
You can load the modules by:
module load biocontainers
module load nanolyse
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nanolyse on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nanolyse
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nanolyse
gunzip -c reads.fastq.gz | NanoLyse | gzip > reads_without_lambda.fastq.gz
Nanoplot
Introduction
Nanoplot
is a plotting tool for long read sequencing data and alignments.
Versions
1.39.0
Commands
NanoPlot
Module
You can load the modules by:
module load biocontainers
module load nanoplot
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nanoplot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=nanoplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nanoplot
NanoPlot --summary sequencing_summary.txt --loglength -o summary-plots-log-transformed
NanoPlot -t 2 --fastq reads1.fastq.gz reads2.fastq.gz --maxlength 40000 --plots dot --legacy hex
NanoPlot -t 12 --color yellow --bam alignment1.bam alignment2.bam alignment3.bam --downsample 10000 -o bamplots_downsampled
Nanopolish
Introduction
Nanopolish
is a software package for signal-level analysis of Oxford Nanopore sequencing data.
Versions
0.13.2
0.14.0
Commands
nanopolish
Module
You can load the modules by:
module load biocontainers
module load nanopolish
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nanopolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nanopolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nanopolish
nanopolish index -d fast5_files/ reads.fasta
nanopolish variants --consensus \
-o polished.vcf -w "tig00000001:200000-202000" \
-r reads.fasta -b reads.sorted.bam -g draft.fa
Ncbi-amrfinderplus
Introduction
Ncbi-amrfinderplus
and the accompanying database identify acquired antimicrobial resistance genes in bacterial protein and/or assembled nucleotide sequences as well as known resistance-associated point mutations for several taxa.
Versions
3.10.30
3.10.42
3.11.2
Commands
amrfinder
Module
You can load the modules by:
module load biocontainers
module load ncbi-amrfinderplus
Note
The AMRFinderPlus database has been set up for users. Users can check the database version by running amrfinder -V. RCAC will keep the database updated. If you notice that our database is out of date, please contact us so that we can update it.
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ncbi-amrfinderplus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ncbi-amrfinderplus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ncbi-amrfinderplus
# Protein AMRFinder with no genomic coordinates
amrfinder -p test_prot.fa
# Translated nucleotide AMRFinder (will not use HMMs)
amrfinder -n test_dna.fa
# Protein AMRFinder using GFF to get genomic coordinates and 'plus' genes
amrfinder -p test_prot.fa -g test_prot.gff --plus
# Protein AMRFinder with Escherichia protein point mutations
amrfinder -p test_prot.fa -O Escherichia
# Full AMRFinderPlus search combining results
amrfinder -p test_prot.fa -g test_prot.gff -n test_dna.fa -O Escherichia --plus
Ncbi-datasets
Introduction
NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. You can use it to find and download sequence, annotation, and metadata for genes and genomes using our command-line interface (CLI) tools or NCBI Datasets web interface.
Versions
14.3.0
Commands
datasets
dataformat
Module
You can load the modules by:
module load biocontainers
module load ncbi-datasets
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ncbi-datasets on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ncbi-datasets
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ncbi-datasets
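The job script above only loads the module. A minimal sketch of a download step is shown here, assuming the node has outbound network access. The accession is just an example (the E. coli K-12 MG1655 RefSeq assembly) and the output file name is a placeholder.

```shell
# Example accession; replace with the assembly you need
accession=GCF_000005845.2

if command -v datasets >/dev/null 2>&1; then
    # Download the genome data package as a zip archive
    datasets download genome accession "$accession" \
        --filename "${accession}.zip"
    datasets_status=ran
else
    echo "datasets not found; load the ncbi-datasets module first" >&2
    datasets_status=skipped
fi
```

The resulting zip archive can be unpacked with unzip to inspect the sequence and metadata files.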
Ncbi-genome-download
Introduction
Ncbi-genome-download
is a script to download genomes from the NCBI FTP servers.
Versions
0.3.1
Commands
ncbi-genome-download
Module
You can load the modules by:
module load biocontainers
module load ncbi-genome-download
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ncbi-genome-download on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=ncbi-genome-download
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ncbi-genome-download
ncbi-genome-download bacteria,viral --parallel 4
ncbi-genome-download --genera "Streptomyces coelicolor,Escherichia coli" bacteria
ncbi-genome-download --species-taxids 562 bacteria
Ncbi-table2asn
Introduction
table2asn is a command-line program that creates sequence records for submission to GenBank. It uses many of the same functions as Genome Workbench but is driven generally by data files, and the records it produces do not necessarily require additional manual editing before submission to GenBank.
Versions
1.26.678
Commands
table2asn
Module
You can load the modules by:
module load biocontainers
module load ncbi-table2asn
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ncbi-table2asn on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ncbi-table2asn
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ncbi-table2asn
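The job script above only loads the module. A minimal sketch of a conversion step follows. The file names are placeholders: template.sbt stands for a submission template and genome.fsa for the nucleotide FASTA file. Additional options (for example feature tables or validation) are described by table2asn -help.

```shell
# Hypothetical file names; replace with your own submission files
template=template.sbt   # submission template
fasta=genome.fsa        # nucleotide sequences

if command -v table2asn >/dev/null 2>&1 && [ -s "$template" ] && [ -s "$fasta" ]; then
    # Build an ASN.1 sequence record from the template and the FASTA file
    table2asn -t "$template" -i "$fasta" -o genome.sqn
    table2asn_status=ran
else
    echo "table2asn or input files not found; skipping" >&2
    table2asn_status=skipped
fi
```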
Neusomatic
Introduction
NeuSomatic is based on deep convolutional neural networks for accurate somatic mutation detection. With properly trained models, it can robustly perform across sequencing platforms, strategies, and conditions. NeuSomatic summarizes and augments sequence alignments in a novel way and incorporates multi-dimensional features to capture variant signals effectively. It is not only a universal but also an accurate somatic mutation detection method.
Versions
0.2.1
Commands
call.py
dataloader.py
extract_postprocess_targets.py
filter_candidates.py
generate_dataset.py
long_read_indelrealign.py
merge_post_vcfs.py
merge_tsvs.py
network.py
postprocess.py
preprocess.py
resolve_scores.py
resolve_variants.py
scan_alignments.py
split_bed.py
train.py
utils.py
Module
You can load the modules by:
module load biocontainers
module load neusomatic
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run neusomatic on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=neusomatic
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers neusomatic
Nextalign
Introduction
Nextalign
is a viral genome sequence alignment tool for the command line.
Versions
1.10.3
Commands
nextalign
Module
You can load the modules by:
module load biocontainers
module load nextalign
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nextalign on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextalign
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextalign
nextalign \
--sequences data/sars-cov-2/sequences.fasta \
--reference data/sars-cov-2/reference.fasta \
--genemap data/sars-cov-2/genemap.gff \
--genes E,M,N,ORF1a,ORF1b,ORF3a,ORF6,ORF7a,ORF7b,ORF8,ORF9b,S \
--output-dir output/ \
--output-basename nextalign
Nextclade
Introduction
Nextclade
is a tool that identifies differences between your sequences and a reference sequence, uses these differences to assign your sequences to clades, and reports potential sequence quality issues in your data.
Versions
1.10.3
Commands
nextclade
Module
You can load the modules by:
module load biocontainers
module load nextclade
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nextclade on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextclade
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextclade
mkdir -p data
nextclade dataset get --name 'sars-cov-2' --output-dir 'data/sars-cov-2'
nextclade \
--in-order \
--input-fasta data/sars-cov-2/sequences.fasta \
--input-dataset data/sars-cov-2 \
--output-tsv output/nextclade.tsv \
--output-tree output/nextclade.auspice.json \
--output-dir output/ \
--output-basename nextclade
Nextdenovo
Introduction
NextDenovo is a string graph-based de novo assembler for long reads (CLR, HiFi and ONT). It uses a “correct-then-assemble” strategy similar to canu (no correction step for PacBio HiFi reads), but requires significantly fewer computing resources and less storage. After assembly, the per-base accuracy is about 98-99.8%. To further improve single-base accuracy, try NextPolish.
Versions
2.5.2
Commands
nextDenovo
Module
You can load the modules by:
module load biocontainers
module load nextdenovo
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run nextdenovo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextdenovo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextdenovo
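The job script above only loads the module. NextDenovo is driven by a configuration file, so a typical run step is a single command. In the sketch below, run.cfg is a placeholder for a configuration file that you prepare according to the NextDenovo documentation (it lists the read files, genome size, and resource settings).

```shell
# Hypothetical configuration file; prepare it before submitting the job
config=run.cfg

if command -v nextDenovo >/dev/null 2>&1 && [ -s "$config" ]; then
    # Start the assembly described by the configuration file
    nextDenovo "$config"
    nextdenovo_status=ran
else
    echo "nextDenovo or $config not found; skipping" >&2
    nextdenovo_status=skipped
fi
```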
Nextflow
Introduction
Nextflow
is a bioinformatics workflow manager that enables the development of portable and reproducible workflows.
Versions
21.10.0
Commands
nextflow
Module
You can load the modules by:
module load biocontainers
module load nextflow
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nextflow on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextflow
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextflow
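The job script above only loads the module. As a quick check that Nextflow works, the sketch below runs the small public hello pipeline, which Nextflow fetches from its default repository. This assumes the node has outbound network access; for real work, replace hello with your own pipeline script or project name.

```shell
if command -v nextflow >/dev/null 2>&1; then
    # Run the public "hello" demo pipeline as a smoke test
    nextflow run hello
    nextflow_status=ran
else
    echo "nextflow not found; load the nextflow module first" >&2
    nextflow_status=skipped
fi
```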
Nextpolish
Introduction
NextPolish is used to fix base errors (SNV/Indel) in genomes generated from noisy long reads. It can be used with short-read data only, long-read data only, or a combination of both. It contains two core modules and uses a stepwise approach to correct erroneous bases in the reference genome. To correct/assemble raw third-generation sequencing (TGS) long reads with approximately 10-15% sequencing errors, please use NextDenovo.
Versions
1.4.1
Commands
nextPolish
Module
You can load the modules by:
module load biocontainers
module load nextpolish
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run nextpolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextpolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextpolish
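The job script above only loads the module. Like NextDenovo, NextPolish is driven by a configuration file. In the sketch below, run.cfg is a placeholder for a configuration file that you prepare according to the NextPolish documentation (it names the draft genome and the read file lists).

```shell
# Hypothetical configuration file; prepare it before submitting the job
config=run.cfg

if command -v nextPolish >/dev/null 2>&1 && [ -s "$config" ]; then
    # Start the polishing run described by the configuration file
    nextPolish "$config"
    nextpolish_status=ran
else
    echo "nextPolish or $config not found; skipping" >&2
    nextpolish_status=skipped
fi
```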
Ngs-bits
Introduction
Ngs-bits
is a collection of short-read sequencing tools.
Versions
2022_04
Commands
SampleAncestry
SampleDiff
SampleGender
SampleOverview
SampleSimilarity
SeqPurge
CnvHunter
RohHunter
UpdHunter
CfDnaQC
MappingQC
NGSDImportQC
ReadQC
SomaticQC
VariantQC
TrioMaternalContamination
BamCleanHaloplex
BamClipOverlap
BamDownsample
BamFilter
BamToFastq
BedAdd
BedAnnotateFreq
BedAnnotateFromBed
BedAnnotateGC
BedAnnotateGenes
BedChunk
BedCoverage
BedExtend
BedGeneOverlap
BedHighCoverage
BedInfo
BedIntersect
BedLiftOver
BedLowCoverage
BedMerge
BedReadCount
BedShrink
BedSort
BedSubtract
BedToFasta
BedpeAnnotateBreakpointDensity
BedpeAnnotateCnvOverlap
BedpeAnnotateCounts
BedpeAnnotateFromBed
BedpeFilter
BedpeGeneAnnotation
BedpeSort
BedpeToBed
FastqAddBarcode
FastqConcat
FastqConvert
FastqDownsample
FastqExtract
FastqExtractBarcode
FastqExtractUMI
FastqFormat
FastqList
FastqMidParser
FastqToFasta
FastqTrim
VcfAnnotateFromBed
VcfAnnotateFromBigWig
VcfAnnotateFromVcf
VcfBreakMulti
VcfCalculatePRS
VcfCheck
VcfExtractSamples
VcfFilter
VcfLeftNormalize
VcfSort
VcfStreamSort
VcfToBedpe
VcfToTsv
SvFilterAnnotations
NGSDExportGenes
GenePrioritization
GenesToApproved
GenesToBed
GraphStringDb
PhenotypeSubtree
PhenotypesToGenes
PERsim
FastaInfo
Module
You can load the modules by:
module load biocontainers
module load ngs-bits
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ngs-bits on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ngs-bits
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ngs-bits
SeqPurge -in1 input1_1.fastq input2_1.fastq \
-in2 input1_2.fastq input2_2.fastq \
-out1 R1.fastq.gz -out2 R2.fastq.gz
Ngsld
Introduction
ngsLD is a program to estimate pairwise linkage disequilibrium (LD) while taking the uncertainty of genotype assignment into account. It does so by avoiding genotype calling and using genotype likelihoods or posterior probabilities instead.
Versions
1.1.1
Commands
ngsLD
Module
You can load the modules by:
module load biocontainers
module load ngsld
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ngsld on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ngsld
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ngsld
Ngsutils
Introduction
Ngsutils
is a suite of software tools for working with next-generation sequencing datasets.
Versions
0.5.9
Commands
ngsutils
bamutils
bedutils
fastqutils
gtfutils
Module
You can load the modules by:
module load biocontainers
module load ngsutils
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ngsutils on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ngsutils
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ngsutils
bamutils filter \
input.bam \
MQ10filtered.bam \
-mapped \
-noqcfail \
-gte MAPQ 10
bamutils stats \
-gtf genome.gtf MQ10filtered.bam \
> MQ10filtered_bamstats
Odgi
Introduction
odgi provides an efficient and succinct dynamic DNA sequence graph model, as well as a host of algorithms that allow the use of such graphs in bioinformatic analyses.
Versions
0.8.3
Commands
odgi
Module
You can load the modules by:
module load biocontainers
module load odgi
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run odgi on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=odgi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers odgi
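The job script above only loads the module. A minimal sketch of a first odgi step follows: it converts a GFA file into odgi's own graph format and prints summary statistics. The file name graph.gfa is a placeholder for your own pangenome graph.

```shell
# Hypothetical input graph in GFA format
gfa=graph.gfa

if command -v odgi >/dev/null 2>&1 && [ -s "$gfa" ]; then
    # Convert the GFA into odgi's binary graph format
    odgi build -g "$gfa" -o graph.og
    # Print a summary of the graph (nodes, edges, paths, length)
    odgi stats -i graph.og -S
    odgi_status=ran
else
    echo "odgi or $gfa not found; skipping" >&2
    odgi_status=skipped
fi
```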
OrthoFinder
Introduction
OrthoFinder
: phylogenetic orthology inference for comparative genomics
Detailed usage can be found here: https://github.com/davidemms/OrthoFinder
Versions
2.5.2
2.5.4
2.5.5
Commands
orthofinder
Module
You can load the modules by:
module load biocontainers
module load orthofinder/2.5.4
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run orthofinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=orthofinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers orthofinder/2.5.4
orthofinder -t 24 -f InputData -o output
Paml
Introduction
Paml
is a package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood.
Versions
4.9
Commands
baseml
basemlg
chi2
codeml
evolver
infinitesites
mcmctree
pamp
yn00
Module
You can load the modules by:
module load biocontainers
module load paml
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Paml on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=paml
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers paml
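The job script above only loads the module. PAML programs read their settings from a control file, so a run step usually consists of the program name followed by that file. In the sketch below, codeml.ctl is a placeholder for a control file that names your alignment, tree, and output files.

```shell
# Hypothetical control file; it must point to your alignment and tree
ctl=codeml.ctl

if command -v codeml >/dev/null 2>&1 && [ -s "$ctl" ]; then
    # Run codeml with the settings from the control file
    codeml "$ctl"
    paml_status=ran
else
    echo "codeml or $ctl not found; skipping" >&2
    paml_status=skipped
fi
```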
Panacota
Introduction
Panacota
is software that provides tools for large-scale bacterial comparative genomics.
Versions
1.3.1
Commands
PanACoTA
Module
You can load the modules by:
module load biocontainers
module load panacota
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Panacota on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=panacota
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers panacota
PanACoTA annotate \
-d Examples/genomes_init \
-l Examples/input_files/list_genomes.lst \
-r Examples/2-res-QC -Q
Panaroo
Introduction
Panaroo is an updated pipeline for pangenome investigation.
Versions
1.2.10
Commands
panaroo
panaroo-extract-gene
panaroo-filter-pa
panaroo-fmg
panaroo-gene-neighbourhood
panaroo-img
panaroo-integrate
panaroo-merge
panaroo-msa
panaroo-plot-abundance
panaroo-qc
panaroo-spydrpick
Module
You can load the modules by:
module load biocontainers
module load panaroo
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run panaroo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=panaroo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers panaroo
panaroo -i gff/*.gff -o results --clean-mode strict
Pandaseq
Introduction
Pandaseq
is a program to align Illumina reads, optionally with PCR primers embedded in the sequence, and reconstruct an overlapping sequence.
Versions
2.11
Commands
pandaseq
Module
You can load the modules by:
module load biocontainers
module load pandaseq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pandaseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pandaseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pandaseq
pandaseq -f SRR069027_1.fastq -r SRR069027_2.fastq
Pandora
Introduction
Pandora is a tool for bacterial genome analysis using a pangenome reference graph (PanRG). It allows gene presence/absence detection and genotyping of SNPs, indels and longer variants in one or a number of samples.
Versions
0.9.1
Commands
pandora
Module
You can load the modules by:
module load biocontainers
module load pandora
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pandora on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=pandora
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pandora
pandora index -t 4 GC00006032.fa
Pangolin
Introduction
Pangolin
is a software package for assigning SARS-CoV-2 genome sequences to global lineages.
Versions
3.1.20
4.0.6
4.1.2
4.1.3
4.2
Commands
pangolin
Module
You can load the modules by:
module load biocontainers
module load pangolin
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pangolin on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pangolin
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pangolin
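The job script above only loads the module. A minimal sketch of a lineage assignment step follows. The input file sequences.fasta and the report name are placeholders for illustration.

```shell
# Hypothetical input file with SARS-CoV-2 genome sequences
fasta=sequences.fasta

if command -v pangolin >/dev/null 2>&1 && [ -s "$fasta" ]; then
    # Assign lineages and write the report to a named CSV file
    pangolin "$fasta" --outfile lineage_report.csv
    pangolin_status=ran
else
    echo "pangolin or $fasta not found; skipping" >&2
    pangolin_status=skipped
fi
```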
Pangraph
Introduction
Pangraph is a bioinformatic toolkit to align genome assemblies into pangenome graphs.
Versions
0.7.1
Commands
pangraph
Module
You can load the modules by:
module load biocontainers
module load pangraph
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pangraph on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pangraph
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pangraph
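The job script above only loads the module. A minimal sketch of a graph construction step follows, based on the upstream usage in which pangraph build writes the graph to standard output. The file names are placeholders, and the available options should be confirmed with pangraph build --help for the deployed version.

```shell
# Hypothetical multi-FASTA file with the genomes to align
fasta=sequences.fa

if command -v pangraph >/dev/null 2>&1 && [ -s "$fasta" ]; then
    # Build the pangenome graph and save it as JSON
    pangraph build "$fasta" > pangraph.json
    pangraph_status=ran
else
    echo "pangraph or $fasta not found; skipping" >&2
    pangraph_status=skipped
fi
```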
PanPhlAn
Introduction
PanPhlAn
(Pangenome-based Phylogenomic Analysis) is a strain-level metagenomic profiling tool for identifying the gene composition and in-vivo transcriptional activity of individual strains in metagenomic samples.
Versions
3.1
Commands
panphlan_download_pangenome.py
panphlan_map.py
panphlan_profiling.py
Module
You can load the modules by:
module load biocontainers
module load panphlan
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run PanPhlAn on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=panphlan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers panphlan
Clara Parabricks
Introduction
NVIDIA’s Clara Parabricks brings next generation sequencing to GPUs, accelerating an array of gold-standard tooling such as BWA-MEM, GATK4, Google’s DeepVariant, and many more. Users can achieve a 30-60x acceleration and 99.99% accuracy for variant calling when comparing against CPU-only BWA-GATK4 pipelines, meaning a single server can process up to 60 whole genomes per day. These tools can be easily integrated into current pipelines with drop-in replacement commands to quickly bring speed and data-center scale to a range of applications including germline, somatic and RNA workflows.
Versions
4.0.0-1
Commands
pbrun
Module
You can load the modules by:
module load biocontainers
module load parabricks
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Note
Because Clara Parabricks requires NVIDIA GPUs, it is only deployed on Scholar, Gilbreth, and ACCESS Anvil.
To run Clara Parabricks on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --gpus=1
#SBATCH --job-name=parabricks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers parabricks
pbrun haplotypecaller \
--ref FVZG01.1.fsa_nt \
--in-bam output.bam \
--out-variants variants.vcf
Parallel-fastq-dump
Introduction
Parallel-fastq-dump
is a parallel wrapper for fastq-dump.
Versions
0.6.7
Commands
parallel-fastq-dump
Module
You can load the modules by:
module load biocontainers
module load parallel-fastq-dump
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Parallel-fastq-dump on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=parallel-fastq-dump
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers parallel-fastq-dump
parallel-fastq-dump -s SRR11941281/SRR11941281.sra \
--split-files --threads 4 --gzip
Parliament2
Introduction
Parliament2 identifies structural variants in a given sample relative to a reference genome. These structural variants cover large deletion events that are called as Deletions of a region, Insertions of a sequence into a region, Duplications of a region, Inversions of a region, or Translocations between two regions in the genome.
Versions
0.1.11
Commands
parliament2.py
Module
You can load the modules by:
module load biocontainers
module load parliament2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run parliament2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=parliament2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers parliament2
Parsnp
Introduction
Parsnp
is used to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours.
Versions
1.6.2
Commands
parsnp
Module
You can load the modules by:
module load biocontainers
module load parsnp
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Parsnp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=parsnp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers parsnp
parsnp -g examples/mers_virus/ref/England1.gbk \
-d examples/mers_virus/genomes/*.fna -c -p 8
Pasapipeline
Introduction
PASA, acronym for Program to Assemble Spliced Alignments (and pronounced ‘pass-uh’), is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. PASA also identifies and classifies all splicing variations supported by the transcript alignments.
Versions
2.5.2-devb
Commands
pasa
Launch_PASA_pipeline.pl
GMAP_multifasta_processor.pl
blat_to_btab.pl
blat_to_cdna_clusters.pl
blat_top_hit_extractor.pl
ensure_single_valid_alignment_per_cdna_per_cluster.pl
errors_to_newalign_btabs.pl
extract_FL_transdecoder_entries.pl
get_failed_transcripts.pl
gmap_to_btab.pl
import_GMAP_gff3.pl
pasa_alignment_assembler_textprocessor.pl
pasa_asmbls_to_training_set.extract_reference_orfs.pl
polyCistronAnalyzer.pl
process_BLAT_alignments.pl
process_GMAP_alignments_gff3_chimeras_ok.pl
process_PBLAT_alignments.pl
process_minimap2_alignments.pl
pslx_to_gff3.pl
run_spliced_aligners.pl
sim4_to_btab.pl
Annotation_store_preloader.dbi
Load_Current_Gene_Annotations.dbi
PASA_transcripts_and_assemblies_to_GFF3.dbi
UTR_category_analysis.dbi
__drop_many_mysql_dbs.dbi
alignment_assembly_to_gene_models.dbi
alt_splice_AAT_alignment_generator.dbi
assemble_clusters.dbi
assembly_db_loader.dbi
assign_clusters_by_gene_intergene_overlap.dbi
assign_clusters_by_stringent_alignment_overlap.dbi
build_comprehensive_transcriptome.dbi
build_comprehensive_transcriptome.tabix.dbi
cDNA_annotation_comparer.dbi
cDNA_annotation_updater.dbi
classify_alt_splice_as_UTR_or_protein.dbi
classify_alt_splice_isoforms.dbi
classify_alt_splice_isoforms_per_subcluster.dbi
comprehensive_alt_splice_report.dbi
compute_gene_coverage_by_incorporated_PASA_assemblies.dbi
create_mysql_cdnaassembly_db.dbi
create_sqlite_cdnaassembly_db.dbi
describe_alignment_assemblies.dbi
describe_alignment_assemblies_cgi_convert.dbi
drop_mysql_db_if_exists.dbi
dump_annot_store.dbi
dump_valid_annot_updates.dbi
extract_regions_for_probe_design.dbi
extract_skipped_exons.dbi
extract_transcript_alignment_clusters.dbi
find_FL_equivalent_support.dbi
find_alternate_internal_exons.dbi
get_antisense_transcripts.dbi
import_custom_alignments.dbi
import_spliced_alignments.dbi
invalidate_RNA-Seq_assembly_artifacts.dbi
invalidate_single_exon_ESTs.dbi
mapPolyAsites_to_genes.dbi
pasa_asmbl_genes_to_GFF3.dbi
pasa_asmbls_to_training_set.dbi
polyA_site_summarizer.dbi
polyA_site_transcript_mapper.dbi
populate_alignments_via_btab.dbi
populate_ath1_cdnas.dbi
populate_cdna_clusters.dbi
populate_mysql_assembly_alignment_field.dbi
populate_mysql_assembly_sequence_field.dbi
purge_PASA_database.dbi
purge_annot_comparisons.dbi
reassign_clusters_via_valid_align_coords.dbi
reconstruct_FL_isoforms_from_parts.dbi
report_alt_splicing_findings.dbi
reset_to_prior_to_assembly_build.dbi
retrieve_assembly_sequences.dbi
set_spliced_orient_transcribed_orient.dbi
splicing_events_in_subcluster_context.dbi
splicing_variation_to_splicing_event.dbi
subcluster_builder.dbi
subcluster_loader.dbi
test_assemble_clusters.dbi
test_mysql_connection.dbi
update_alignment_status.dbi
update_clusters_coordinates.dbi
update_fli_status.dbi
update_spliced_orient.dbi
upload_cdna_headers.dbi
upload_transcript_data.dbi
validate_alignments_in_db.dbi
Module
You can load the modules by:
module load biocontainers
module load pasapipeline
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pasapipeline on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pasapipeline
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pasapipeline
Pasta
Introduction
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.
Versions
1.8.7
Commands
run_pasta.py
run_seqtools.py
sumlabels.py
sumtrees.py
Module
You can load the modules by:
module load biocontainers
module load pasta
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pasta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pasta
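The job script above stops after loading the module. As a minimal sketch of an alignment step, the lines below wrap `run_pasta.py` so that the CPU count follows the Slurm task count. The helper names `pasta_threads` and `pasta_align` and the file names are hypothetical; `-i`, `-d`, `-o`, and `--num-cpus` are assumed to behave as described in the PASTA help text.

```shell
#!/bin/bash
# Sketch only: the FASTA and output directory names are hypothetical.

# Number of CPUs to hand to PASTA: the Slurm task count (#SBATCH -n),
# or 1 when the script runs outside a job.
pasta_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Align an unaligned nucleotide FASTA ($1) and write results to directory $2.
pasta_align() {
    run_pasta.py -i "$1" -d dna -o "$2" --num-cpus "$(pasta_threads)"
}

# In the job script, after `ml biocontainers pasta`:
#   pasta_align input.fasta pasta_out
```

Reading the count from `SLURM_NTASKS` keeps the tool in step with `#SBATCH -n`, so changing the allocation does not require editing the command.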
Pblat
Introduction
pblat is a parallelized version of blat with multi-threading support.
Versions
2.5.1
Commands
pblat
Module
You can load the modules by:
module load biocontainers
module load pblat
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pblat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pblat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pblat
Pbmm2
Introduction
Pbmm2
is a minimap2 frontend for PacBio native data formats.
Versions
1.7.0
Commands
pbmm2
Module
You can load the modules by:
module load biocontainers
module load pbmm2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pbmm2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=pbmm2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pbmm2
pbmm2 --version
pbmm2 align hg38.fa \
alz.polished.hq.bam alz.aligned.bam \
-j 12 --preset ISOSEQ --sort \
--log-level INFO
Pbptyper
Introduction
pbptyper is a tool to identify the Penicillin Binding Protein (PBP) type of Streptococcus pneumoniae assemblies.
Versions
1.0.4
Commands
pbptyper
Module
You can load the modules by:
module load biocontainers
module load pbptyper
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pbptyper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pbptyper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pbptyper
pbptyper --assembly test/SRR2912551.fna.gz --outdir output
PCAngsd
Introduction
PCAngsd
is a program that estimates the covariance matrix and individual allele frequencies for low-depth next-generation sequencing (NGS) data in structured/heterogeneous populations using principal component analysis (PCA) to perform multiple population genetic analyses using genotype likelihoods.
Versions
1.10
Commands
pcangsd
Module
You can load the modules by:
module load biocontainers
module load pcangsd
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run PCAngsd on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=pcangsd
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pcangsd
pcangsd -b pupfish.beagle.gz --inbreedSites \
--selection -o pup_pca2 --threads 12
Peakranger
Introduction
Peakranger
is a multi-purpose software suite for analyzing next-generation sequencing (NGS) data.
Versions
1.18
Commands
peakranger
Module
You can load the modules by:
module load biocontainers
module load peakranger
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Peakranger on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=peakranger
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers peakranger
peakranger ccat --format bam 27-1_sorted_MDRD_MQ30filtered.bam 27-4_sorted_MDRD_MQ30filtered.bam \
ccat_result_with_HTML_report_5kb_region --report \
--gene_annot_file refGene.txt --plot_region 10000
Pepper_deepvariant
Introduction
PEPPER is a genome inference module based on recurrent neural networks that enables long-read variant calling and nanopore assembly polishing in the PEPPER-Margin-DeepVariant pipeline. This pipeline enables nanopore-based variant calling with DeepVariant.
Versions
r0.4.1
Commands
run_pepper_margin_deepvariant
Module
You can load the modules by:
module load biocontainers
module load pepper_deepvariant
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pepper_deepvariant on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 32
#SBATCH --job-name=pepper_deepvariant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pepper_deepvariant
BASE=$PWD
# Set up input data
INPUT_DIR="${BASE}/input/data"
REF="GRCh38_no_alt.chr20.fa"
BAM="HG002_ONT_2_GRCh38.chr20.quickstart.bam"
# Set the number of CPUs to use
THREADS=32
# Set up output directory
OUTPUT_DIR="${BASE}/output"
OUTPUT_PREFIX="HG002_ONT_2_GRCh38_PEPPER_Margin_DeepVariant.chr20"
OUTPUT_VCF="HG002_ONT_2_GRCh38_PEPPER_Margin_DeepVariant.chr20.vcf.gz"
TRUTH_VCF="HG002_GRCh38_1_22_v4.2.1_benchmark.quickstart.vcf.gz"
TRUTH_BED="HG002_GRCh38_1_22_v4.2.1_benchmark_noinconsistent.quickstart.bed"
# Create local directory structure
mkdir -p "${OUTPUT_DIR}"
mkdir -p "${INPUT_DIR}"
# Download the data to input directory
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_ONT_2_GRCh38.chr20.quickstart.bam
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_ONT_2_GRCh38.chr20.quickstart.bam.bai
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/GRCh38_no_alt.chr20.fa
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/GRCh38_no_alt.chr20.fa.fai
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_GRCh38_1_22_v4.2.1_benchmark.quickstart.vcf.gz
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_GRCh38_1_22_v4.2.1_benchmark_noinconsistent.quickstart.bed
run_pepper_margin_deepvariant call_variant \
-b "${INPUT_DIR}/${BAM}" \
-f "${INPUT_DIR}/${REF}" -o "${OUTPUT_DIR}" \
-p "${OUTPUT_PREFIX}" \
-t "${THREADS}" -r chr20:1000000-1020000 \
--ont_r9_guppy5_sup --ont
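The six `wget` calls in the script share one base URL, so the download step can also be written as a loop. The sketch below is one possible form; `quickstart_urls` is a hypothetical helper, while the base URL and file names are the ones used in the job script.

```shell
#!/bin/bash
# Sketch only: quickstart_urls is a hypothetical helper; the base URL and
# file names are the ones used in the job script above.
BASE_URL="https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data"

quickstart_urls() {
    for f in \
        HG002_ONT_2_GRCh38.chr20.quickstart.bam \
        HG002_ONT_2_GRCh38.chr20.quickstart.bam.bai \
        GRCh38_no_alt.chr20.fa \
        GRCh38_no_alt.chr20.fa.fai \
        HG002_GRCh38_1_22_v4.2.1_benchmark.quickstart.vcf.gz \
        HG002_GRCh38_1_22_v4.2.1_benchmark_noinconsistent.quickstart.bed
    do
        echo "${BASE_URL}/${f}"
    done
}

# Print the list; in the job, pipe it to wget instead:
#   quickstart_urls | wget -P "${INPUT_DIR}" -i -
quickstart_urls
```

With `-i -`, `wget` reads the URLs from standard input, so adding or removing a file only changes the list.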
BioPerl
Introduction
BioPerl
is a collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications. It provides software modules for many of the typical tasks of bioinformatics programming.
Versions
1.7.2-pl526
Commands
SOAPsh.pl
ace.pl
bam2bedgraph
bamToGBrowse.pl
bdf2gdfont.pl
bdftogd
binhex.pl
bp_aacomp.pl
bp_biofetch_genbank_proxy.pl
bp_bioflat_index.pl
bp_biogetseq.pl
bp_blast2tree.pl
bp_bulk_load_gff.pl
bp_chaos_plot.pl
bp_classify_hits_kingdom.pl
bp_composite_LD.pl
bp_das_server.pl
bp_dbsplit.pl
bp_download_query_genbank.pl
bp_extract_feature_seq.pl
bp_fast_load_gff.pl
bp_fastam9_to_table.pl
bp_fetch.pl
bp_filter_search.pl
bp_find-blast-matches.pl
bp_flanks.pl
bp_gccalc.pl
bp_genbank2gff.pl
bp_genbank2gff3.pl
bp_generate_histogram.pl
bp_heterogeneity_test.pl
bp_hivq.pl
bp_hmmer_to_table.pl
bp_index.pl
bp_load_gff.pl
bp_local_taxonomydb_query.pl
bp_make_mrna_protein.pl
bp_mask_by_search.pl
bp_meta_gff.pl
bp_mrtrans.pl
bp_mutate.pl
bp_netinstall.pl
bp_nexus2nh.pl
bp_nrdb.pl
bp_oligo_count.pl
bp_pairwise_kaks
bp_parse_hmmsearch.pl
bp_process_gadfly.pl
bp_process_sgd.pl
bp_process_wormbase.pl
bp_query_entrez_taxa.pl
bp_remote_blast.pl
bp_revtrans-motif.pl
bp_search2alnblocks.pl
bp_search2gff.pl
bp_search2table.pl
bp_search2tribe.pl
bp_seq_length.pl
bp_seqconvert.pl
bp_seqcut.pl
bp_seqfeature_delete.pl
bp_seqfeature_gff3.pl
bp_seqfeature_load.pl
bp_seqpart.pl
bp_seqret.pl
bp_seqretsplit.pl
bp_split_seq.pl
bp_sreformat.pl
bp_taxid4species.pl
bp_taxonomy2tree.pl
bp_translate_seq.pl
bp_tree2pag.pl
bp_unflatten_seq.pl
ccconfig
chartex
chi2
chrom_sizes.pl
circo
clustalw
clustalw2
corelist
cpan
cpanm
dbilogstrip
dbiprof
dbiproxy
debinhex.pl
enc2xs
encguess
genomeCoverageBed.pl
h2ph
h2xs
htmltree
instmodsh
json_pp
json_xs
lwp-download
lwp-dump
lwp-mirror
lwp-request
perl
perl5.26.2
perlbug
perldoc
perlivp
perlthanks
piconv
pl2pm
pod2html
pod2man
pod2text
pod2usage
podchecker
podselect
prove
ptar
ptardiff
ptargrep
shasum
splain
stag-autoschema.pl
stag-db.pl
stag-diff.pl
stag-drawtree.pl
stag-filter.pl
stag-findsubtree.pl
stag-flatten.pl
stag-grep.pl
stag-handle.pl
stag-itext2simple.pl
stag-itext2sxpr.pl
stag-itext2xml.pl
stag-join.pl
stag-merge.pl
stag-mogrify.pl
stag-parse.pl
stag-query.pl
stag-splitter.pl
stag-view.pl
stag-xml2itext.pl
stubmaker.pl
t_coffee
tpage
ttree
unflatten
webtidy
xml_grep
xml_merge
xml_pp
xml_spellcheck
xml_split
xpath
xsubpp
zipdetails
Module
You can load the modules by:
module load biocontainers
module load perl-bioperl
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BioPerl on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=perl-bioperl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers perl-bioperl
Pggb
Introduction
pggb builds pangenome variation graphs from a set of input sequences.
Versions
0.5.4
Commands
pggb
Module
You can load the modules by:
module load biocontainers
module load pggb
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pggb on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pggb
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pggb
Phast
Introduction
PHAST is a freely available software package for comparative and evolutionary genomics. For more information, please check:
BioContainers: https://biocontainers.pro/tools/phast
Home page: http://compgen.cshl.edu/phast/
Versions
1.5
Commands
all_dists
base_evolve
chooseLines
clean_genes
consEntropy
convert_coords
display_rate_matrix
dless
dlessP
draw_tree
eval_predictions
exoniphy
hmm_train
hmm_tweak
hmm_view
indelFit
indelHistory
maf_parse
makeHKY
modFreqs
msa_diff
msa_split
msa_view
pbsDecode
pbsEncode
pbsScoreMatrix
pbsTrain
phast
phastBias
phastCons
phastMotif
phastOdds
phyloBoot
phyloFit
phyloP
prequel
refeature
stringiphy
treeGen
tree_doctor
Module
You can load the modules by:
module load biocontainers
module load phast
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phast on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phast
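The job script above only loads the module. As a minimal sketch of one PHAST step, the lines below write a tiny alignment and fit a substitution model with `phyloFit`. The file name `toy.fa`, the sequences, and the tree are made-up illustration data, and the options are assumed to match the `phyloFit` help text.

```shell
#!/bin/bash
# Sketch only: toy.fa and its sequences are made-up illustration data.
cat > toy.fa <<'EOF'
>human
ACGTACGTACGTACGTACGA
>chimp
ACGTACGTACGTACGTACGT
>mouse
ACGTTCGTACGAACGTACGT
EOF

# Fit an HKY85 model on the fixed tree; phyloFit writes toy.mod.
# The guard skips the call when the phast module is not loaded.
if command -v phyloFit >/dev/null 2>&1; then
    phyloFit --tree "((human,chimp),mouse)" --subst-mod HKY85 \
        --msa-format FASTA --out-root toy toy.fa
fi
```

An alignment this small is only useful for checking that the tool starts; real analyses need far longer alignments.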
Phd2fasta
Introduction
Phd2fasta
is a tool to convert Phred ‘phd’ format files to ‘fasta’ format.
Versions
0.990622
Commands
phd2fasta
Module
You can load the modules by:
module load biocontainers
module load phd2fasta
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Phd2fasta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phd2fasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phd2fasta
Phg
Introduction
Practical Haplotype Graph (PHG) is a general, graph-based, computational framework that can be used with a variety of skim sequencing methods to infer high-density genotypes directly from low-coverage sequence.
Versions
1.0
Commands
CreateConsensi.sh
CreateHaplotypes.sh
CreateReferenceIntervals.sh
CreateSmallDataSet.sh
CreateValidIntervalsFile.sh
IndexPangenome.sh
LoadAssemblyAnchors.sh
LoadGenomeIntervals.sh
ParallelAssemblyAnchorsLoad.sh
RunLiquibaseUpdates.sh
CreateHaplotypesFromBAM.groovy
CreateHaplotypesFromFastq.groovy
CreateHaplotypesFromGVCF.groovy
Module
You can load the modules by:
module load biocontainers
module load phg
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phg on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phg
Phipack
Introduction
PhiPack: PHI test and other tests of recombination
Versions
1.1
Commands
Phi
Profile
Module
You can load the modules by:
module load biocontainers
module load phipack
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phipack on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phipack
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phipack
phrap
Introduction
phrap
is a program for assembling shotgun DNA sequence data.
Versions
1.090518
Commands
phrap
Module
You can load the modules by:
module load biocontainers
module load phrap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phrap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phrap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phrap
phred
Introduction
phred
software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base.
Versions
0.071220.c
Commands
phred
Module
You can load the modules by:
module load biocontainers
module load phred
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phred on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phred
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phred
Phylofisher
Introduction
PhyloFisher is a software package written in Python3 that can be used for the creation, analysis, and visualization of phylogenomic datasets that consist of eukaryotic protein sequences.
Versions
1.2.7
1.2.9
Commands
aa_comp_calculator.py
aa_recoder.py
apply_to_db.py
astral_runner.py
backup_restoration.py
bipartition_examiner.py
build_database.py
config.py
edirect.py
explore_database.py
fast_site_remover.py
fast_taxa_remover.py
fisher.py
forest.py
genetic_code_examiner.py
gfmix_runner.py
heterotachy.py
informant.py
install_deps.py
jp.py
mammal_modeler.py
matrix_constructor.py
prep_final_dataset.py
purge.py
random_resampler.py
rst2html.py
rst2html4.py
rst2html5.py
rst2latex.py
rst2man.py
rst2odt.py
rst2odt_prepstyles.py
rst2pseudoxml.py
rst2s5.py
rst2xetex.py
rst2xml.py
rstpep2html.py
rtc_binner.py
runxlrd.py
select_orthologs.py
select_taxa.py
sgt_constructor.py
taxon_collapser.py
vba_extract.py
windowmasker_2.2.22_adapter.py
working_dataset_constructor.py
Module
You can load the modules by:
module load biocontainers
module load phylofisher
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phylofisher on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phylofisher
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phylofisher
Phylosuite
Introduction
PhyloSuite is an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies.
Versions
1.2.3
Commands
PhyloSuite.sh
Module
You can load the modules by:
module load biocontainers
module load phylosuite
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phylosuite on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phylosuite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phylosuite
Picard Tools
Introduction
Picard
is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Detailed usage can be found here: https://broadinstitute.github.io/picard/
Versions
2.25.1
2.26.10
Commands
picard
Module
You can load the modules by:
module load biocontainers
module load picard/2.26.10
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run picard on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=picard
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers picard/2.26.10
picard MarkDuplicates -Xmx64g I=19P0126636WES_sorted.bam O=19P0126636WES_sorted_md.bam M=19P0126636WES.sorted.markdup.txt REMOVE_DUPLICATES=true
picard BuildBamIndex -Xmx64g I=19P0126636WES_sorted_md.bam
picard CreateSequenceDictionary -R hg38.fa -O hg38.dict
Picrust2
Introduction
Picrust2
is software for predicting functional abundances based only on marker gene sequences.
Versions
2.4.2
2.5.0
Commands
add_descriptions.py
convert_table.py
hsp.py
metagenome_pipeline.py
pathway_pipeline.py
picrust2_pipeline.py
place_seqs.py
print_picrust2_config.py
run_abundance.py
run_sepp.py
run_tipp.py
run_tipp_tool.py
run_upp.py
shuffle_predictions.py
split_sequences.py
sumlabels.py
sumtrees.py
Module
You can load the modules by:
module load biocontainers
module load picrust2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Picrust2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 10
#SBATCH --job-name=picrust2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers picrust2
place_seqs.py -s ../seqs.fna -o out.tre -p 10 \
--intermediate intermediate/place_seqs
hsp.py -i 16S -t out.tre -o marker_predicted_and_nsti.tsv.gz -p 10 -n
hsp.py -i EC -t out.tre -o EC_predicted.tsv.gz -p 10
metagenome_pipeline.py -i ../table.biom -m marker_predicted_and_nsti.tsv.gz -f EC_predicted.tsv.gz -o EC_metagenome_out --strat_out
convert_table.py EC_metagenome_out/pred_metagenome_contrib.tsv.gz \
-c contrib_to_legacy \
-o EC_metagenome_out/pred_metagenome_contrib.legacy.tsv.gz
pathway_pipeline.py -i EC_metagenome_out/pred_metagenome_contrib.tsv.gz \
-o pathways_out -p 10
add_descriptions.py -i EC_metagenome_out/pred_metagenome_unstrat.tsv.gz -m EC \
-o EC_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
add_descriptions.py -i pathways_out/path_abun_unstrat.tsv.gz -m METACYC \
-o pathways_out/path_abun_unstrat_descrip.tsv.gz
picrust2_pipeline.py -s chemerin_16S/seqs.fna -i chemerin_16S/table.biom \
-o picrust2_out_pipeline -p 10
Pilon
Introduction
Pilon
is an automated genome assembly improvement and variant detection tool.
Versions
1.24
Commands
pilon.jar
Module
You can load the modules by:
module load biocontainers
module load pilon
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pilon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=pilon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pilon
pilon.jar --nostrays \
--genome scaffolds.fasta \
--frags out_sorted.bam \
--vcf --verbose --threads 12 \
--output pilon_corrected \
--outdir pilon_outdir
Pindel
Introduction
Pindel
is used to detect breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications, and other structural variants at single-base resolution from next-generation sequencing data.
Versions
0.2.5b9
Commands
pindel
pindel2vcf
Module
You can load the modules by:
module load biocontainers
module load pindel
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pindel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pindel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pindel
pindel -i simulated_config.txt -f simulated_reference.fa -o bamtest -c ALL
pindel -p COLO-829_20-p_ok.txt -f hs_ref_chr20.fa -o colontumor -c 20
pindel2vcf -r hs_ref_chr20.fa -R HUMAN_G1K_V2 -d 20100101 -p colontumor_D -e 5
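The first command reads its BAM list from `simulated_config.txt`. As a sketch of how such a file could be generated, the helper below writes one line per BAM file with the expected insert size and a sample label. `write_pindel_config`, the BAM names, insert sizes, and labels are hypothetical.

```shell
#!/bin/bash
# Sketch only: the BAM names, insert sizes, and labels are hypothetical.
# Each line of the file passed to `pindel -i` names one BAM file, its
# expected insert size, and a sample label.
write_pindel_config() {
    # $1 = output file; remaining arguments come in triples: bam size label
    out="$1"; shift
    : > "$out"
    while [ "$#" -ge 3 ]; do
        printf '%s %s %s\n' "$1" "$2" "$3" >> "$out"
        shift 3
    done
}

write_pindel_config simulated_config.txt \
    sample1.bam 250 sample1 \
    sample2.bam 300 sample2
cat simulated_config.txt
```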
Pirate
Introduction
Pirate
is a pangenome analysis and threshold evaluation toolbox.
Versions
1.0.4
Commands
PIRATE
FET.pl
PIRATE_to_Rtab.pl
PIRATE_to_roary.pl
SOAPsh.pl
ace.pl
analyse_blast_outputs.pl
analyse_loci_list.pl
annotate_treeWAS_output.pl
bamToGBrowse.pl
bdf2gdfont.pl
binhex.pl
bp_aacomp.pl
bp_biofetch_genbank_proxy.pl
bp_bioflat_index.pl
bp_biogetseq.pl
bp_blast2tree.pl
bp_bulk_load_gff.pl
bp_chaos_plot.pl
bp_classify_hits_kingdom.pl
bp_composite_LD.pl
bp_das_server.pl
bp_dbsplit.pl
bp_download_query_genbank.pl
bp_extract_feature_seq.pl
bp_fast_load_gff.pl
bp_fastam9_to_table.pl
bp_fetch.pl
bp_filter_search.pl
bp_find-blast-matches.pl
bp_flanks.pl
bp_gccalc.pl
bp_genbank2gff.pl
bp_genbank2gff3.pl
bp_generate_histogram.pl
bp_heterogeneity_test.pl
bp_hivq.pl
bp_hmmer_to_table.pl
bp_index.pl
bp_load_gff.pl
bp_local_taxonomydb_query.pl
bp_make_mrna_protein.pl
bp_mask_by_search.pl
bp_meta_gff.pl
bp_mrtrans.pl
bp_mutate.pl
bp_netinstall.pl
bp_nexus2nh.pl
bp_nrdb.pl
bp_oligo_count.pl
bp_parse_hmmsearch.pl
bp_process_gadfly.pl
bp_process_sgd.pl
bp_process_wormbase.pl
bp_query_entrez_taxa.pl
bp_remote_blast.pl
bp_revtrans-motif.pl
bp_search2alnblocks.pl
bp_search2gff.pl
bp_search2table.pl
bp_search2tribe.pl
bp_seq_length.pl
bp_seqconvert.pl
bp_seqcut.pl
bp_seqfeature_delete.pl
bp_seqfeature_gff3.pl
bp_seqfeature_load.pl
bp_seqpart.pl
bp_seqret.pl
bp_seqretsplit.pl
bp_split_seq.pl
bp_sreformat.pl
bp_taxid4species.pl
bp_taxonomy2tree.pl
bp_translate_seq.pl
bp_tree2pag.pl
bp_unflatten_seq.pl
cd-hit-2d-para.pl
cd-hit-clstr_2_blm8.pl
cd-hit-div.pl
cd-hit-para.pl
chrom_sizes.pl
clstr2tree.pl
clstr2txt.pl
clstr2xml.pl
clstr_cut.pl
clstr_list.pl
clstr_list_sort.pl
clstr_merge.pl
clstr_merge_noorder.pl
clstr_quality_eval.pl
clstr_quality_eval_by_link.pl
clstr_reduce.pl
clstr_renumber.pl
clstr_rep.pl
clstr_reps_faa_rev.pl
clstr_rev.pl
clstr_select.pl
clstr_select_rep.pl
clstr_size_histogram.pl
clstr_size_stat.pl
clstr_sort_by.pl
clstr_sort_prot_by.pl
clstr_sql_tbl.pl
clstr_sql_tbl_sort.pl
convert_to_distmat.pl
convert_to_treeWAS.pl
debinhex.pl
genomeCoverageBed.pl
legacy_blast.pl
make_multi_seq.pl
pangenome_variants_to_treeWAS.pl
paralogs_to_Rtab.pl
plot_2d.pl
plot_len1.pl
stag-autoschema.pl
stag-db.pl
stag-diff.pl
stag-drawtree.pl
stag-filter.pl
stag-findsubtree.pl
stag-flatten.pl
stag-grep.pl
stag-handle.pl
stag-itext2simple.pl
stag-itext2sxpr.pl
stag-itext2xml.pl
stag-join.pl
stag-merge.pl
stag-mogrify.pl
stag-parse.pl
stag-query.pl
stag-splitter.pl
stag-view.pl
stag-xml2itext.pl
stubmaker.pl
subsample_outputs.pl
subset_alignments.pl
unique_sequences.pl
update_blastdb.pl
Module
You can load the modules by:
module load biocontainers
module load pirate
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pirate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pirate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pirate
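The job script above only loads the module. A minimal sketch of a run follows, assuming a directory of GFF3 annotation files as input. `count_gffs` is a hypothetical helper, `gff_dir` and `pirate_out` are placeholder names, and `-i`, `-o`, and `-t` are assumed to be PIRATE's input, output, and thread options.

```shell
#!/bin/bash
# Sketch only: count_gffs is a hypothetical helper and the directory names
# are placeholders. PIRATE reads every GFF3 file in the input directory.
count_gffs() {
    # Print the number of *.gff files directly inside directory $1.
    find "$1" -maxdepth 1 -type f -name '*.gff' | wc -l | tr -d ' '
}

# In the job script, after `ml biocontainers pirate`:
#   if [ "$(count_gffs gff_dir)" -ge 2 ]; then
#       PIRATE -i gff_dir -o pirate_out -t "${SLURM_NTASKS:-1}"
#   fi
```

Checking the file count first avoids spending an allocation on an empty or mistyped input directory.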
Piscem
Introduction
piscem is a Rust wrapper for a next-generation index + mapper tool (still currently written in C++17).
Versions
0.4.3
Commands
piscem
Module
You can load the modules by:
module load biocontainers
module load piscem
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run piscem on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=piscem
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers piscem
Pixy
Introduction
pixy is a command-line tool for painlessly estimating average nucleotide diversity within (π) and between (dxy) populations from a VCF.
Versions
1.2.7
Commands
pixy
Module
You can load the modules by:
module load biocontainers
module load pixy
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pixy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pixy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pixy
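The job script above only loads the module. A minimal sketch of a pixy run follows. The sample IDs, population names, and `cohort.vcf.gz` are hypothetical, and the options are assumed to match pixy's command-line help.

```shell
#!/bin/bash
# Sketch only: sample IDs, population names, and file names are hypothetical.
# pixy's --populations file is headerless and tab-separated:
# column 1 is the sample ID as it appears in the VCF, column 2 the population.
printf '%s\t%s\n' \
    sample_01 popA \
    sample_02 popA \
    sample_03 popB \
    sample_04 popB > populations.txt

# Compute pi, dxy, and Fst in 10 kb windows. The VCF is assumed to be
# bgzip-compressed, tabix-indexed, and to contain invariant sites.
if command -v pixy >/dev/null 2>&1 && [ -s cohort.vcf.gz ]; then
    pixy --stats pi dxy fst \
        --vcf cohort.vcf.gz \
        --populations populations.txt \
        --window_size 10000 \
        --n_cores "${SLURM_NTASKS:-1}"
fi
```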
Plasmidfinder
Introduction
PlasmidFinder identifies plasmids in totally or partially sequenced bacterial isolates.
Versions
2.1.6
Commands
plasmidfinder.py
Module
You can load the modules by:
module load biocontainers
module load plasmidfinder
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run plasmidfinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plasmidfinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers plasmidfinder
plasmidfinder.py -p test/database \
-i test/test.fsa -o output -mp blastn -x -q
Platon
Introduction
Platon: identification and characterization of bacterial plasmid contigs from short-read draft assemblies.
Versions
1.6
Commands
platon
Module
You can load the modules by:
module load biocontainers
module load platon
Note
The environment variable PLATON_DB is set as /depot/itap/datasets/platon/db. This directory contains the required database.
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run platon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=platon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers platon
platon --verbose --threads 4 contigs.fasta
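The command above relies on the `PLATON_DB` variable set by the module. A small pre-flight check, sketched below, can fail early when the variable is missing or the directory is not reachable from the compute node. `check_platon_db` is a hypothetical helper, not part of Platon.

```shell
#!/bin/bash
# Sketch only: check_platon_db is a hypothetical helper, not part of Platon.
# It succeeds when PLATON_DB is set and points to an existing directory.
check_platon_db() {
    if [ -z "${PLATON_DB:-}" ]; then
        echo "PLATON_DB is not set; load the platon module first" >&2
        return 1
    fi
    if [ ! -d "$PLATON_DB" ]; then
        echo "PLATON_DB directory not found: $PLATON_DB" >&2
        return 1
    fi
    echo "Using Platon database: $PLATON_DB"
}

# In the job script, after `ml biocontainers platon`:
#   check_platon_db && platon --verbose --threads 4 contigs.fasta
```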
Platypus
Introduction
Platypus
is a tool designed for efficient and accurate variant-detection in high-throughput sequencing data.
Versions
0.8.1
Commands
platypus
Module
You can load the modules by:
module load biocontainers
module load platypus
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Platypus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=platypus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers platypus
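The job script above only loads the module. A minimal sketch of a variant-calling step follows, assuming a coordinate-sorted BAM file with an index next to it and an indexed reference FASTA. `has_bam_index` is a hypothetical helper and the file names are placeholders; the `callVariants` options are assumed to match the Platypus help text.

```shell
#!/bin/bash
# Sketch only: has_bam_index is a hypothetical helper; file names are placeholders.
# Succeeds when an index (sample.bam.bai or sample.bai) sits next to the BAM file.
has_bam_index() {
    [ -s "$1.bai" ] || [ -s "${1%.bam}.bai" ]
}

# In the job script, after `ml biocontainers platypus`:
#   has_bam_index input.bam &&
#       platypus callVariants --bamFiles=input.bam --refFile=ref.fa \
#           --output=variants.vcf
```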
Plink
Introduction
Plink
is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
Versions
1.90b6.21
Commands
plink
prettify
Module
You can load the modules by:
module load biocontainers
module load plink
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Plink on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plink
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers plink
plink --file toy --freq --out toy_analysis
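The command above expects `toy.ped` and `toy.map` in the working directory. As a sketch for testing the job script, the lines below create a minimal two-sample, two-SNP data set in PLINK text format. The IDs and genotypes are illustrative values, not real data.

```shell
#!/bin/bash
# Sketch only: a minimal two-sample, two-SNP data set in PLINK text format.
# toy.ped columns: family ID, individual ID, father, mother, sex,
# phenotype, then two alleles per SNP.
cat > toy.ped <<'EOF'
1 1000000000 0 0 1 1 0 0 1 1
1 1000000001 0 0 1 2 1 1 1 2
EOF

# toy.map columns: chromosome, SNP ID, genetic distance, base-pair position.
cat > toy.map <<'EOF'
1 rs0 0 1000
1 rs10 0 1001
EOF

# Allele frequencies are written to toy_analysis.frq.
if command -v plink >/dev/null 2>&1; then
    plink --file toy --freq --out toy_analysis
fi
```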
Plink2
Introduction
Plink2
is a whole genome association analysis toolset.
Versions
2.00a2.3
Commands
plink2
Module
You can load the modules by:
module load biocontainers
module load plink2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Plink2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plink2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers plink2
plink2 --bfile HapMap_3_r3_1 --freq --out HapMap_3_r3_1_out
Plotsr
Introduction
Plotsr generates high-quality visualisation of synteny and structural rearrangements between multiple genomes. For this, it uses the genomic structural annotations between multiple chromosome-level assemblies.
Versions
0.5.4
Commands
plotsr
Module
You can load the modules by:
module load biocontainers
module load plotsr
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run plotsr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plotsr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers plotsr
plotsr syri.out refgenome qrygenome -H 8 -W 5
Pomoxis
Introduction
Pomoxis comprises a set of basic bioinformatic tools tailored to nanopore sequencing. Notably, it includes tools for generating and analysing draft assemblies. Many of these tools are used by the research data analysis group at Oxford Nanopore Technologies.
Versions
0.3.9
Commands
assess_assembly
catalogue_errors
common_errors_from_bam
coverage_from_bam
coverage_from_fastx
fast_convert
find_indels
intersect_assembly_errors
long_fastx
mini_align
mini_assemble
pomoxis_path
qscores_from_summary
ref_seqs_from_bam
reverse_bed
split_fastx
stats_from_bam
subsample_bam
summary_from_stats
tag_bam
trim_alignments
Module
You can load the modules by:
module load biocontainers
module load pomoxis
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pomoxis on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=pomoxis
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pomoxis
assess_assembly \
-i helen_output/Staph_Aur_draft_helen.fa \
-r truth_assembly_staph_aur.fasta \
-p polished_assembly_quality \
-l 50 \
-t 4 \
-e \
-T
Poppunk
Introduction
PopPUNK is a tool for clustering genomes. We refer to the clusters as variable-length-k-mer clusters, or VLKCs. Biologically, these clusters typically represent distinct strains. We refer to subclusters of strains as lineages.
Versions
2.5.0
2.6.0
Commands
poppunk
poppunk_add_weights.py
poppunk_assign
poppunk_batch_mst.py
poppunk_calculate_rand_indices.py
poppunk_calculate_silhouette.py
poppunk_easy_run.py
poppunk_extract_components.py
poppunk_extract_distances.py
poppunk_info
poppunk_iterate.py
poppunk_mandrake
poppunk_mst
poppunk_references
poppunk_visualise
Module
You can load the modules by:
module load biocontainers
module load poppunk
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run poppunk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=poppunk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers poppunk
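The job above only loads the module. As a hedged sketch, a typical first PopPUNK run creates a sketch database and then fits a model. Here rfiles.txt is a placeholder for a tab-separated list of sample names and assembly paths, and example_db and example_bgmm are hypothetical output names.

```shell
# Write a PopPUNK job script that builds a database and fits a model.
cat > poppunk_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=poppunk

module --force purge
ml biocontainers poppunk

# Step 1: sketch the assemblies listed in rfiles.txt and compute distances.
poppunk --create-db --r-files rfiles.txt --output example_db --threads 4

# Step 2: fit a two-dimensional Gaussian mixture model to the distances.
poppunk --fit-model bgmm --ref-db example_db --output example_bgmm
EOF

# Check the script for shell syntax errors without running it.
bash -n poppunk_job.sh && echo "poppunk_job.sh: syntax OK"
```

The thread count passed to --threads is kept equal to the number of tasks requested with -n.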
Popscle
Introduction
Popscle
is a suite of population scale analysis tools for single-cell genomics data.
Versions
0.1b
Commands
popscle
Module
You can load the modules by:
module load biocontainers
module load popscle
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Popscle on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=popscle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers popscle
popscle dsc-pileup --sam data/$bam --vcf data/$ref_vcf --out data/$pileup
Pplacer
Introduction
Pplacer places query sequences on a fixed reference phylogenetic tree to maximize phylogenetic likelihood or posterior probability according to a reference alignment. The guppy program handles all of the downstream analysis of placements, and rppr provides utilities for working with reference packages.
For more information, please check:
BioContainers: https://biocontainers.pro/tools/pplacer
Home page: https://matsen.fhcrc.org/pplacer/
Versions
1.1.alpha19
Commands
pplacer
guppy
rppr
Module
You can load the modules by:
module load biocontainers
module load pplacer
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pplacer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pplacer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pplacer
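The example job above ends after the module is loaded. A minimal sketch of a placement run is shown below, assuming a reference package directory my.refpkg and a query alignment query_aligned.fasta (both placeholder names). By default pplacer writes a .jplace file named after the query file, which guppy can then turn into a tree.

```shell
# Write a pplacer job script (reference package and query names are placeholders).
cat > pplacer_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pplacer

module --force purge
ml biocontainers pplacer

# Place the aligned query sequences on the reference tree in my.refpkg.
pplacer -c my.refpkg query_aligned.fasta

# Build a tree with each placed query shown as a pendant edge.
guppy tog query_aligned.jplace
EOF

# Check the script for shell syntax errors without running it.
bash -n pplacer_job.sh && echo "pplacer_job.sh: syntax OK"
```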
Prinseq
Introduction
Prinseq
is a tool that generates summary statistics of sequence and quality data and that is used to filter, reformat and trim next-generation sequence data.
Versions
0.20.4
Commands
prinseq-graphs-noPCA.pl
prinseq-graphs.pl
prinseq-lite.pl
Module
You can load the modules by:
module load biocontainers
module load prinseq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Prinseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=prinseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers prinseq
prinseq-lite.pl -verbose -fastq SRR5043021_1.fastq -fastq2 SRR5043021_2.fastq -graph_data test.gd -out_good null -out_bad null
prinseq-graphs.pl -i test.gd -png_all -o test
prinseq-graphs-noPCA.pl -i test.gd -png_all -o test_noPCA
Prodigal
Introduction
Prodigal
is a tool for fast, reliable protein-coding gene prediction for prokaryotic genomes.
Versions
2.6.3
Commands
prodigal
Module
You can load the modules by:
module load biocontainers
module load prodigal
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Prodigal on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=prodigal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers prodigal
prodigal -i genome.fasta -o output.genes -a proteins.faa
Prokka
Introduction
Prokka
is a pipeline for rapidly annotating prokaryotic genomes. It produces GFF3, GBK and SQN files that are ready for editing in Sequin and ultimately submitted to GenBank/DDBJ/ENA.
Detailed usage can be found here: https://github.com/tseemann/prokka
Versions
1.14.6
Commands
prokka
prokka-abricate_to_fasta_db
prokka-biocyc_to_fasta_db
prokka-build_kingdom_dbs
prokka-cdd_to_hmm
prokka-clusters_to_hmm
prokka-genbank_to_fasta_db
prokka-genpept_to_fasta_db
prokka-hamap_to_hmm
prokka-tigrfams_to_hmm
prokka-uniprot_to_fasta_db
Module
You can load the modules by:
module load biocontainers
module load prokka
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run prokka on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=prokka
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers prokka
prokka --compliant --centre UoN --outdir PRJEB12345 --locustag EHEC --prefix EHEC-Chr1 contigs.fa --cpus 24
prokka-genbank_to_fasta_db Coccus1.gbk Coccus2.gbk Coccus3.gbk Coccus4.gbk > Coccus.faa
Proteinortho
Introduction
Proteinortho
is a tool to detect orthologous genes within different species.
Versions
6.0.33
Commands
proteinortho
proteinortho2html.pl
proteinortho2tree.pl
proteinortho2xml.pl
proteinortho6.pl
proteinortho_cleanupblastgraph
proteinortho_clustering
proteinortho_compareProteinorthoGraphs.pl
proteinortho_do_mcl.pl
proteinortho_extract_from_graph.pl
proteinortho_ffadj_mcs.py
proteinortho_formatUsearch.pl
proteinortho_grab_proteins.pl
proteinortho_graphMinusRemovegraph
proteinortho_history.pl
proteinortho_singletons.pl
proteinortho_summary.pl
proteinortho_treeBuilderCore
Module
You can load the modules by:
module load biocontainers
module load proteinortho
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Proteinortho on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=proteinortho
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers proteinortho
proteinortho6.pl test/C.faa test/E.faa test/L.faa test/M.faa
ProtHint
Introduction
ProtHint
is a pipeline for predicting and scoring hints (in the form of introns, start and stop codons) in the genome of interest by mapping and spliced aligning predicted genes to a database of reference protein sequences.
Versions
2.6.0
Commands
cds_with_upstream_support.py
combine_gff_records.pl
count_cds_overlaps.py
flag_top_proteins.py
gff_from_region_to_contig.pl
make_chains.py
nucseq_for_selected_genes.pl
print_high_confidence.py
print_longest_isoform.py
proteins_from_gtf.pl
prothint.py
prothint2augustus.py
run_spliced_alignment.pl
run_spliced_alignment_pbs.pl
select_best_proteins.py
select_for_next_iteration.py
spalnBatch.sh
spaln_to_gff.py
Academic license
ProtHint depends on GeneMark. To use GeneMark, users need to download the license file themselves.
Go to the GeneMark web site: http://exon.gatech.edu/GeneMark/license_download.cgi. Check the boxes for GeneMark-ES/ET/EP ver 4.69_lic
and LINUX 64
next to it, fill out the form, then click “I agree”. On the next page, right-click and copy the link address for the 64 bit
license. Paste the link address into the commands below:
cd $HOME
wget "replace with license URL"
zcat gm_key_64.gz > .gm_key
Module
You can load the modules by:
module load biocontainers
module load prothint
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ProtHint on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=prothint
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers prothint
prothint.py --threads 4 input/genome.fasta input/proteins.fasta --geneSeeds input/genemark.gtf --workdir test
Pullseq
Introduction
Pullseq is a utility program for extracting sequences from a fasta/fastq file.
Versions
1.0.2
Commands
pcre-config
pcregrep
pcretest
pullseq
seqdiff
Module
You can load the modules by:
module load biocontainers
module load pullseq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pullseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pullseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pullseq
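The script above does not yet run pullseq. As a minimal sketch, the job below extracts the sequences whose names are listed in names.txt (one name per line) from input.fasta. All three file names are placeholders.

```shell
# Write a pullseq job script (file names are placeholders).
cat > pullseq_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pullseq

module --force purge
ml biocontainers pullseq

# Keep only the sequences named in names.txt; results go to standard output.
pullseq -i input.fasta -n names.txt > selected.fasta
EOF

# Check the script for shell syntax errors without running it.
bash -n pullseq_job.sh && echo "pullseq_job.sh: syntax OK"
```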
Purge_dups
Introduction
purge_dups is designed to remove haplotigs and contig overlaps in a de novo assembly based on read depth.
Versions
1.2.6
Commands
augustify.py
bamToWig.py
cleanup-blastdb-volumes.py
edirect.py
executeTestCGP.py
extractAnno.py
findRepetitiveProtSeqs.py
fix_in_frame_stop_codon_genes.py
generate_plot.py
getAnnoFastaFromJoingenes.py
hist_plot.py
pd_config.py
run_abundance.py
run_purge_dups.py
run_sepp.py
run_tipp.py
run_tipp_tool.py
run_upp.py
split_sequences.py
stringtie2fa.py
sumlabels.py
sumtrees.py
Module
You can load the modules by:
module load biocontainers
module load purge_dups
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run purge_dups on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=purge_dups
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers purge_dups
Pvactools
Introduction
pVACtools is a cancer immunotherapy tools suite consisting of pVACseq, pVACbind, pVACfuse, pVACvector, and pVACview.
Versions
3.0.1
Commands
pvacbind
pvacfuse
pvacseq
pvactools
pvacvector
pvacview
Module
You can load the modules by:
module load biocontainers
module load pvactools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pvactools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pvactools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pvactools
pvacseq download_example_data .
pvacseq run \
pvacseq_example_data/input.vcf \
Test \
HLA-A*02:01,HLA-B*35:01,DRB1*11:01 \
MHCflurry MHCnuggetsI MHCnuggetsII NNalign NetMHC PickPocket SMM SMMPMBEC SMMalign \
pvacseq_output_data \
-e1 8,9,10 \
-e2 15 \
--iedb-install-directory /opt/iedb
Pyani
Introduction
Pyani
is an application and Python module for whole-genome classification of microbes using Average Nucleotide Identity.
Versions
0.2.11
0.2.12
Commands
average_nucleotide_identity.py
genbank_get_genomes_by_taxon.py
delta_filter_wrapper.py
Module
You can load the modules by:
module load biocontainers
module load pyani
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pyani on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyani
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyani
average_nucleotide_identity.py -i tests/ -o tests/test_ANIm_output -m ANIm -g
average_nucleotide_identity.py -i tests/ -o tests/test_ANIb_output -m ANIb -g
average_nucleotide_identity.py -i tests/ -o tests/test_ANIblastall_output -m ANIblastall -g
average_nucleotide_identity.py -i tests/ -o tests/test_TETRA_output -m TETRA -g
Pybedtools
Introduction
Pybedtools
wraps and extends BEDTools and offers feature-level manipulations from within Python.
Versions
0.9.0
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load pybedtools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pybedtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pybedtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pybedtools
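Because this module exposes only python, the job needs a Python script to run. The sketch below writes a hypothetical helper, intersect.py, that keeps the features of a.bed overlapping any feature of b.bed, plus a job script that runs it. All file names are placeholders.

```shell
# Hypothetical helper script using the pybedtools API.
cat > intersect.py <<'EOF'
import pybedtools

a = pybedtools.BedTool("a.bed")
b = pybedtools.BedTool("b.bed")

# u=True reports each feature of a once if it overlaps anything in b.
a.intersect(b, u=True).saveas("a_overlapping_b.bed")
EOF

# Job script that runs the helper inside the pybedtools container.
cat > pybedtools_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pybedtools

module --force purge
ml biocontainers pybedtools

python intersect.py
EOF

# Check the job script for shell syntax errors without running it.
bash -n pybedtools_job.sh && echo "pybedtools_job.sh: syntax OK"
```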
Pybigwig
Introduction
Pybigwig
is a python extension, written in C, for quick access to bigBed files and access to and creation of bigWig files.
Versions
0.3.18
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load pybigwig
Interactive job
To run pybigwig interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers pybigwig
(base) UserID@bell-a008:~ $ python
Python 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 18:49:41)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyBigWig
>>> bw = pyBigWig.open("test/test.bw")
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run batch jobs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pybigwig
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pybigwig
python script.py
Pychopper
Introduction
Pychopper is a tool to identify, orient and trim full-length Nanopore cDNA reads. The tool is also able to rescue fused reads.
Versions
2.5.0
Commands
cdna_classifier.py
Module
You can load the modules by:
module load biocontainers
module load pychopper
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pychopper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pychopper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pychopper
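The job above loads the module but does not process any reads. A minimal sketch follows, assuming a placeholder input file input.fq. The report, unclassified, rescued and full-length output names are likewise illustrative.

```shell
# Write a pychopper job script (file names are placeholders).
cat > pychopper_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pychopper

module --force purge
ml biocontainers pychopper

# Classify, orient and trim reads; write a PDF report plus
# unclassified and rescued reads to separate files.
cdna_classifier.py -r report.pdf -u unclassified.fq -w rescued.fq input.fq full_length_output.fq
EOF

# Check the script for shell syntax errors without running it.
bash -n pychopper_job.sh && echo "pychopper_job.sh: syntax OK"
```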
Pycoqc
Introduction
Pycoqc
is a tool that computes metrics and generates interactive QC plots for Oxford Nanopore Technologies sequencing data.
Versions
2.5.2
Commands
pycoQC
python
python3
Module
You can load the modules by:
module load biocontainers
module load pycoqc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pycoqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pycoqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pycoqc
pycoQC \
-f Albacore-1.2.1_basecall-1D-DNA_sequencing_summary.txt \
-o Albacore-1.2.1_basecall-1D-DNA.html \
--quiet
Pyensembl
Introduction
Pyensembl
is a Python interface to Ensembl reference genome metadata such as exons and transcripts.
Versions
1.9.4
Commands
pyensembl
python
python3
Module
You can load the modules by:
module load biocontainers
module load pyensembl
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pyensembl on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyensembl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyensembl
Pyfaidx
Introduction
Pyfaidx
is a Python package for random access and indexing of fasta files.
Versions
0.6.4
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load pyfaidx
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pyfaidx on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyfaidx
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyfaidx
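The module provides python, so the job needs a script to execute. As a minimal sketch, the snippet below writes a hypothetical helper, seq_lengths.py, that prints the name and length of every sequence in genome.fasta (a placeholder file name), together with a job script that runs it.

```shell
# Hypothetical helper script using the pyfaidx API.
cat > seq_lengths.py <<'EOF'
from pyfaidx import Fasta

# Opening the file builds a .fai index next to it if none exists yet.
genome = Fasta("genome.fasta")
for name in genome.keys():
    print(name, len(genome[name]))
EOF

# Job script that runs the helper inside the pyfaidx container.
cat > pyfaidx_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyfaidx

module --force purge
ml biocontainers pyfaidx

python seq_lengths.py
EOF

# Check the job script for shell syntax errors without running it.
bash -n pyfaidx_job.sh && echo "pyfaidx_job.sh: syntax OK"
```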
Pygenometracks
Introduction
pyGenomeTracks aims to produce high-quality genome browser tracks that are highly customizable.
Versions
3.7
Commands
make_tracks_file
pyGenomeTracks
Module
You can load the modules by:
module load biocontainers
module load pygenometracks
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pygenometracks on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pygenometracks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pygenometracks
make_tracks_file --trackFiles domains.bed bigwig.bw -o tracks.ini
pyGenomeTracks --tracks tracks.ini \
--region chr2:10,000,000-11,000,000 --outFileName nice_image.pdf
Pygenomeviz
Introduction
pyGenomeViz is a genome visualization python package for comparative genomics implemented based on matplotlib.
Versions
0.2.2
0.3.2
Commands
pgv-download-dataset
pgv-mmseqs
pgv-mummer
pgv-pmauve
python
python3
Module
You can load the modules by:
module load biocontainers
module load pygenomeviz
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pygenomeviz on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pygenomeviz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pygenomeviz
Pyranges
Introduction
Pyranges
are collections of intervals that support comparison operations (like overlap and intersect) and other methods that are useful for genomic analyses.
Versions
0.0.115
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load pyranges
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pyranges on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyranges
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyranges
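The job above does not run anything after loading the module. The sketch below adds a hypothetical helper, overlap.py, that reports the intervals in genes.bed overlapping at least one interval in peaks.bed. The file names are placeholders.

```shell
# Hypothetical helper script using the pyranges API.
cat > overlap.py <<'EOF'
import pyranges as pr

genes = pr.read_bed("genes.bed")
peaks = pr.read_bed("peaks.bed")

# Keep the gene intervals that overlap any peak and save them as BED.
genes.overlap(peaks).to_bed("genes_with_peaks.bed")
EOF

# Job script that runs the helper inside the pyranges container.
cat > pyranges_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyranges

module --force purge
ml biocontainers pyranges

python overlap.py
EOF

# Check the job script for shell syntax errors without running it.
bash -n pyranges_job.sh && echo "pyranges_job.sh: syntax OK"
```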
Pysam
Introduction
Pysam
is a python module that makes it easy to read and manipulate mapped short read sequence data stored in SAM/BAM files.
Versions
0.18.0
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load pysam
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pysam on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pysam
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pysam
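Since the module only provides python, a job needs a script that imports pysam. A minimal sketch: the hypothetical helper count_mapped.py counts the mapped reads in sample.bam (a placeholder name), and the job script runs it.

```shell
# Hypothetical helper script using the pysam API.
cat > count_mapped.py <<'EOF'
import pysam

# Open the BAM file for reading and count reads not flagged as unmapped.
with pysam.AlignmentFile("sample.bam", "rb") as bam:
    mapped = sum(1 for read in bam if not read.is_unmapped)
print("mapped reads:", mapped)
EOF

# Job script that runs the helper inside the pysam container.
cat > pysam_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pysam

module --force purge
ml biocontainers pysam

python count_mapped.py
EOF

# Check the job script for shell syntax errors without running it.
bash -n pysam_job.sh && echo "pysam_job.sh: syntax OK"
```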
Pyvcf3
Introduction
PyVCF3 was created because the official PyVCF repository is no longer maintained and does not accept pull requests.
Versions
1.0.3
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load pyvcf3
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pyvcf3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyvcf3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyvcf3
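The job above loads the module without running a script. As a sketch, the hypothetical helper list_variants.py below prints the chromosome, position, reference and alternate alleles of each record in sample.vcf (a placeholder name). PyVCF3 is imported under the name vcf.

```shell
# Hypothetical helper script using the PyVCF3 API.
cat > list_variants.py <<'EOF'
import vcf

reader = vcf.Reader(filename="sample.vcf")
for record in reader:
    print(record.CHROM, record.POS, record.REF, record.ALT)
EOF

# Job script that runs the helper inside the pyvcf3 container.
cat > pyvcf3_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyvcf3

module --force purge
ml biocontainers pyvcf3

python list_variants.py
EOF

# Check the job script for shell syntax errors without running it.
bash -n pyvcf3_job.sh && echo "pyvcf3_job.sh: syntax OK"
```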
QIIME 2
Introduction
QIIME 2
is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results.
Versions
2021.2
2022.11
2022.2
2022.8
2023.2
2023.5
Commands
qiime
python
python3
Module
You can load the modules by:
module load biocontainers
module load qiime2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run QIIME 2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qiime2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers qiime2
qiime metadata tabulate \
--m-input-file rep-seqs.qza \
--m-input-file taxonomy.qza \
--o-visualization tabulated-feature-metadata.qzv
Qtlseq
Introduction
Bulked segregant analysis, as implemented in QTL-seq (Takagi et al., 2013), is a powerful and efficient method to identify agronomically important loci in crop plants. QTL-seq was adapted from MutMap to identify quantitative trait loci. It utilizes sequences pooled from two segregating progeny populations with extreme opposite traits (e.g. resistant vs susceptible) and a single whole-genome resequencing of either of the parental cultivars.
Versions
2.2.3
Commands
qtlseq
Module
You can load the modules by:
module load biocontainers
module load qtlseq
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run qtlseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qtlseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers qtlseq
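The script above stops after loading the module. A minimal sketch of a full run, matching the inputs described in the introduction, is given below. The reference, parent and bulk FASTQ names are placeholders, and 20 individuals per bulk is an arbitrary illustrative value.

```shell
# Write a QTL-seq job script (file names and bulk sizes are placeholders).
cat > qtlseq_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qtlseq

module --force purge
ml biocontainers qtlseq

# -p: parental cultivar reads; -b1/-b2: the two bulks with opposite traits;
# -n1/-n2: number of individuals pooled in each bulk.
qtlseq -r reference.fasta \
       -p parent.1.fastq,parent.2.fastq \
       -b1 bulk_1.1.fastq,bulk_1.2.fastq \
       -b2 bulk_2.1.fastq,bulk_2.2.fastq \
       -n1 20 -n2 20 \
       -o qtlseq_out
EOF

# Check the script for shell syntax errors without running it.
bash -n qtlseq_job.sh && echo "qtlseq_job.sh: syntax OK"
```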
Qualimap
Introduction
Qualimap
is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.
Versions
2.2.1
Commands
qualimap
Module
You can load the modules by:
module load biocontainers
module load qualimap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Qualimap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qualimap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers qualimap
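The job above does not yet call qualimap. As a hedged sketch, the script below runs the BAM quality-control mode on a placeholder file sample.bam. Unsetting DISPLAY is a precaution for headless compute nodes and is an assumption, not something this guide prescribes. The 4G Java memory size is an illustrative value.

```shell
# Write a Qualimap job script (file names and memory size are placeholders).
cat > qualimap_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=qualimap

module --force purge
ml biocontainers qualimap

# Assumption: avoid any attempt to open a GUI on a compute node.
unset DISPLAY

# Run BAM QC with 4 threads and write the report to qualimap_out.
qualimap bamqc -bam sample.bam -outdir qualimap_out -nt 4 --java-mem-size=4G
EOF

# Check the script for shell syntax errors without running it.
bash -n qualimap_job.sh && echo "qualimap_job.sh: syntax OK"
```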
Quast
Introduction
Quast
is a quality assessment tool for genome assemblies.
Note: To run QUAST, call quast.py or metaquast.py directly, e.g. quast.py fastafile [OTHER OPTIONS]. Do NOT call it as ‘python quast.py’ or ‘python metaquast.py’.
Versions
5.0.2
5.2.0
Commands
quast.py
metaquast.py
Module
You can load the modules by:
module load biocontainers
module load quast
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Quast on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=quast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers quast
metaquast.py --gene-finding --threads 8 \
meta_contigs_1.fasta meta_contigs_2.fasta \
-r meta_ref_1.fasta,meta_ref_2.fasta,meta_ref_3.fasta \
-o quast_out_genefinding
QuickMIRSeq
Introduction
QuickMIRSeq
is an integrated pipeline for quick and accurate quantification of known miRNAs and isomiRs by jointly processing multiple samples.
Versions
1.0
Commands
perl
QuickMIRSeq-report.sh
Module
You can load the modules by:
module load biocontainers
module load quickmirseq
Note
This module defines the program installation directory (note: inside the container!) as the environment variable $QuickMIRSeq
. Once again, this is not a host path; it is only available from inside the container.
With the way this module is organized, you should be able to use the variable freely for both the perl $QuickMIRSeq/QuickMIRSeq.pl allIDs.txt run.config
and the $QuickMIRSeq/QuickMIRSeq-report.sh
steps as directed by the user guide.
A simple QuickMIRSeq.pl
and QuickMIRSeq-report.sh
will also work (and can serve as a backup if the variable expansion somehow does not work for you).
You will also need a run configuration file. You can copy an existing one, take one from the user guide, or, as a last resort, use Singularity to copy the template (in $QuickMIRSeq/run.config.template
) from inside the container image. singularity shell
may be the easiest way to do the latter.
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run QuickMIRSeq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=quickmirseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers quickmirseq
perl $QuickMIRSeq/QuickMIRSeq.pl allIDs.txt run.config
$QuickMIRSeq/QuickMIRSeq-report.sh
R
Introduction
R
is a system for statistical computation and graphics.
This is a plain R-base installation (see https://github.com/rocker-org/rocker/) repackaged by RCAC with the addition of a handful of prerequisite libraries (libcurl, libopenssl, libxml2, libcairo2 and libXt) and their header files.
Versions
4.1.1
Commands
R
Rscript
Module
You can load the modules by:
module load biocontainers
module load r
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run R on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r
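The job above loads R but does not execute anything. A minimal sketch: the snippet writes a small hypothetical R script, analysis.R, and a job script that runs it non-interactively with Rscript.

```shell
# Hypothetical R script: compute and print the mean of a small vector.
cat > analysis.R <<'EOF'
x <- c(2.1, 3.4, 1.9, 4.2)
cat("mean:", mean(x), "\n")
EOF

# Job script that runs the R script inside the container.
cat > r_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r

module --force purge
ml biocontainers r

Rscript analysis.R
EOF

# Check the job script for shell syntax errors without running it.
bash -n r_job.sh && echo "r_job.sh: syntax OK"
```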
Racon
Introduction
Racon
is a consensus module for raw de novo DNA assembly of long uncorrected reads.
Versions
1.4.20
1.5.0
Commands
racon
Module
You can load the modules by:
module load biocontainers
module load racon
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Racon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=racon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers racon
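The script above ends after loading the module. As a sketch, one polishing round takes the raw reads, their overlaps against the draft assembly, and the draft itself, and writes the consensus to standard output. The file names are placeholders, and overlaps.paf is assumed to have been produced beforehand by a separate mapping step.

```shell
# Write a Racon job script (file names are placeholders).
cat > racon_job.sh <<'EOF'
#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=racon

module --force purge
ml biocontainers racon

# Arguments: reads, overlaps of the reads against the draft, draft assembly.
racon -t 4 reads.fastq overlaps.paf draft.fasta > polished.fasta
EOF

# Check the script for shell syntax errors without running it.
bash -n racon_job.sh && echo "racon_job.sh: syntax OK"
```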
Ragout
Introduction
Ragout
is a tool for chromosome-level scaffolding using multiple references.
Versions
2.3
Commands
ragout
Module
You can load the modules by:
module load biocontainers
module load ragout
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ragout on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ragout
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ragout
Ragtag
Introduction
Ragtag
is a tool for fast reference-guided genome assembly scaffolding.
Versions
2.1.0
Commands
ragtag.py
Module
You can load the modules by:
module load biocontainers
module load ragtag
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ragtag on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ragtag
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ragtag
ragtag.py correct ref.fasta query.fasta
ragtag.py patch target.fa query.fa
Rapmap
Introduction
RapMap is a testing ground for ideas in quasi-mapping and selective alignment.
Versions
0.6.0
Commands
rapmap
Module
You can load the modules by:
module load biocontainers
module load rapmap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rapmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rapmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rapmap
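The job script above does not include a mapping command. The following is a sketch of a typical two-step run (build a quasi-index, then map paired-end reads) that could be appended to it. The file and directory names are hypothetical, and the commands are skipped when rapmap or the inputs are missing.

```shell
# Hypothetical inputs and outputs; replace with your own data.
transcripts="transcripts.fasta"
index_dir="rapmap_index"
reads_1="reads_1.fastq"
reads_2="reads_2.fastq"
threads="${SLURM_NTASKS:-1}"   # follow the -n value of the job; 1 outside Slurm

if command -v rapmap >/dev/null 2>&1 && [ -f "$transcripts" ] && [ -f "$reads_1" ] && [ -f "$reads_2" ]; then
    # Step 1: index the reference transcriptome.
    rapmap quasiindex -t "$transcripts" -i "$index_dir"
    # Step 2: map the paired-end reads and write SAM output.
    rapmap quasimap -i "$index_dir" -1 "$reads_1" -2 "$reads_2" \
        -t "$threads" -o mapped_reads.sam
else
    echo "rapmap or input files not found; nothing was run." >&2
fi
```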
Rasusa
Introduction
Rasusa: Randomly subsample sequencing reads to a specified coverage.
Versions
0.6.0
0.7.0
Commands
rasusa
Module
You can load the modules by:
module load biocontainers
module load rasusa
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rasusa on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rasusa
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rasusa
rasusa -i seq_1.fq -i seq_2.fq \
--coverage 100 --genome-size 35mb \
-o out.r1.fq -o out.r2.fq
Raven-assembler
Introduction
Raven-assembler
is a de novo genome assembler for long uncorrected reads.
Versions
1.8.1
Commands
raven
Module
You can load the modules by:
module load biocontainers
module load raven-assembler
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Raven-assembler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=raven-assembler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers raven-assembler
raven -t 12 input.fastq > assembly.fasta
Raxml
Introduction
Raxml
(Randomized Axelerated Maximum Likelihood) is a program for the Maximum Likelihood-based inference of large phylogenetic trees.
Versions
8.2.12
Commands
raxmlHPC
raxmlHPC-AVX2
raxmlHPC-PTHREADS
raxmlHPC-PTHREADS-AVX2
raxmlHPC-PTHREADS-SSE3
raxmlHPC-SSE3
Module
You can load the modules by:
module load biocontainers
module load raxml
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Raxml on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 36
#SBATCH --job-name=raxml
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers raxml
raxmlHPC-SSE3 -m GTRGAMMA -p 12345 -s input.fasta -n HPC-SSE3_out -# 20 -T 36
raxmlHPC -m GTRGAMMA -p 12345 -s input.fasta -n HPC_out -# 20 -T 36
raxmlHPC-AVX2 -m GTRGAMMA -p 12345 -s input.fasta -n HPC-AVX2_out -# 20 -T 36
raxmlHPC-PTHREADS -m GTRGAMMA -p 12345 -s input.fasta -n HPC-PTHREADS_out -# 20 -T 36
raxmlHPC-PTHREADS-AVX2 -m GTRGAMMA -p 12345 -s input.fasta -n HPC-PTHREADS-AVX2_out -# 20 -T 36
raxmlHPC-PTHREADS-SSE3 -m GTRGAMMA -p 12345 -s input.fasta -n HPC-PTHREADS-SSE3_out -# 20 -T 36
Raxml-ng
Introduction
Raxml-ng
is a phylogenetic tree inference tool which uses maximum-likelihood (ML) optimality criterion.
Versions
1.1.0
Commands
raxml-ng
raxml-ng-mpi
mpirun
mpiexec
Module
You can load the modules by:
module load biocontainers
module load raxml-ng
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Raxml-ng on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=raxml-ng
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers raxml-ng
raxml-ng --bootstrap --msa alignment.phy \
--model GTR+G --threads 12 --bs-trees 1000
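The value passed to --threads should not exceed the number of cores requested with -n. One way to keep the two in sync, sketched here with the same options as the example above, is to read the core count from the SLURM_NTASKS variable that Slurm sets inside the job. The fallback to 1 outside Slurm is an assumption of this sketch.

```shell
msa="alignment.phy"            # alignment file from the example above
threads="${SLURM_NTASKS:-1}"   # cores granted by Slurm; 1 when run outside a job

echo "Running raxml-ng with $threads thread(s)"
if command -v raxml-ng >/dev/null 2>&1 && [ -f "$msa" ]; then
    raxml-ng --bootstrap --msa "$msa" \
        --model GTR+G --threads "$threads" --bs-trees 1000
fi
```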
R-cellchat
Introduction
CellChat: Inference and analysis of cell-cell communication.
Versions
1.5.0
Commands
R
Rscript
rstudio
Module
You can load the modules by:
module load biocontainers
module load r-cellchat
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run r-cellchat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r-cellchat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r-cellchat
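The job script above only loads the module. A quick sanity check that could be appended to it is sketched below: it asks R whether the CellChat package can be found and prints its version. The guard and the messages are illustrative choices, not part of the module.

```shell
pkg="CellChat"   # package expected to be provided by the r-cellchat module

if command -v Rscript >/dev/null 2>&1; then
    # The package name is passed to R as a trailing command-line argument.
    Rscript -e 'pkg <- commandArgs(trailingOnly = TRUE)[1]; if (requireNamespace(pkg, quietly = TRUE)) cat(pkg, as.character(packageVersion(pkg)), "\n") else cat(pkg, "is not available in this R\n")' "$pkg"
else
    echo "Rscript not found; load it with: ml biocontainers r-cellchat" >&2
fi
```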
Reapr
Introduction
Reapr is a tool that evaluates the accuracy of a genome assembly using mapped paired end reads.
Notes provided by Neelam Jha
Reapr tries to find explicit errors in an assembly based on incongruently mapped reads. It relies heavily on detecting regions where span coverage is too low, or where reads map too far from or too close to each other. The program will also break up contigs/scaffolds at spurious sites to form smaller (but hopefully correct) contigs. Unfortunately, Reapr runs rather slowly.
Reapr is a bit fussy about contig names, but luckily it provides a tool to check whether the names are acceptable before we proceed. The command reapr facheck <assembly.fasta>
will tell you whether everything is OK. In this case, no output is good output, since the only output from the command is a list of potential problems with the contig names. If you run into any problems, run reapr facheck <assembly.fasta> <renamed_assembly.fasta>
, and you will get an assembly file with renamed contigs.
Once the names are OK, we continue:
The first thing reapr needs is a list of all “perfect” reads. These are reads that map perfectly to the reference. Reapr is finicky, though, and cannot use libraries with different read lengths, so you will have to use assemblies based on the raw data for this. Run the command reapr perfectmap
to get information on how to create a perfect mapping file, and create a perfect mapping called <assembler>_perfect
.
The next tool we need is reapr smaltmap
, which creates a BAM file of read-pair mappings. Do the same thing you did with perfectmap and create an output file called <assembler>_smalt.bam
.
Finally, we can use the smalt mapping and the perfect mapping to run the reapr pipeline
. Run reapr pipeline
to get help on how to run it, and then run the pipeline. Store the results in reapr_<assembler>
.
Versions
1.0.18
Commands
reapr
Module
You can load the modules by:
module load biocontainers
module load reapr
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run reapr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=reapr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers reapr
reapr facheck Assembly.fasta renamedAssembly.fasta
reapr perfectmap renamedAssembly.fasta reads_1.fastq reads_2.fastq 100 outputPrefix
reapr smaltmap renamedAssembly.fasta reads_1.fastq reads_2.fastq mapped.bam
reapr pipeline renamedAssembly.fasta mapped.bam pipeoutdir outputPrefix
Rebaler
Introduction
Rebaler
is a program for conducting reference-based assemblies using long reads.
Versions
0.2.0
Commands
rebaler
Module
You can load the modules by:
module load biocontainers
module load rebaler
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Rebaler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rebaler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rebaler
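The job script above does not include an assembly command. A hedged sketch of one that could be appended is shown here. The file names are hypothetical, and the output is redirected because rebaler prints the assembly to stdout.

```shell
# Hypothetical file names; replace with your own data.
reference="reference.fasta"    # reference genome guiding the assembly
reads="long_reads.fastq.gz"    # long reads to assemble
assembly="rebaler_assembly.fasta"
threads="${SLURM_NTASKS:-1}"   # follow the -n value of the job; 1 outside Slurm

echo "rebaler -t $threads $reference $reads > $assembly"
if command -v rebaler >/dev/null 2>&1 && [ -f "$reference" ] && [ -f "$reads" ]; then
    rebaler -t "$threads" "$reference" "$reads" > "$assembly"
fi
```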
Reciprocal Smallest Distance
Introduction
The reciprocal smallest distance
(RSD) algorithm accurately infers orthologs between pairs of genomes by considering global sequence alignment and maximum likelihood evolutionary distance between sequences.
Versions
1.1.7
Commands
rsd_search
rsd_blast
rsd_format
Module
You can load the modules by:
module load biocontainers
module load reciprocal_smallest_distance
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Reciprocal Smallest Distance on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=reciprocal_smallest_distance
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers reciprocal_smallest_distance
rsd_search \
-q Mycoplasma_genitalium.aa \
--subject-genome=Mycobacterium_leprae.aa \
-o Mycoplasma_genitalium.aa_Mycobacterium_leprae.aa_0.8_1e-5.orthologs.txt
rsd_format -g Mycoplasma_genitalium.aa
rsd_blast -v -q Mycoplasma_genitalium.aa \
--subject-genome=Mycobacterium_leprae.aa \
--forward-hits q_s.hits --reverse-hits s_q.hits \
--no-format --evalue 0.1
Recycler
Introduction
Recycler
is a tool designed for extracting circular sequences from de novo assembly graphs.
Versions
0.7
Commands
make_fasta_from_fastg.py
get_simple_cycs.py
recycle.py
Module
You can load the modules by:
module load biocontainers
module load recycler
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Recycler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=recycler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers recycler
recycle.py -g test/assembly_graph.fastg \
-k 55 -b test/test.sort.bam -i True
Regtools
Introduction
Regtools are tools that integrate DNA-seq and RNA-seq data to help interpret mutations in a regulatory and splicing context.
Versions
1.0.0
Commands
regtools
Module
You can load the modules by:
module load biocontainers
module load regtools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run regtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=regtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers regtools
RepeatMasker
Introduction
RepeatMasker
is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.
Detailed usage can be found here: http://www.repeatmasker.org.
Versions
4.1.2
Commands
RepeatMasker
Database
Note
As of May 20, 2019 GIRI has rescinded the working agreement allowing the www.repeatmasker.org website to offer a repeatmasking service utilizing the RepBase RepeatMasker Edition library. As a result, repeatmasker can only offer masking using the open database Dfam, which starting in 3.0 includes consensus sequences in addition to profile hidden Markov models for many transposable element families. Users requiring RepBase will need to purchase a commercial or academic license from GIRI and run RepeatMasker locally.
On our clusters, we have set up Dfam release 3.5 (October 2021), which includes 285,580 repetitive DNA families.
Species name
Note
Since v4.1.1, RepeatMasker has switched to the FamDB format for the Dfam database. Due to this change, RepeatMasker has become stricter about what is acceptable for the -species
flag. Commonly used names such as “mammal” and “mouse” will not be accepted. To check for valid names, you can query the database using the Python script famdb.py
(https://github.com/Dfam-consortium/FamDB).
See famdb.py --help
for usage information, and see below for an example that checks the valid names for “mammal” using our copy of the Dfam database:
/depot/itap/datasets/Maker/RepeatMasker/Libraries/famdb.py -i /depot/itap/datasets/Maker/RepeatMasker/Libraries/Dfam.h5 names mammal
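The same lookup can be repeated for several candidate names before choosing a value for -species. The loop below is a small sketch that reuses the famdb.py and Dfam.h5 paths from the command above. The list of candidate names is only an illustration, and the query is skipped on systems where those paths do not exist.

```shell
famdb="/depot/itap/datasets/Maker/RepeatMasker/Libraries/famdb.py"
dfam="/depot/itap/datasets/Maker/RepeatMasker/Libraries/Dfam.h5"
checked=0

# Illustrative candidate names; replace with the taxa you are interested in.
for name in mammal mouse human; do
    echo "== Names matching: $name =="
    if [ -x "$famdb" ] && [ -f "$dfam" ]; then
        "$famdb" -i "$dfam" names "$name"
    fi
    checked=$((checked + 1))
done
```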
Module
You can load the modules by:
module load biocontainers
module load repeatmasker/4.1.2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RepeatMasker on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=repeatmasker
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers repeatmasker/4.1.2
RepeatMasker -pa 24 -species mammals genome.fasta
RepeatModeler
Introduction
RepeatModeler
is a de novo transposable element (TE) family identification and modeling package.
Versions
2.0.2
2.0.3
Commands
RepeatModeler
BuildDatabase
RepeatClassifier
Module
You can load the modules by:
module load biocontainers
module load repeatmodeler
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RepeatModeler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=repeatmodeler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers repeatmodeler
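The job script above does not run any command. A typical run has two steps: build a sequence database with BuildDatabase, then run RepeatModeler on it. The sketch below could be appended to the script. The genome file and database name are hypothetical. Note that -pa sets the number of parallel search jobs, and each search job may use several cores, so the -n request may need to be larger than the -pa value.

```shell
# Hypothetical input and database names; replace with your own.
genome="genome.fasta"
db_name="genome_db"
parallel_jobs="${SLURM_NTASKS:-1}"   # 1 outside Slurm; see the note on cores above

if command -v BuildDatabase >/dev/null 2>&1 && command -v RepeatModeler >/dev/null 2>&1 && [ -f "$genome" ]; then
    # Step 1: create the database used by RepeatModeler.
    BuildDatabase -name "$db_name" "$genome"
    # Step 2: identify and model repeat families de novo.
    RepeatModeler -database "$db_name" -pa "$parallel_jobs"
else
    echo "RepeatModeler or $genome not found; nothing was run." >&2
fi
```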
RepeatScout
Introduction
RepeatScout
is a tool to discover repetitive substrings in DNA.
Versions
1.0.6
Commands
RepeatScout
build_lmer_table
compare-out-to-gff.prl
filter-stage-1.prl
filter-stage-2.prl
merge-lmer-tables.prl
Module
You can load the modules by:
module load biocontainers
module load repeatscout
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RepeatScout on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=repeatscout
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers repeatscout
build_lmer_table -l 14 -sequence genome.fasta -freq Final_assembly.freq
RepeatScout -sequence genome.fasta -output Final_assembly_repeats.fasta -freq Final_assembly.freq -l 14
Resfinder
Introduction
ResFinder identifies acquired antimicrobial resistance genes in total or partial sequenced isolates of bacteria.
Versions
4.1.5
Commands
run_resfinder.py
run_batch_resfinder.py
Module
You can load the modules by:
module load biocontainers
module load resfinder
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run resfinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=resfinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers resfinder
run_resfinder.py -o output -db_res db_resfinder/ \
-db_res_kma db_resfinder/kma_indexing -db_point db_pointfinder/ \
-s "Escherichia coli" --acquired --point -ifq data/test_isolate_01_*
Revbayes
Introduction
RevBayes – Bayesian phylogenetic inference using probabilistic graphical models and an interactive language.
Versions
1.1.1
Commands
rb
rb-mpi
Module
You can load the modules by:
module load biocontainers
module load revbayes
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run revbayes on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=revbayes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers revbayes
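The job script above only loads the module. In batch mode, rb is usually given a Rev script to execute. A minimal sketch that could be appended follows; analysis.Rev is a hypothetical file name.

```shell
# Hypothetical Rev script; replace with your own analysis file.
rev_script="analysis.Rev"

if command -v rb >/dev/null 2>&1 && [ -f "$rev_script" ]; then
    # Run the Rev script non-interactively.
    rb "$rev_script"
else
    echo "rb or $rev_script not found; nothing was run." >&2
fi
```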
rMATS
Introduction
MATS
is a computational tool to detect differential alternative splicing events from RNA-Seq data. The statistical model of MATS calculates the P-value and false discovery rate that the difference in the isoform ratio of a gene between two conditions exceeds a given user-defined threshold. From the RNA-Seq data, MATS can automatically detect and analyze alternative splicing events corresponding to all major types of alternative splicing patterns. MATS handles replicate RNA-Seq data from both paired and unpaired study designs.
Detailed usage can be found here: http://rnaseq-mats.sourceforge.net
Versions
4.1.1
Commands
rmats.py
Module
You can load the modules by:
module load biocontainers
module load rmats
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rmats on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=rmats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rmats
rmats.py --b1 SR_b1.txt --b2 SR_b2.txt --gtf Homo_sapiens.GRCh38.105.gtf --od rmats_out_homo --tmp rmats_tmp -t paired --nthread 10 --readLength 150
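The files given to --b1 and --b2 are plain text files, each holding a comma-separated list of the BAM files for one sample group. The sketch below shows one way to create them before calling rmats.py. The BAM file names are hypothetical placeholders.

```shell
# Hypothetical BAM file names for the two sample groups.
# paste -s joins the lines into one; -d, uses a comma as the separator.
printf '%s\n' ctrl_rep1.bam ctrl_rep2.bam ctrl_rep3.bam | paste -sd, - > SR_b1.txt
printf '%s\n' treat_rep1.bam treat_rep2.bam treat_rep3.bam | paste -sd, - > SR_b2.txt

# Each file now contains a single comma-separated line.
cat SR_b1.txt SR_b2.txt
```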
rmats2sashimiplot
Introduction
rmats2sashimiplot
produces a sashimiplot visualization of rMATS output. rmats2sashimiplot can also produce plots using an annotation file and genomic coordinates. The plotting backend is MISO.
Detailed usage can be found here: https://github.com/Xinglab/rmats2sashimiplot
Versions
2.0.4
Commands
rmats2sashimiplot
Module
You can load the modules by:
module load biocontainers
module load rmats2sashimiplot
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rmats2sashimiplot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=rmats2sashimiplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rmats2sashimiplot
rmats2sashimiplot --s1 sample_1_replicate_1.sam,sample_1_replicate_2.sam,sample_1_replicate_3.sam \
--s2 sample_2_replicate_1.sam,sample_2_replicate_2.sam,sample_2_replicate_3.sam \
-t SE -e SE.MATS.JC.txt --l1 SampleOne --l2 SampleTwo --exon_s 1 --intron_s 5 \
-o test_events_output
RNAIndel
Introduction
RNAIndel
calls coding indels from tumor RNA-Seq data and classifies them as somatic, germline, and artifactual. RNAIndel supports GRCh38 and 37.
Versions
3.0.9
Commands
rnaindel
Module
You can load the modules by:
module load biocontainers
module load rnaindel
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RNAIndel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rnaindel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rnaindel
RNApeg
Introduction
RNApeg
is an RNA junction calling, correction, and quality-control package. RNAIndel supports GRCh38 and 37.
Versions
2.7.1
Commands
RNApeg.sh
Module
You can load the modules by:
module load biocontainers
module load rnapeg
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RNApeg on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rnapeg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rnapeg
Rnaquast
Introduction
Rnaquast
is a quality assessment tool for de novo transcriptome assemblies.
Versions
2.2.1
Commands
rnaQUAST.py
Dependencies de novo quality assessment and read alignment
Note
When a reference genome and gene database are unavailable, users can also use BUSCO
and GeneMarkS-T
in the rnaQUAST pipeline. Since GeneMarkS-T
requires a license key, users may need to download their own key and put it in their $HOME.
rnaQUAST is also capable of calculating various statistics using raw reads (e.g. database coverage by reads). To use this feature, you will need to use STAR
in the pipeline.
BUSCO
, GeneMarkS-T
, and STAR
have been installed, and the directories of their executables have been added to $PATH. Users do not need to load these modules. The only module required is rnaquast
itself.
Module
You can load the modules by:
module load biocontainers
module load rnaquast
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Rnaquast on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=rnaquast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rnaquast
rnaQUAST.py -t 12 -o output \
--transcripts Trinity.fasta idba.fasta \
--reference Saccharomyces_cerevisiae.R64-1-1.75.dna.toplevel.fa \
--gtf Saccharomyces_cerevisiae.R64-1-1.75.gtf
rnaQUAST.py -t 12 -o output2 \
--reference reference.fasta \
--transcripts transcripts.fasta \
--left_reads left.fastq \
--right_reads right.fastq \
--busco fungi_odb10
Roary
Introduction
Roary is a high-speed, standalone pan-genome pipeline that takes annotated assemblies in GFF3 format (produced by Prokka) and calculates the pan-genome.
Versions
3.13.0
Commands
roary
Module
You can load the modules by:
module load biocontainers
module load roary
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run roary on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=roary
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers roary
roary -f demo -e -n -v gff/*.gff
r-rnaseq
Introduction
r-rnaseq
is a customized R module based on R/4.1.1
used for RNAseq analysis.
In the module, we have some packages installed:
BiocManager 1.30.16
ComplexHeatmap 2.9.4
DESeq2 1.34.0
edgeR 3.36.0
pheatmap 1.0.12
limma 3.48.3
tibble 3.1.5
tidyr 1.1.4
readr 2.0.2
readxl 1.3.1
purrr 0.3.4
dplyr 1.0.7
stringr 1.4.0
forcats 0.5.1
ggplot2 3.3.5
openxlsx 4.2.5
Versions
4.1.1-1
4.1.1-1-rstudio
Commands
R
Rscript
rstudio (only for the rstudio version)
Module
You can load the modules by:
module load biocontainers
module load r-rnaseq/4.1.1-1
# If you want to use Rstudio, load the rstudio version
module load r-rnaseq/4.1.1-1-rstudio
Install packages
Note
Users can also install the packages they need. The installation location depends on the settings in your ~/.Rprofile
.
Detailed guide about installing R packages can be found here: https://www.rcac.purdue.edu/knowledge/bell/run/examples/apps/r/package.
Interactive job
To run interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers r-rnaseq/4.1.1-1 # or r-rnaseq/4.1.1-1-rstudio
(base) UserID@bell-a008:~ $ R
R version 4.1.1 (2021-08-10) -- "Kick Things"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(edgeR)
> library(pheatmap)
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit an sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=r_RNAseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r-rnaseq
Rscript RNAseq.R
RStudio
Introduction
RStudio
is an integrated development environment (IDE) for the R statistical computation and graphics system.
This is an RStudio IDE together with a plain R-base installation (see https://github.com/rocker-org/rocker/), repackaged by RCAC with the addition of a handful of prerequisite libraries (libcurl, libopenssl, libxml2, libcairo2 and libXt) and their header files. It is intentionally separate from the biocontainers’ ‘r’ module for reasons of image size (700MB vs 360MB).
Versions
4.1.1
Commands
R
Rscript
rstudio
Module
You can load the modules by:
module load biocontainers
module load r-studio
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RStudio on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r-studio
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r-studio
r-scrnaseq
Introduction
r-scrnaseq
is a customized R module based on R/4.1.1
or R/4.2.0
used for scRNAseq analysis.
In the module, we have some packages installed:
BiocManager 1.30.16
CellChat 1.6.1
ProjecTILs 3.0
Seurat 4.1.0
SeuratObject 4.0.4
SeuratWrappers 0.3.0
monocle3 1.0.0
SnapATAC 1.0.0
SingleCellExperiment 1.14.1, 1.16.0
scDblFinder 1.8.0
SingleR 1.8.1
scCATCH 3.0
scMappR 1.0.7
rliger 1.0.0
schex 1.8.0
CoGAPS 3.14.0
celldex 1.4.0
dittoSeq 1.6.0
DropletUtils 1.14.2
miQC 1.2.0
Nebulosa 1.4.0
tricycle 1.2.0
pheatmap 1.0.12
limma 3.48.3, 3.50.0
tibble 3.1.5
tidyr 1.1.4
readr 2.0.2
readxl 1.3.1
purrr 0.3.4
dplyr 1.0.7
stringr 1.4.0
forcats 0.5.1
ggplot2 3.3.5
openxlsx 4.2.5
Versions
4.1.1-1
4.1.1-1-rstudio
4.2.0
4.2.0-rstudio
4.2.3-rstudio
Commands
R
Rscript
rstudio (only for the rstudio version)
Module
You can load the modules by:
module load biocontainers
module load r-scrnaseq
# or module load r-scrnaseq/4.2.0
# If you want to use Rstudio, load the rstudio version
module load r-scrnaseq/4.1.1-1-rstudio
# or module load r-scrnaseq/4.2.0-rstudio
Install packages
Note
Users can also install the packages they need. The installation location depends on the settings in your ~/.Rprofile
.
Detailed guide about installing R packages can be found here: https://www.rcac.purdue.edu/knowledge/bell/run/examples/apps/r/package.
Interactive job
To run interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers r-scrnaseq/4.2.0 # or r-scrnaseq/4.2.0-rstudio
(base) UserID@bell-a008:~ $ R
R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(Seurat)
> library(monocle3)
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit an sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=r_scRNAseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r-scrnaseq
Rscript scRNAseq.R
RSEM
Introduction
RSEM
is a software package for estimating gene and isoform expression levels from RNA-Seq data. Further information can be found here: https://deweylab.github.io/RSEM/.
Versions
1.3.3
Commands
rsem-bam2readdepth
rsem-bam2wig
rsem-build-read-index
rsem-calculate-credibility-intervals
rsem-calculate-expression
rsem-control-fdr
rsem-extract-reference-transcripts
rsem-generate-data-matrix
rsem-generate-ngvector
rsem-gen-transcript-plots
rsem-get-unique
rsem-gff3-to-gtf
rsem-parse-alignments
rsem-plot-model
rsem-plot-transcript-wiggles
rsem-prepare-reference
rsem-preref
rsem-refseq-extract-primary-assembly
rsem-run-ebseq
rsem-run-em
rsem-run-gibbs
rsem-run-prsem-testing-procedure
rsem-sam-validator
rsem-scan-for-paired-end-reads
rsem-simulate-reads
rsem-synthesis-reference-transcripts
rsem-tbam2gbam
Dependencies
STAR v2.7.9a
, Bowtie v1.2.3
, Bowtie2 v2.3.5.1
, HISAT2 v2.2.1
are included in the container image, so users do not need to provide the dependency paths in the RSEM parameters.
Module
You can load the modules by:
module load biocontainers
module load rsem/1.3.3
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RSEM on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=rsem
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rsem/1.3.3
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --bowtie Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_bowtie -p 24
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --bowtie2 Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_bowtie2 -p 24
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --hisat2-hca Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_hisat2 -p 24
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --star Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_star -p 24
rsem-calculate-expression --paired-end --star -p 24 SRR12095148_1.fastq SRR12095148_2.fastq Gh38_star SRR12095148_rsem_expression
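The job above hard-codes `-p 24` to match `#SBATCH -n 24`. A possible variation reads the task count from `SLURM_NTASKS`, which Slurm sets for jobs submitted with `-n`, so the two values cannot drift apart. This is a minimal sketch: the `THREADS` variable name and the fallback to 1 outside a Slurm job are assumptions, and the command is printed rather than executed.

```shell
#!/bin/bash
# Use the task count granted by Slurm; fall back to 1 outside a Slurm job (an assumption).
THREADS="${SLURM_NTASKS:-1}"

# Print the command rather than running it, so the sketch works without RSEM loaded.
echo rsem-calculate-expression --paired-end --star -p "$THREADS" \
    SRR12095148_1.fastq SRR12095148_2.fastq Gh38_star SRR12095148_rsem_expression
```

In a real job script, removing the leading `echo` would run the command with the thread count requested from the scheduler.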
Rseqc
Introduction
Rseqc
is a package that provides a number of useful modules for comprehensively evaluating high-throughput sequence data, especially RNA-seq data.
Versions
4.0.0
Commands
FPKM-UQ.py
FPKM_count.py
RNA_fragment_size.py
RPKM_saturation.py
aggregate_scores_in_intervals.py
align_print_template.py
axt_extract_ranges.py
axt_to_fasta.py
axt_to_lav.py
axt_to_maf.py
bam2fq.py
bam2wig.py
bam_stat.py
bed_bigwig_profile.py
bed_build_windows.py
bed_complement.py
bed_count_by_interval.py
bed_count_overlapping.py
bed_coverage.py
bed_coverage_by_interval.py
bed_diff_basewise_summary.py
bed_extend_to.py
bed_intersect.py
bed_intersect_basewise.py
bed_merge_overlapping.py
bed_rand_intersect.py
bed_subtract_basewise.py
bnMapper.py
clipping_profile.py
deletion_profile.py
div_snp_table_chr.py
divide_bam.py
find_in_sorted_file.py
geneBody_coverage.py
geneBody_coverage2.py
gene_fourfold_sites.py
get_scores_in_intervals.py
infer_experiment.py
inner_distance.py
insertion_profile.py
int_seqs_to_char_strings.py
interval_count_intersections.py
interval_join.py
junction_annotation.py
junction_saturation.py
lav_to_axt.py
lav_to_maf.py
line_select.py
lzop_build_offset_table.py
mMK_bitset.py
maf_build_index.py
maf_chop.py
maf_chunk.py
maf_col_counts.py
maf_col_counts_all.py
maf_count.py
maf_covered_ranges.py
maf_covered_regions.py
maf_div_sites.py
maf_drop_overlapping.py
maf_extract_chrom_ranges.py
maf_extract_ranges.py
maf_extract_ranges_indexed.py
maf_filter.py
maf_filter_max_wc.py
maf_gap_frequency.py
maf_gc_content.py
maf_interval_alignibility.py
maf_limit_to_species.py
maf_mapping_word_frequency.py
maf_mask_cpg.py
maf_mean_length_ungapped_piece.py
maf_percent_columns_matching.py
maf_percent_identity.py
maf_print_chroms.py
maf_print_scores.py
maf_randomize.py
maf_region_coverage_by_src.py
maf_select.py
maf_shuffle_columns.py
maf_species_in_all_files.py
maf_split_by_src.py
maf_thread_for_species.py
maf_tile.py
maf_tile_2.py
maf_tile_2bit.py
maf_to_axt.py
maf_to_concat_fasta.py
maf_to_fasta.py
maf_to_int_seqs.py
maf_translate_chars.py
maf_truncate.py
maf_word_frequency.py
mask_quality.py
mismatch_profile.py
nib_chrom_intervals_to_fasta.py
nib_intervals_to_fasta.py
nib_length.py
normalize_bigwig.py
one_field_per_line.py
out_to_chain.py
overlay_bigwig.py
prefix_lines.py
pretty_table.py
qv_to_bqv.py
random_lines.py
read_GC.py
read_NVC.py
read_distribution.py
read_duplication.py
read_hexamer.py
read_quality.py
split_bam.py
split_paired_bam.py
table_add_column.py
table_filter.py
tfloc_summary.py
tin.py
ucsc_gene_table_to_intervals.py
wiggle_to_array_tree.py
wiggle_to_binned_array.py
wiggle_to_chr_binned_array.py
wiggle_to_simple.py
Module
You can load the modules by:
module load biocontainers
module load rseqc
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Rseqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rseqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rseqc
bam_stat.py -i *.bam -q 30
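The `-i` option of `bam_stat.py` takes a single alignment file, so a `*.bam` pattern that matches several files is unlikely to process all of them. A minimal sketch that runs the command once per file is shown below. The function name `bam_stat_all`, the `BAM_STAT_CMD` override, and the output file naming are hypothetical, and the sketch assumes that `bam_stat.py` writes its summary to standard output.

```shell
#!/bin/bash
# bam_stat_all: hypothetical helper that runs bam_stat.py once per BAM file in a directory.
# BAM_STAT_CMD is an assumed override so the loop can be tried without RSeQC loaded.
bam_stat_all() {
    dir="${1:-.}"
    for bam in "$dir"/*.bam; do
        # Skip the unexpanded pattern when no BAM files are present.
        [ -e "$bam" ] || continue
        # One report per input, named after the BAM file.
        "${BAM_STAT_CMD:-bam_stat.py}" -i "$bam" -q 30 > "${bam%.bam}.bam_stat.txt"
    done
}
```

Inside a job script, calling `bam_stat_all .` after `ml biocontainers rseqc` would write one `.bam_stat.txt` report next to each BAM file in the current directory.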
run-dbCAN
Introduction
run_dbCAN
uses genomes, metagenomes, or proteomes of any assembled organism (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes. It is the standalone version of http://bcb.unl.edu/dbCAN2/. Details about its usage can be found in its GitHub repository.
Versions
3.0.2
3.0.6
Commands
run_dbcan
Database
The latest version of the database has been downloaded and set up, including CAZyDB.09242021.fa, dbCAN-HMMdb-V10.txt, tcdb.fa, tf-1.hmm, tf-2.hmm, and stp.hmm.
Module
You can load the modules by:
module load biocontainers
module load run_dbcan/3.0.2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run run_dbcan on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=run_dbcan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers run_dbcan/3.0.2
run_dbcan protein.faa protein --out_dir test1_dbcan
run_dbcan genome.fasta prok --out_dir test2_dbcan
rush
Introduction
rush
is a tool similar to GNU parallel and gargs. rush borrows some ideas from them and has some unique features, e.g., support for custom-defined variables, resuming multi-line commands, and more advanced embedded replacement strings.
Versions
0.4.2
Commands
rush
Module
You can load the modules by:
module load biocontainers
module load rush
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rush on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rush
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rush
Sage
Introduction
Sage is a proteomics search engine: a tool that transforms raw mass spectra from proteomics experiments into peptide identifications via database searching and spectral matching. It is also more than a search engine: Sage includes a variety of advanced features that make it a one-stop shop, including retention time prediction, quantification (both isobaric and LFQ), peptide-spectrum match rescoring, and FDR control.
Versions
0.8.1
Commands
sage
Module
You can load the modules by:
module load biocontainers
module load sage
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run sage on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sage
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sage
Salmon
Introduction
Salmon
is a wicked-fast program that produces highly accurate, transcript-level quantification estimates from RNA-seq data.
Detailed usage can be found here: https://github.com/COMBINE-lab/salmon
Versions
1.10.1
1.5.2
1.6.0
1.7.0
1.8.0
1.9.0
Commands
salmon index
salmon quant
salmon alevin
salmon swim
salmon quantmerge
Module
You can load the modules by:
module load biocontainers
module load salmon
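Loading `salmon` without a version leaves the choice to the module system's default. Because the version list above is sorted as text, 1.10.1 appears first even though it is the newest release. The sketch below uses GNU `sort -V` to order the versions numerically; the version numbers are copied from the list above and the `latest` variable name is hypothetical.

```shell
#!/bin/bash
# Versions copied from the list above; sort -V (GNU coreutils) orders them numerically.
latest=$(printf '%s\n' 1.10.1 1.5.2 1.6.0 1.7.0 1.8.0 1.9.0 | sort -V | tail -n 1)

# Print the pinned module name instead of loading it, so this runs anywhere.
echo "salmon/${latest}"   # prints salmon/1.10.1
```

Pinning the version in a job script, for example `ml biocontainers salmon/1.10.1`, keeps results tied to one release even if the default changes later.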
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Salmon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=salmon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers salmon
salmon index -t Homo_sapiens.GRCh38.cds.all.fa -i salmon_index
salmon quant -i salmon_index -l A -p 24 -1 SRR16956239_1.fastq -2 SRR16956239_2.fastq --validateMappings -o transcripts_quan
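To quantify several samples against the same index, the mate file and output directory can be derived from the name of the first read file. This is a sketch under stated assumptions: paired files are named `<sample>_1.fastq` and `<sample>_2.fastq` as in the job above, the function name `salmon_cmd` and the `<sample>_quant` output naming are hypothetical, and the command is printed rather than executed.

```shell
#!/bin/bash
# salmon_cmd: hypothetical helper that prints the salmon quant command for one sample.
# Assumes paired files are named <sample>_1.fastq and <sample>_2.fastq.
salmon_cmd() {
    r1="$1"
    sample="${r1%_1.fastq}"      # strip the mate suffix to get the sample name
    r2="${sample}_2.fastq"
    echo salmon quant -i salmon_index -l A -p 24 \
        -1 "$r1" -2 "$r2" --validateMappings -o "${sample}_quant"
}

salmon_cmd SRR16956239_1.fastq
```

Wrapping the call in `for r1 in *_1.fastq; do salmon_cmd "$r1"; done` would print one command per sample for review before running them.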
Sambamba
Introduction
Sambamba
is a high-performance, highly parallel, robust, and fast tool (and library), written in the D programming language, for working with SAM and BAM files.
Versions
0.8.2
Commands
sambamba
Module
You can load the modules by:
module load biocontainers
module load sambamba
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sambamba on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sambamba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sambamba
sambamba view --reference-info input.bam
sambamba view -c -F "mapping_quality >= 40" input.bam
Samblaster
Introduction
Samblaster
is a tool to mark duplicates and extract discordant and split reads from SAM files.
Versions
0.1.26
Commands
samblaster
Module
You can load the modules by:
module load biocontainers
module load samblaster
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Samblaster on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samblaster
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers samblaster
Samclip
Introduction
Samclip is a tool to filter SAM files for soft- and hard-clipped alignments.
Versions
0.4.0
Commands
samclip
Module
You can load the modules by:
module load biocontainers
module load samclip
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run samclip on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samclip
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers samclip
samclip --ref test.fna < test.sam > out.sam
Samplot
Introduction
Samplot
is a command line tool for rapid, multi-sample structural variant visualization.
Versions
1.3.0
Commands
samplot
Module
You can load the modules by:
module load biocontainers
module load samplot
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Samplot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers samplot
samplot plot \
-n NA12878 NA12889 NA12890 \
-b samplot/test/data/NA12878_restricted.bam \
samplot/test/data/NA12889_restricted.bam \
samplot/test/data/NA12890_restricted.bam \
-o 4_115928726_115931880.png \
-c chr4 \
-s 115928726 \
-e 115931880 \
-t DEL
Samtools
Introduction
Samtools
is a set of utilities for the Sequence Alignment/Map (SAM) format.
Versions
1.15
1.16
1.17
1.9
Commands
samtools
ace2sam
htsfile
maq2sam-long
maq2sam-short
tabix
wgsim
Module
You can load the modules by:
module load biocontainers
module load samtools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Samtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers samtools
Scanpy
Introduction
Scanpy
is a scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference, and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells. Details about its usage can be found here (https://scanpy.readthedocs.io/en/stable/).
Versions
1.8.2
1.9.1
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load scanpy/1.8.2
Interactive job
To run scanpy interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers scanpy/1.8.2
(base) UserID@bell-a008:~ $ python
Python 3.9.5 (default, Jun 4 2021, 12:28:51)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scanpy as sc
>>> sc.tl.umap(adata, **tool_params)
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit a batch job with sbatch on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=scanpy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scanpy/1.8.2
python script.py
Scarches
Introduction
scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases.
Versions
0.5.3
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load scarches
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run scarches on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scarches
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scarches
Scgen
Introduction
scGen is a generative model to predict single-cell perturbation response across cell types, studies and species.
Versions
2.1.0
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load scgen
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run scgen on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scgen
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scgen
Scirpy
Introduction
Scirpy is a scalable python-toolkit to analyse T cell receptor (TCR) or B cell receptor (BCR) repertoires from single-cell RNA sequencing (scRNA-seq) data. It seamlessly integrates with the popular scanpy library and provides various modules for data import, analysis and visualization.
Versions
0.10.1
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load scirpy
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run scirpy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scirpy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scirpy
scVelo
Introduction
scVelo
is a scalable toolkit for RNA velocity analysis in single cells, based on https://doi.org/10.1038/s41587-020-0591-3. Its detailed usage can be found here: https://scvelo.readthedocs.io.
Versions
0.2.4
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load scvelo/0.2.4
Interactive job
To run scVelo interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers scvelo/0.2.4
(base) UserID@bell-a008:~ $ python
Python 3.9.5 (default, Jun 4 2021, 12:28:51)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scvelo as scv
>>> scv.set_figure_params()
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit a batch job with sbatch on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=scvelo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scvelo/0.2.4
python script.py
Scvi-tools
Introduction
scvi-tools (single-cell variational inference tools) is a package for end-to-end analysis of single-cell omics data primarily developed and maintained by the Yosef Lab at UC Berkeley.
Versions
0.16.2
Commands
python
python3
R
Rscript
Module
You can load the modules by:
module load biocontainers
module load scvi-tools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run scvi-tools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scvi-tools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scvi-tools
Segalign
Introduction
Segalign is a scalable GPU system for pairwise whole genome alignments based on LASTZ’s seed-filter-extend paradigm.
Versions
0.1.2
Commands
faToTwoBit
run_segalign
run_segalign_repeat_masker
segalign
segalign_repeat_masker
twoBitToFa
Module
You can load the modules by:
module load biocontainers
module load segalign
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run segalign on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=segalign
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers segalign
Seidr
Introduction
Seidr
is a community gene network inference and exploration toolkit.
Versions
0.14.2
Commands
correlation
seidr
mi
pcor
narromi
plsnet
llr-ensemble
svm-ensemble
genie3
tigress
el-ensemble
makeconv
genrb
gencfu
gencnval
gendict
tomsimilarity
Module
You can load the modules by:
module load biocontainers
module load seidr
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Seidr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seidr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers seidr
Sepp
Introduction
Sepp
stands for SATé-Enabled Phylogenetic Placement and addresses the problem of phylogenetic placement for meta-genomic short reads.
Versions
4.5.1
Commands
run_sepp.py
run_upp.py
split_sequences.py
sumlabels.py
sumtrees.py
Module
You can load the modules by:
module load biocontainers
module load sepp
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sepp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sepp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sepp
run_sepp.py -t mock/rpsS/sate.tre \
-r mock/rpsS/sate.tre.RAxML_info \
-a mock/rpsS/sate.fasta \
-f mock/rpsS/rpsS.even.fas \
-o rpsS.out.default
Seqcode
Introduction
SeqCode is a family of applications designed to develop high-quality images and perform genome-wide calculations from high-throughput sequencing experiments. The software is available in two distinct modes: web tools and command line. The SeqCode website offers most functions to users with no previous expertise in bioinformatics, including operations on a selection of published ChIP-seq samples and applications that generate multiple classes of graphics from the user's data files. In contrast, the standalone version of SeqCode allows bioinformaticians to run each command on any type of sequencing data locally on their own computer. The architecture of the source code is modular, and the input/output interface of the commands is suitable for integration into existing genome analysis pipelines. SeqCode is written in ANSI C, which favors compatibility across UNIX platforms and provides high performance when analyzing sequencing data. Meta-plots, heatmaps, boxplots, and the other images produced by SeqCode are generated internally using R. SeqCode relies on the RefSeq reference annotations and can handle the genome and assembly release of every organism available from this consortium.
Versions
1.0
Commands
buildChIPprofile
combineChIPprofiles
combineTSSmaps
combineTSSplots
computemaxsignal
findPeaks
genomeDistribution
matchpeaks
matchpeaksgenes
processmacs
produceGENEmaps
produceGENEplots
producePEAKmaps
producePEAKplots
produceTESmaps
produceTESplots
produceTSSmaps
produceTSSplots
recoverChIPlevels
scorePhastCons
Module
You can load the modules by:
module load biocontainers
module load seqcode
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run seqcode on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seqcode
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers seqcode
buildChIPprofile -vd ChromInfo.txt \
H3K4me3_sample.bam test_buildChIPprofile
Seqkit
Introduction
Seqkit
is a rapid tool for manipulating fasta and fastq files.
Versions
2.0.0
2.1.0
2.3.1
Commands
seqkit
Module
You can load the modules by:
module load biocontainers
module load seqkit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Seqkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seqkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers seqkit
seqkit stats contigs.fasta > contigs_statistics.txt
Seqyclean
Introduction
Seqyclean is used to pre-process NGS data in order to prepare for downstream analysis. For more information, please check: Docker hub: https://hub.docker.com/r/staphb/seqyclean Home page: https://github.com/ibest/seqyclean
Versions
1.10.09
Commands
seqyclean
Module
You can load the modules by:
module load biocontainers
module load seqyclean
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run seqyclean on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seqyclean
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers seqyclean
Shapeit4
Introduction
SHAPEIT4 is a fast and accurate method for estimation of haplotypes (aka phasing) for SNP array and high coverage sequencing data.
Versions
4.2.2
Commands
shapeit4
Module
You can load the modules by:
module load biocontainers
module load shapeit4
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shapeit4 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shapeit4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shapeit4
Shapeit5
Introduction
SHAPEIT5 is a software package to estimate haplotypes in large genotype datasets (WGS and SNP array).
Versions
5.1.1
Commands
phase_common
ligate
phase_rare
simulate
switch
xcftools
Module
You can load the modules by:
module load biocontainers
module load shapeit5
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shapeit5 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shapeit5
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shapeit5
Shasta
Introduction
Shasta is software for de novo assembly from Oxford Nanopore reads.
Versions
0.10.0
Commands
shasta
Module
You can load the modules by:
module load biocontainers
module load shasta
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shasta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shasta
shasta --input r94_ec_rad2.181119.60x-10kb.fasta \
--config Nanopore-May2022
Shigeifinder
Introduction
Shigeifinder is a tool used to differentiate Shigella/EIEC using cluster-specific genes and to identify the serotype using O-antigen/H-antigen genes. For more information, please check: Docker hub: https://hub.docker.com/r/staphb/shigeifinder Home page: https://github.com/LanLab/ShigEiFinder
Versions
1.3.2
Commands
shigeifinder
Module
You can load the modules by:
module load biocontainers
module load shigeifinder
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shigeifinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shigeifinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shigeifinder
Shorah
Introduction
Shorah
is an open source project for the analysis of next generation sequencing data.
Versions
1.99.2
Commands
shorah
b2w
diri_sampler
fil
Module
You can load the modules by:
module load biocontainers
module load shorah
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Shorah on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shorah
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shorah
shorah amplicon -b ampli_sorted.bam -f reference.fasta
shorah shotgun -b test_aln.cram -f test_ref.fasta
shorah shotgun -a 0.1 -w 42 -x 100000 -p 0.9 -c 0 -r REF:42-272 -R 42 -b test_aln.cram -f ref.fasta
Shortstack
Introduction
Shortstack
is a tool for comprehensive annotation and quantification of small RNA genes.
Versions
3.8.5
Commands
ShortStack
Module
You can load the modules by:
module load biocontainers
module load shortstack
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Shortstack on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shortstack
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shortstack
Shovill
Introduction
Shovill is a tool to assemble bacterial isolate genomes from Illumina paired-end reads.
Versions
1.1.0
Commands
shovill
Module
You can load the modules by:
module load biocontainers
module load shovill
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shovill on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shovill
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shovill
shovill --outdir out \
--R1 test/R1.fq.gz \
--R2 test/R2.fq.gz
Sicer
Introduction
Sicer
is a clustering approach for identification of enriched domains from histone modification ChIP-Seq data.
Versions
1.1
Commands
SICER-df-rb.sh
SICER-df.sh
SICER-rb.sh
SICER.sh
Module
You can load the modules by:
module load biocontainers
module load sicer
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sicer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sicer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sicer
SICER.sh ./ test.bed control.bed . hg18 1 200 150 0.74 600 .01
SICER-rb.sh ./ test.bed . hg18 1 200 150 0.74 400 100
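The long run of positional arguments to `SICER.sh` is easy to misread. The sketch below rebuilds the first command from named shell variables. The variable names are illustrative labels only, based on the argument order described in the SICER README (input directory, treatment BED file, control BED file, output directory, species, redundancy threshold, window size, fragment size, effective genome fraction, gap size, FDR); please verify that order against the README before relying on it.

```shell
#!/bin/bash
# Variable names are illustrative labels for SICER.sh's positional arguments;
# verify the order against the SICER README before relying on it.
input_dir=./
treatment=test.bed
control=control.bed
output_dir=.
species=hg18
redundancy_threshold=1
window_size=200        # bp
fragment_size=150      # bp
genome_fraction=0.74   # effective genome fraction
gap_size=600           # bp
fdr=.01

# Assemble the command as a string and print it for inspection.
sicer_cmd="SICER.sh $input_dir $treatment $control $output_dir $species \
$redundancy_threshold $window_size $fragment_size $genome_fraction $gap_size $fdr"
echo "$sicer_cmd"
```

The printed line matches the first command in the job above. `SICER-rb.sh` follows a similar pattern without a control file, with an E-value in place of the FDR as its last argument.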
Sicer2
Introduction
Sicer2
is a redesigned and improved version of the ChIP-seq broad peak calling tool SICER.
Versions
1.0.3
1.2.0
Commands
sicer
sicer_df
recognicer
recognicer_df
Module
You can load the modules by:
module load biocontainers
module load sicer2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sicer2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sicer2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sicer2
sicer_df -t ./test/treatment_1.bed ./test/treatment_2.bed \
-c ./test/control_1.bed ./test/control_2.bed \
-s hg38 --significant_reads
recognicer_df -t ./test/treatment_1.bed ./test/treatment_2.bed \
-c ./test/control_1.bed ./test/control_2.bed \
-s hg38 --significant_reads
SignalP
Introduction
SignalP
predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes.
Versions
4.1
Commands
signalp
Module
You can load the modules by:
module load biocontainers
module load signalp
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run SignalP on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=signalp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers signalp
signalp -t gram+ -f all proka.fasta > proka.out
signalp -t euk -f all euk.fasta > euk.out
Signalp6
Introduction
SignalP predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes.
Versions
6.0-fast
6.0-slow
Commands
signalp6
Module
You can load the modules by:
module load biocontainers
module load signalp6
Example job for fast mode
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run signalp6 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=signalp6-fast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers signalp6/6.0-fast
signalp6 --write_procs 24 --fastafile proteins_clean.fasta \
--organism euk --output_dir output_fast \
--format txt --mode fast
Example job for slow mode
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run signalp6 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 12:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=signalp6-slow
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers signalp6/6.0-slow
signalp6 --write_procs 24 --fastafile proteins_clean.fasta \
--organism euk --output_dir output_slow \
--format txt --mode slow
signalp6 --write_procs 24 --fastafile proteins_clean.fasta \
--organism euk --output_dir output_slow-sequential \
--format txt --mode slow-sequential
Simug
Introduction
Simug
is a general-purpose genome simulator.
Versions
1.0.0
Commands
simuG
vcf2model
Module
You can load the modules by:
module load biocontainers
module load simug
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Simug on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=simug
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers simug
Singlem
Introduction
SingleM is a tool for profiling shotgun metagenomes. It has a particular strength in detecting microbial lineages which are not in reference databases. The method it uses also makes it suitable for some related tasks, such as assessing eukaryotic contamination, finding bias in genome recovery, computing ecological diversity metrics, and lineage-targeted MAG recovery.
Versions
0.13.2
Commands
singlem
Module
You can load the modules by:
module load biocontainers
module load singlem
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run singlem on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=singlem
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers singlem
Ska
Introduction
SKA (Split Kmer Analysis) is a toolkit for prokaryotic (and any other small, haploid) DNA sequence analysis using split kmers. A split kmer is a pair of kmers in a DNA sequence that are separated by a single base. Split kmers allow rapid comparison and alignment of small genomes, and are particularly suited for surveillance or outbreak investigation. SKA can produce split kmer files from fasta format assemblies or directly from fastq format read sequences, cluster them, align them with or without a reference sequence, and provide various comparison and summary statistics. Currently all testing has been carried out on high-quality Illumina read data, so results for other platforms may vary.
Versions
1.0
Commands
ska
Module
You can load the modules by:
module load biocontainers
module load ska
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ska on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ska
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ska
Skewer
Introduction
Skewer
is a fast and accurate adapter trimmer for paired-end reads.
Versions
0.2.2
Commands
skewer
Module
You can load the modules by:
module load biocontainers
module load skewer
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Skewer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 10
#SBATCH --job-name=skewer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers skewer
skewer -l 50 -m pe -o skewerQ30 --mean-quality 30 \
--end-quality 30 -t 10 -x TruSeq3-PE.fa \
input_1.fastq input_2.fastq
Slamdunk
Introduction
Slamdunk is a novel, fully automated software tool for robust, scalable and reproducible SLAMseq data analysis.
Versions
0.4.3
Commands
slamdunk
alleyoop
Module
You can load the modules by:
module load biocontainers
module load slamdunk
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run slamdunk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=slamdunk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers slamdunk
Smoove
Introduction
Smoove
simplifies and speeds calling and genotyping SVs for short reads.
Versions
0.2.7
Commands
smoove
Module
You can load the modules by:
module load biocontainers
module load smoove
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Smoove on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=smoove
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers smoove
smoove call \
-x --name my-cohort \
--exclude hg38_blacklist.bed \
--fasta Homo_sapiens.GRCh38.dna.primary_assembly.fa \
-p 24 \
--genotype input_bams/*.bam
Snakemake
Introduction
Snakemake
is a workflow engine that provides a readable Python-based workflow definition language and a powerful execution environment that scales from single-core workstations to compute clusters without modifying the workflow.
Versions
6.8.0
Commands
snakemake
Module
You can load the modules by:
module load biocontainers
module load snakemake
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snakemake on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snakemake
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snakemake
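The job script above loads the module but does not run a workflow. As a hedged illustration, the lines below could follow the module lines; they write a two-rule Snakefile into a scratch directory and run it. The rule names, the hello.txt target and the scratch directory are all made up for this sketch.

```bash
#!/bin/bash
# Work in a scratch directory so the demo does not touch real data.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# A two-rule Snakefile: "all" requests hello.txt, "hello" creates it.
cat > Snakefile <<'EOF'
rule all:
    input:
        "hello.txt"

rule hello:
    output:
        "hello.txt"
    shell:
        "echo hello > {output}"
EOF

# Run the workflow only when snakemake is on PATH (i.e. after the module lines).
# SLURM_NTASKS is set by Slurm when -n is given; fall back to one core elsewhere.
if command -v snakemake >/dev/null 2>&1; then
    snakemake --cores "${SLURM_NTASKS:-1}"
else
    echo "snakemake not found; load the module first" >&2
fi
```

A real workflow would replace the generated Snakefile with your own and drop the scratch directory.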
Snap
Introduction
Snap
is a semi-HMM-based Nucleic Acid Parser – gene prediction tool.
Versions
2013_11_29
Commands
snap
Module
You can load the modules by:
module load biocontainers
module load snap
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snap
Snap-aligner
Introduction
Snap-aligner
(Scalable Nucleotide Alignment Program) is a fast and accurate read aligner for high-throughput sequencing data.
Versions
2.0.0
Commands
snap-aligner
Module
You can load the modules by:
module load biocontainers
module load snap-aligner
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snap-aligner on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snap-aligner
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snap-aligner
Snaptools
Introduction
Snaptools
is a Python module for pre-processing and working with snap files.
Versions
1.4.8
Commands
snaptools
Module
You can load the modules by:
module load biocontainers
module load snaptools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snaptools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snaptools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snaptools
Snippy
Introduction
Snippy
is a tool for rapid haploid variant calling and core genome alignment.
Versions
4.6.0
Commands
snippy
snippy-clean_full_aln
snippy-core
snippy-multi
snippy-vcf_extract_subs
snippy-vcf_report
snippy-vcf_to_tab
Module
You can load the modules by:
module load biocontainers
module load snippy
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snippy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snippy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snippy
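The job script above stops after loading the module. A possible continuation is sketched below: it checks that the inputs exist and then calls snippy on one paired-end sample. The file names (reference.gbk, reads_R1.fastq.gz, reads_R2.fastq.gz), the output directory and the count_missing helper are placeholders invented for this example.

```bash
#!/bin/bash
# count_missing (hypothetical helper): print how many of the given files
# are missing or empty, reporting each one on stderr.
count_missing() {
    local n=0 f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            n=$((n + 1))
        fi
    done
    echo "$n"
}

ref="reference.gbk"          # placeholder reference (GenBank or FASTA)
r1="reads_R1.fastq.gz"       # placeholder forward reads
r2="reads_R2.fastq.gz"       # placeholder reverse reads

# Only call snippy when it is on PATH and all inputs are present.
if command -v snippy >/dev/null 2>&1 && [ "$(count_missing "$ref" "$r1" "$r2")" -eq 0 ]; then
    snippy --cpus "${SLURM_NTASKS:-1}" --outdir snippy_out \
        --ref "$ref" --R1 "$r1" --R2 "$r2"
fi
```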
Snp-dists
Introduction
Snp-dists is a tool to convert a FASTA alignment to SNP distance matrix.
Versions
0.8.2
Commands
snp-dists
Module
You can load the modules by:
module load biocontainers
module load snp-dists
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run snp-dists on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snp-dists
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snp-dists
snp-dists test/good.aln > distances.tab
Snpeff
Introduction
Snpeff
is an open source tool that annotates variants and predicts their effects on genes by using an interval forest approach.
Versions
5.1d
5.1
Commands
snpEff
Module
You can load the modules by:
module load biocontainers
module load snpeff
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Note
By default, snpEff only uses 1 GB
of memory. To allocate more memory, add the -Xmx
flag to your command:
snpEff -Xmx10g ## To allocate 10 GB of memory.
To run Snpeff on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snpeff
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snpeff
snpEff GRCh37.75 examples/test.chr22.vcf > test.chr22.ann.vcf
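The -Xmx note above can be tied to the memory requested from Slurm instead of being hard-coded. The sketch below assumes the job requests memory with #SBATCH --mem, in which case Slurm is expected to export SLURM_MEM_PER_NODE in megabytes. The heap_flag helper and the 80% headroom factor are illustrative choices, not RCAC recommendations.

```bash
#!/bin/bash
# heap_flag (hypothetical helper): given a memory size in MB, print a Java
# -Xmx flag for 80% of it; with no argument, print the 1 GB default.
heap_flag() {
    local mem_mb="${1:-}"
    if [ -n "$mem_mb" ]; then
        echo "-Xmx$(( mem_mb * 8 / 10 ))m"
    else
        echo "-Xmx1g"
    fi
}

# Run the annotation only when snpEff is on PATH (after the module lines).
if command -v snpEff >/dev/null 2>&1; then
    snpEff "$(heap_flag "${SLURM_MEM_PER_NODE:-}")" GRCh37.75 \
        examples/test.chr22.vcf > test.chr22.ann.vcf
fi
```

Leaving part of the allocation unused by the Java heap gives the JVM room for its own overhead, which is why the sketch does not pass the full amount.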
Snpgenie
Introduction
Snpgenie
is a collection of Perl scripts for estimating πN/πS, dN/dS, and gene diversity from next-generation sequencing (NGS) single-nucleotide polymorphism (SNP) variant data.
Versions
1.0
Commands
fasta2revcom.pl
gtf2revcom.pl
snpgenie.pl
snpgenie_between_group.pl
snpgenie_between_group_processor.pl
snpgenie_within_group.pl
snpgenie_within_group_processor.pl
vcf2revcom.pl
Module
You can load the modules by:
module load biocontainers
module load snpgenie
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snpgenie on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snpgenie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snpgenie
snpgenie.pl --minfreq=0.01 --snpreport=CLC_SNP_EXAMPLE.txt \
--fastafile=REFERENCE_EXAMPLE.fasta --gtffile=CDS_EXAMPLE.gtf
Snphylo
Introduction
Snphylo is a pipeline to generate a phylogenetic tree from huge SNP data.
Versions
20180901
Commands
Rscript
snphylo.sh
convert_fasta_to_phylip.py
convert_simple_to_hapmap.py
determine_bs_tree.R
draw_unrooted_tree.R
generate_snp_sequence.R
remove_low_depth_genotype_data.py
remove_no_genotype_data.py
Module
You can load the modules by:
module load biocontainers
module load snphylo
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run snphylo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snphylo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snphylo
Snpsift
Introduction
Snpsift
is a tool used to annotate genomic variants using databases, and to filter and manipulate annotated genomic variants.
Versions
4.3.1t
Commands
SnpSift
Module
You can load the modules by:
module load biocontainers
module load snpsift
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snpsift on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snpsift
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snpsift
SnpSift annotate -id dbSnp132.vcf \
variants.vcf > variants_annotated.vcf
Snp-sites
Introduction
SNP-sites is a tool that rapidly extracts SNPs from a multi-FASTA alignment.
Versions
2.5.1
Commands
snp-sites
Module
You can load the modules by:
module load biocontainers
module load snp-sites
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run snp-sites on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snp-sites
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snp-sites
snp-sites salmonella_serovars_core_genes.aln
Soapdenovo2
Introduction
Soapdenovo2
is a short-read assembly method for building de novo draft assemblies.
Versions
2.40
Commands
SOAPdenovo-127mer
SOAPdenovo-63mer
Module
You can load the modules by:
module load biocontainers
module load soapdenovo2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Soapdenovo2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=soapdenovo2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers soapdenovo2
SOAPdenovo-127mer all -s config_file -K 63 -R -o graph_prefix 1>ass.log 2>ass.err
SortMeRNA
Introduction
SortMeRNA
is a local sequence alignment tool for filtering, mapping and clustering.
Versions
2.1b
4.3.4
Commands
sortmerna
Module
You can load the modules by:
module load biocontainers
module load sortmerna
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run SortMeRNA on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sortmerna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sortmerna
sortmerna --ref silva-bac-16s-id90.fasta,silva-bac-16s-db \
--reads set2_environmental_study_550_amplicon.fasta \
--fastx --aligned Test
Souporcell
Introduction
souporcell is a method for clustering mixed-genotype scRNAseq experiments by individual.
Versions
2.0
Commands
check_modules.py
compile_stan_model.py
consensus.py
renamer.py
retag.py
shared_samples.py
souporcell.py
souporcell_pipeline.py
Module
You can load the modules by:
module load biocontainers
module load souporcell
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run souporcell on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=souporcell
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers souporcell
souporcell_pipeline.py -i A.merged.bam \
-b GSM2560245_barcodes.tsv \
-f refdata-cellranger-GRCh38-3.0.0/fasta/genome.fa \
-t 8 -o demux_data_test -k 4
Sourmash
Introduction
Sourmash
is a tool to quickly search, compare, and analyze genomic and metagenomic data sets.
Versions
4.3.0
4.5.0
Commands
sourmash
Module
You can load the modules by:
module load biocontainers
module load sourmash
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sourmash on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sourmash
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sourmash
sourmash sketch dna -p k=31 *.fna.gz
sourmash compare *.sig -o cmp.dist
sourmash plot cmp.dist --labels
Spaceranger
Introduction
Spaceranger
is a set of analysis pipelines that process Visium Spatial Gene Expression data with brightfield and fluorescence microscope images.
Versions
1.3.0
1.3.1
2.0.0
Commands
spaceranger
Module
You can load the modules by:
module load biocontainers
module load spaceranger
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Spaceranger on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=spaceranger
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers spaceranger
# --id: output directory; --transcriptome: path to reference
# --fastqs: path to FASTQs; --sample: sample name from FASTQ filename
# --image: path to brightfield image; --slide: slide ID; --area: capture area
# --localcores: allowed cores in localmode; --localmem: allowed memory (GB) in localmode
spaceranger count --id=sample345 \
--transcriptome=/opt/refdata/GRCh38-2020-A \
--fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
--sample=mysample \
--image=/home/jdoe/runs/images/sample345.tiff \
--slide=V19J01-123 \
--area=A1 \
--localcores=8 \
--localmem=64
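Bash treats a backslash as a line continuation only when it is the last character on the line, so an option in a continued command cannot carry a trailing comment. One possible way to keep a comment next to each option, sketched here with the same values as the example above, is to collect the arguments in a Bash array.

```bash
#!/bin/bash
# Inside an array literal, a comment may follow every element.
args=(
    count
    --id=sample345                                      # Output directory
    --transcriptome=/opt/refdata/GRCh38-2020-A          # Path to reference
    --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path  # Path to FASTQs
    --sample=mysample                                   # Sample name from FASTQ filename
    --image=/home/jdoe/runs/images/sample345.tiff       # Path to brightfield image
    --slide=V19J01-123                                  # Slide ID
    --area=A1                                           # Capture area
    --localcores=8                                      # Allowed cores in localmode
    --localmem=64                                       # Allowed memory (GB) in localmode
)

# Expand the array as separate words; print the command when the tool is absent.
if command -v spaceranger >/dev/null 2>&1; then
    spaceranger "${args[@]}"
else
    printf 'would run: spaceranger %s\n' "${args[*]}"
fi
```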
SPAdes
Introduction
SPAdes
- St. Petersburg genome assembler - is an assembly toolkit containing various assembly pipelines.
Detailed usage can be found here: https://github.com/ablab/spades
Versions
3.15.3
3.15.4
3.15.5
Commands
coronaspades.py
metaplasmidspades.py
metaspades.py
metaviralspades.py
plasmidspades.py
rnaspades.py
rnaviralspades.py
spades.py
spades_init.py
truspades.py
spades-bwa
spades-convert-bin-to-fasta
spades-core
spades-corrector-core
spades-gbuilder
spades-gmapper
spades-gsimplifier
spades-hammer
spades-ionhammer
spades-kmer-estimating
spades-kmercount
spades-read-filter
spades-truseq-scfcorrection
Module
You can load the modules by:
module load biocontainers
module load spades
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run spades on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=spades
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers spades
spades.py --pe1-1 SRR11234553_1.fastq --pe1-2 SRR11234553_2.fastq -o spades_out -t 24
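In the example above the thread count appears twice, once in #SBATCH -n and once in -t, and the two can drift apart when the script is edited. A small sketch of one way to avoid this is to read the value from SLURM_NTASKS, which Slurm sets when -n is given. The fallback of 1 is an assumption for runs outside Slurm.

```bash
#!/bin/bash
# Use the task count granted by Slurm; default to 1 outside a job.
threads="${SLURM_NTASKS:-1}"

# Same SPAdes call as above, with -t following the allocation.
if command -v spades.py >/dev/null 2>&1; then
    spades.py --pe1-1 SRR11234553_1.fastq --pe1-2 SRR11234553_2.fastq \
        -o spades_out -t "$threads"
else
    echo "spades.py not found; would run with -t $threads" >&2
fi
```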
Sprod
Introduction
Sprod: De-noising Spatially Resolved Transcriptomics Data Based on Position and Image Information.
Versions
1.0
Commands
python
python3
sprod.py
Module
You can load the modules by:
module load biocontainers
module load sprod
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run sprod on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sprod
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sprod
python3 test_examples.py
Squeezemeta
Introduction
SqueezeMeta is a fully automated metagenomics pipeline, from reads to bins.
Versions
1.5.1
Commands
01.merge_assemblies.pl
01.merge_sequential.pl
01.remap.pl
01.run_assembly.pl
01.run_assembly_merged.pl
02.rnas.pl
03.run_prodigal.pl
04.rundiamond.pl
05.run_hmmer.pl
06.lca.pl
07.fun3assign.pl
08.blastx.pl
09.summarycontigs3.pl
10.mapsamples.pl
11.mcount.pl
12.funcover.pl
13.mergeannot2.pl
14.runbinning.pl
15.dastool.pl
16.addtax2.pl
17.checkM_batch.pl
18.getbins.pl
19.getcontigs.pl
20.minpath.pl
21.stats.pl
SqueezeMeta.pl
SqueezeMeta_conf.pl
SqueezeMeta_conf_original.pl
parameters.pl
restart.pl
add_database.pl
cover.pl
sqm2ipath.pl
sqm2itol.pl
sqm2keggplots.pl
sqm2pavian.pl
sqm_annot.pl
sqm_hmm_reads.pl
sqm_longreads.pl
sqm_mapper.pl
sqm_reads.pl
versionchange.pl
find_missing_markers.pl
remove_duplicate_markers.pl
anvi-filter-sqm.py
anvi-load-sqm.py
sqm2anvio.pl
configure_nodb.pl
configure_nodb_alt.pl
download_databases.pl
make_databases.pl
make_databases_alt.pl
test_install.pl
Module
You can load the modules by:
module load biocontainers
module load squeezemeta
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run squeezemeta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=squeezemeta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers squeezemeta
SqueezeMeta.pl -m coassembly -p Hadza -s test.samples -f raw
Squid
Introduction
SQUID is designed to detect both fusion-gene and non-fusion-gene transcriptomic structural variations from RNA-seq alignment.
Versions
1.5
Commands
squid
AnnotateSQUIDOutput.py
Module
You can load the modules by:
module load biocontainers
module load squid
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run squid on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=squid
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers squid
SRA-Toolkit
Introduction
SRA-Toolkit
is a collection of tools and libraries for using data in the INSDC Sequence Read Archives. Its detailed documentation can be found in https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc.
Versions
2.11.0-pl5262
Commands
abi-dump
align-cache
align-info
bam-load
cache-mgr
cg-load
fasterq-dump
fasterq-dump-orig
fastq-dump
fastq-dump-orig
illumina-dump
kar
kdbmeta
kget
latf-load
md5cp
prefetch
prefetch-orig
rcexplain
read-filter-redact
sam-dump
sam-dump-orig
sff-dump
sra-pileup
sra-pileup-orig
sra-sort
sra-sort-cg
sra-stat
srapath
srapath-orig
sratools
test-sra
vdb-config
vdb-copy
vdb-diff
vdb-dump
vdb-encrypt
vdb-lock
vdb-passwd
vdb-unlock
vdb-validate
Module
You can load the modules by:
module load biocontainers
module load sra-tools/2.11.0-pl5262
Configuring SRA-Toolkit
You can configure SRA-Toolkit with the command vdb-config
. For example, the command below sets the current working directory as the download location:
vdb-config --prefetch-to-cwd
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run SRA-Toolkit on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=SRA-Toolkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sra-tools/2.11.0-pl5262
vdb-config --prefetch-to-cwd # The data will be downloaded to the current working directory.
prefetch SRR11941281
fastq-dump --split-3 SRR11941281/SRR11941281.sra
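When several runs are needed, the prefetch and fastq-dump steps above can be wrapped in a loop. This is only a sketch: sra_path is a hypothetical helper that rebuilds the accession/accession.sra layout used in the example, and the accession list holds just the one run shown above.

```bash
#!/bin/bash
# sra_path (hypothetical helper): path of the .sra file that prefetch
# writes below the current directory, following the layout used above.
sra_path() {
    printf '%s/%s.sra\n' "$1" "$1"
}

accessions=(SRR11941281)   # add further run accessions here

# Download and convert each run only when the toolkit is on PATH.
if command -v prefetch >/dev/null 2>&1; then
    vdb-config --prefetch-to-cwd
    for acc in "${accessions[@]}"; do
        # Convert only if the download succeeded.
        prefetch "$acc" && fastq-dump --split-3 "$(sra_path "$acc")"
    done
fi
```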
Srst2
Introduction
Srst2 is designed to take Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes, etc.) and report the presence of STs and/or reference genes. For more information, please check: Docker hub: https://hub.docker.com/r/staphb/srst2 Home page: https://github.com/katholt/srst2
Versions
0.2.0
Commands
getmlst.py
srst2
slurm_srst2.py
Module
You can load the modules by:
module load biocontainers
module load srst2
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run srst2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=srst2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers srst2
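The job script above ends after the module lines. A hedged sketch of a gene-detection run that could follow is given below. The read files strainA_1.fastq.gz and strainA_2.fastq.gz and the database resistance.fasta are placeholder names, and sample_name is a hypothetical helper that assumes the _1.fastq.gz naming used for forward reads here.

```bash
#!/bin/bash
# sample_name (hypothetical helper): strip the directory and the
# _1.fastq.gz suffix from a forward-read file name.
sample_name() {
    local base
    base="$(basename "$1")"
    echo "${base%_1.fastq.gz}"
}

r1="strainA_1.fastq.gz"   # placeholder forward reads
r2="strainA_2.fastq.gz"   # placeholder reverse reads

# Screen the paired reads against a gene database (placeholder file name).
if command -v srst2 >/dev/null 2>&1; then
    srst2 --input_pe "$r1" "$r2" \
        --output "$(sample_name "$r1")_test" \
        --log --gene_db resistance.fasta
fi
```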
Stacks
Introduction
Stacks
is a software pipeline for building loci from RAD-seq.
Versions
2.60
Commands
clone_filter
count_fixed_catalog_snps.py
cstacks
denovo_map.pl
gstacks
integrate_alignments.py
kmer_filter
phasedstacks
populations
process_radtags
process_shortreads
ref_map.pl
sstacks
stacks-dist-extract
stacks-gdb
stacks-integrate-alignments
tsv2bam
ustacks
Module
You can load the modules by:
module load biocontainers
module load stacks
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Stacks on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=stacks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers stacks
denovo_map.pl -T 8 -M 4 -o ./stacks/ \
--samples ./samples --popmap ./popmaps/popmap
STAR
Introduction
STAR
: ultrafast universal RNA-seq aligner.
Detailed usage can be found here: https://github.com/alexdobin/STAR
Versions
2.7.10a
2.7.10b
2.7.9a
Commands
STAR
STARlong
Module
You can load the modules by:
module load biocontainers
module load star/2.7.10a
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run STAR on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=star
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers star/2.7.10a
STAR --runThreadN 24 --runMode genomeGenerate --genomeDir ref_genome --genomeFastaFiles ref_genome.fasta
STAR --runThreadN 24 --genomeDir ref_genome --readFilesIn seq_1.fastq seq_2.fastq --outSAMtype BAM SortedByCoordinate --outWigType wiggle read2
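The genomeGenerate step above only needs to run once per reference, so a resubmitted job can skip it. The sketch below assumes that a finished index directory contains non-empty Genome, SA and SAindex files. star_index_ready is a hypothetical helper, and the check is a heuristic rather than a full validation of the index.

```bash
#!/bin/bash
# star_index_ready (hypothetical helper): succeed when the directory
# already holds the main STAR index files.
star_index_ready() {
    local dir="$1"
    [ -s "$dir/Genome" ] && [ -s "$dir/SA" ] && [ -s "$dir/SAindex" ]
}

threads="${SLURM_NTASKS:-1}"   # follows #SBATCH -n; 1 outside Slurm

if command -v STAR >/dev/null 2>&1; then
    # Build the index only when it is not there yet.
    if ! star_index_ready ref_genome; then
        STAR --runThreadN "$threads" --runMode genomeGenerate \
            --genomeDir ref_genome --genomeFastaFiles ref_genome.fasta
    fi
    STAR --runThreadN "$threads" --genomeDir ref_genome \
        --readFilesIn seq_1.fastq seq_2.fastq \
        --outSAMtype BAM SortedByCoordinate --outWigType wiggle read2
fi
```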
Staramr
Introduction
staramr scans bacterial genome contigs against the ResFinder, PointFinder, and PlasmidFinder databases (used by the ResFinder webservice and other webservices offered by the Center for Genomic Epidemiology) and compiles a summary report of detected antimicrobial resistance genes.
Versions
0.7.1
Commands
staramr
Module
You can load the modules by:
module load biocontainers
module load staramr
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run staramr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=staramr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers staramr
staramr db info
staramr search \
--pointfinder-organism salmonella \
-o out *.fasta
STAR-Fusion
Introduction
STAR-Fusion
is a component of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT).
Versions
1.11b
Commands
STAR-Fusion
Module
You can load the modules by:
module load biocontainers
module load starfusion
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run STAR-Fusion on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=starfusion
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers starfusion
STAR-Fusion --CPU 24 --left_fq ../star/SRR12095148_1.fastq --right_fq ../star/SRR12095148_2.fastq \
--genome_lib_dir GRCh38_gencode_v33_CTAT_lib_Apr062020.plug-n-play/ctat_genome_lib_build_dir \
--FusionInspector validate \
--denovo_reconstruct \
--examine_coding_effect \
--output_dir STAR-Fusion-output
STREAM
Introduction
STREAM
(Single-cell Trajectories Reconstruction, Exploration And Mapping) is an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data.
Versions
1.0
Commands
python
python3
Module
You can load the modules by:
module load biocontainers
module load stream
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run STREAM on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=stream
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers stream
Stringdecomposer
Introduction
Stringdecomposer is a tool for decomposing centromeric assemblies and long reads into monomers.
Versions
1.1.2
Commands
stringdecomposer
Module
You can load the modules by:
module load biocontainers
module load stringdecomposer
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run stringdecomposer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=stringdecomposer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers stringdecomposer
StringTie
Introduction
StringTie
: efficient transcript assembly and quantitation of RNA-Seq data.
StringTie employs efficient algorithms for transcript structure recovery and abundance estimation from bulk RNA-Seq reads aligned to a reference genome. It takes as input spliced alignments in coordinate-sorted SAM/BAM/CRAM format and produces a GTF output that consists of assembled transcript structures and their estimated expression levels (FPKM/TPM and base coverage values).
Detailed usage can be found here: https://github.com/gpertea/stringtie
Versions
2.1.7
2.2.1
Commands
stringtie
Module
You can load the modules by:
module load biocontainers
module load stringtie
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run stringtie on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=stringtie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers stringtie
stringtie -o SRR11614710.gtf -G Homo_sapiens.GRCh38.105.gtf SRR11614710Aligned.sortedByCoord.out.bam
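The job above requests 24 cores, but the stringtie call does not set a thread count, so it would run with StringTie's default of one thread. A minimal sketch of passing the allocation size through the `-p` option follows. It assumes the job was submitted with `-n`, so that Slurm sets `SLURM_NTASKS`; the `run_or_print` helper is a hypothetical convenience that only prints the command when stringtie is not on the PATH.

```shell
# Place after "ml biocontainers stringtie" in the job script.
# SLURM_NTASKS is set by Slurm when tasks are requested with -n; fall back to 1 elsewhere.
THREADS="${SLURM_NTASKS:-1}"

# Hypothetical helper: run the command if its executable is on PATH, otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Same inputs as the example job, with -p set to the number of allocated cores.
cmd=(stringtie -p "$THREADS" -o SRR11614710.gtf -G Homo_sapiens.GRCh38.105.gtf SRR11614710Aligned.sortedByCoord.out.bam)
run_or_print "${cmd[@]}"
```

Deriving the thread count from the allocation keeps the `#SBATCH -n` line as the only place where the core count has to be changed.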
Strique
Introduction
STRique is a python package to analyze repeat expansion and methylation states of short tandem repeats (STR) in Oxford Nanopore Technology (ONT) long read sequencing data.
Versions
0.4.2
Commands
STRique.py
STRique_test.py
fast5Masker.py
Module
You can load the modules by:
module load biocontainers
module load strique
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run strique on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=strique
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers strique
STRique_test.py
STRique.py index data/ > data/reads.fofn
cat data/c9orf72.sam | STRique.py count ./data/reads.fofn ./models/r9_4_450bps.model ./configs/repeat_config.tsv --config ./configs/STRique.json
Structure
Introduction
Structure is a software package for using multi-locus genotype data to investigate population structure.
Versions
2.3.4
Commands
structure
Module
You can load the modules by:
module load biocontainers
module load structure
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run structure on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=structure
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers structure
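The example job stops after loading the module. As a sketch of what a run might look like, the lines below call structure once for each of several values of K. The file names (`mainparams`, `extraparams`, `genotypes.str`) and the K range are assumptions for illustration, and `run_or_print` is a hypothetical helper that only prints the command when structure is not on the PATH.

```shell
# Place after "ml biocontainers structure" in the job script.
# Hypothetical helper: run the command if its executable is on PATH, otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Hypothetical input files; mainparams must describe the data set (NUMINDS, NUMLOCI, ...).
MAINPARAMS=mainparams
EXTRAPARAMS=extraparams
INPUT=genotypes.str

# -K overrides MAXPOPS from mainparams; each K writes to its own output prefix.
for K in 1 2 3; do
    cmd=(structure -m "$MAINPARAMS" -e "$EXTRAPARAMS" -i "$INPUT" -K "$K" -o "structure_K${K}")
    run_or_print "${cmd[@]}"
done
```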
Subread
Introduction
Subread
carries out high-performance read alignment, quantification and mutation discovery. It is a general-purpose read aligner that can be used to map both genomic DNA-seq reads and RNA-seq reads. It uses a new mapping paradigm called seed-and-vote to achieve fast, accurate and scalable read mapping. Subread automatically determines whether a read should be globally or locally aligned, which makes it particularly powerful for mapping RNA-seq reads. It supports INDEL detection and can map reads with both fixed and variable lengths.
Versions
1.6.4
2.0.1
Commands
detectionCall
exactSNP
featureCounts
flattenGTF
genRandomReads
propmapped
qualityScores
removeDup
repair
subindel
subjunc
sublong
subread-align
subread-buildindex
subread-fullscan
txUnique
Module
You can load the modules by:
module load biocontainers
module load subread
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Subread on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=subread
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers subread
featureCounts -s 2 -p -Q 10 -T 4 -a genome.gtf -o featurecounts.txt mapped.bam
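The featureCounts call above expects an existing `mapped.bam`. A sketch of how that file might be produced with the aligner from the same module is shown below: build an index, then align paired-end reads. The input names (`genome.fa`, `reads_1.fq.gz`, `reads_2.fq.gz`) and the index name are assumptions, and `run_or_print` is a hypothetical helper that only prints the command when the tool is not on the PATH.

```shell
# Place after "ml biocontainers subread" in the job script.
# SLURM_NTASKS is set by Slurm when tasks are requested with -n; fall back to 1 elsewhere.
THREADS="${SLURM_NTASKS:-1}"

# Hypothetical helper: run the command if its executable is on PATH, otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Hypothetical reference and index names.
GENOME=genome.fa
INDEX=genome_index

# Step 1: build the Subread index from the reference FASTA.
build_cmd=(subread-buildindex -o "$INDEX" "$GENOME")

# Step 2: align paired-end reads.
# -t 0 selects RNA-seq mode (use -t 1 for genomic DNA-seq); -T sets the thread count.
align_cmd=(subread-align -t 0 -T "$THREADS" -i "$INDEX" -r reads_1.fq.gz -R reads_2.fq.gz -o mapped.bam)

run_or_print "${build_cmd[@]}"
run_or_print "${align_cmd[@]}"
```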
Survivor
Introduction
SURVIVOR is a tool set for simulating/evaluating SVs, merging and comparing SVs within and among samples, and includes various methods to reformat or summarize SVs.
Versions
1.0.7
Commands
SURVIVOR
Module
You can load the modules by:
module load biocontainers
module load survivor
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run survivor on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=survivor
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers survivor
SURVIVOR simSV parameter_file
SURVIVOR simSV ref.fa parameter_file 0.1 0 simulated
SURVIVOR eval caller.vcf simulated.bed 10 eval_res
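The introduction also mentions merging SVs across samples, which the job above does not show. A minimal sketch of a merge call follows. The VCF names are assumptions, the numeric arguments are the ones commonly shown in the SURVIVOR wiki, and `run_or_print` is a hypothetical helper that only prints the command when SURVIVOR is not on the PATH.

```shell
# Place after "ml biocontainers survivor" in the job script.
# Hypothetical helper: run the command if its executable is on PATH, otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# sample_files is a plain-text list with one VCF path per line (hypothetical file names).
printf '%s\n' caller1.vcf caller2.vcf > sample_files

# Positional arguments after the list file, in order:
#   1000  maximum distance between breakpoints to merge
#   2     minimum number of supporting callers
#   1     require agreement on SV type
#   1     require agreement on strand
#   0     do not estimate distance from SV size
#   30    minimum SV size to keep
cmd=(SURVIVOR merge sample_files 1000 2 1 1 0 30 sample_merged.vcf)
run_or_print "${cmd[@]}"
```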
Svaba
Introduction
SvABA is a method for detecting structural variants in sequencing data using genome-wide local assembly.
Versions
1.1.0
Commands
svaba
Module
You can load the modules by:
module load biocontainers
module load svaba
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run svaba on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=svaba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers svaba
DBSNP=dbsnp_indel.vcf
TUM_BAM=G15512.HCC1954.1.COST16011_region.bam
NORM_BAM=HCC1954.NORMAL.30x.compare.COST16011_region.bam
CORES=8 ## set any number of cores
REF=Homo_sapiens_assembly19.COST16011_region.fa
svaba run -t $TUM_BAM -n $NORM_BAM \
-p $CORES -D $DBSNP \
-a somatic_run -G $REF
Svtools
Introduction
Svtools is a suite of utilities designed to help bioinformaticians construct and explore cohort-level structural variation calls.
Versions
0.5.1
Commands
svtools
Module
You can load the modules by:
module load biocontainers
module load svtools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run svtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=svtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers svtools
Svtyper
Introduction
SVTyper performs breakpoint genotyping of structural variants (SVs) using whole genome sequencing data. svtyper is the original implementation of the genotyping algorithm, and works with multiple samples. svtyper-sso is an alternative implementation that is optimized for genotyping a single sample. It is parallelized, taking advantage of multiple CPU cores via the multiprocessing module, and can offer a 2x or greater speedup (depending on how many CPU cores are used) when genotyping a single sample. NOTE: svtyper-sso is not yet stable. There are minor logging differences between the two, and svtyper-sso may exit prematurely with an error when processing CRAM files.
Versions
0.7.1
Commands
svtyper
svtyper-sso
python
python2
Module
You can load the modules by:
module load biocontainers
module load svtyper
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run svtyper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=svtyper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers svtyper
svtyper \
-i data/example.vcf \
-B data/NA12878.target_loci.sorted.bam \
-l data/NA12878.bam.json \
> out.vcf
swat
Introduction
swat
is a program for searching one or more DNA or protein query sequences, or a query profile, against a sequence database, using an efficient implementation of the Smith-Waterman or Needleman-Wunsch algorithms with linear (affine) gap penalties.
Versions
1.090518
Commands
swat
Module
You can load the modules by:
module load biocontainers
module load swat
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run swat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=swat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers swat
Syri
Introduction
Syri compares alignments between two chromosome-level assemblies and identifies synteny and structural rearrangements.
Versions
1.6
Commands
syri
Module
You can load the modules by:
module load biocontainers
module load syri
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run syri on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=syri
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers syri
syri -c out.sam -r refgenome -q qrygenome -k -F S
Talon
Introduction
Talon
is a Python package for identifying and quantifying known and novel genes/isoforms in long-read transcriptome data sets.
Versions
5.0
Commands
talon
talon_abundance
talon_create_GTF
talon_fetch_reads
talon_filter_transcripts
talon_generate_report
talon_get_sjs
talon_initialize_database
talon_label_reads
talon_reformat_gtf
talon_summarize
Module
You can load the modules by:
module load biocontainers
module load talon
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Talon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=talon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers talon
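The example job stops after loading the module. A sketch of the first two steps of a typical run (initialize a database from an annotation, then annotate reads) might look like the lines below. The genome build, annotation name, and file names are assumptions, and `run_or_print` is a hypothetical helper that only prints the command when the tool is not on the PATH.

```shell
# Place after "ml biocontainers talon" in the job script.
# SLURM_NTASKS is set by Slurm when tasks are requested with -n; fall back to 1 elsewhere.
THREADS="${SLURM_NTASKS:-1}"

# Hypothetical helper: run the command if its executable is on PATH, otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Hypothetical names for the genome build, annotation, and database prefix.
BUILD=hg38
ANNOT=gencode_v29
DB_PREFIX=example_talon

# Step 1: create the TALON database from a GTF annotation (writes example_talon.db).
init_cmd=(talon_initialize_database --f annotation.gtf --g "$BUILD" --a "$ANNOT" --o "$DB_PREFIX")

# Step 2: annotate reads. config.csv lists one data set per line:
# dataset name, sample description, platform, path to the SAM file.
talon_cmd=(talon --f config.csv --db "${DB_PREFIX}.db" --build "$BUILD" --threads "$THREADS" --o example)

run_or_print "${init_cmd[@]}"
run_or_print "${talon_cmd[@]}"
```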
Targetp
Introduction
The TargetP-2.0 tool predicts the presence of N-terminal presequences: signal peptide (SP), mitochondrial transit peptide (mTP), chloroplast transit peptide (cTP) or thylakoid luminal transit peptide (luTP). For sequences predicted to contain an N-terminal presequence, a potential cleavage site is also predicted.
Versions
2.0
Commands
targetp
Module
You can load the modules by:
module load biocontainers
module load targetp
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run targetp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=targetp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers targetp
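The example job stops after loading the module. A minimal sketch of a prediction run on a protein FASTA file follows. The input name `proteins.fasta` and the output prefix are assumptions, and `run_or_print` is a hypothetical helper that only prints the command when targetp is not on the PATH.

```shell
# Place after "ml biocontainers targetp" in the job script.
# Hypothetical helper: run the command if its executable is on PATH, otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# -org non-pl is for non-plant sequences; use -org pl to also predict cTP and luTP.
# -format short writes one summary line per sequence.
cmd=(targetp -fasta proteins.fasta -org non-pl -format short -prefix targetp_out)
run_or_print "${cmd[@]}"
```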
Tassel
Introduction
TASSEL is a software package used to evaluate trait associations, evolutionary patterns, and linkage disequilibrium.
Versions
5.0
Commands
run_pipeline.pl
start_tassel.pl
Tassel5
Module
You can load the modules by:
module load biocontainers
module load tassel
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run tassel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tassel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tassel
Taxonkit
Introduction
Taxonkit
is a practical and efficient NCBI taxonomy toolkit.
Versions
0.9.0
Commands
taxonkit
Module
You can load the modules by:
module load biocontainers
module load taxonkit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Taxonkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=taxonkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers taxonkit
taxonkit list --show-rank --show-name --indent " " --ids 9605,239934
T-coffee
Introduction
T-coffee
is a multiple sequence alignment software using a progressive approach.
Versions
13.45.0.4846264
Commands
t_coffee
Module
You can load the modules by:
module load biocontainers
module load t-coffee
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run T-coffee on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=t-coffee
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers t-coffee
t_coffee OG0002077.fa -mode expresso
Tetranscripts
Introduction
Tetranscripts
is a package for including transposable elements in differential enrichment analysis of sequencing datasets.
Versions
2.2.1
Commands
TEtranscripts
TEcount
Module
You can load the modules by:
module load biocontainers
module load tetranscripts
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Tetranscripts on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tetranscripts
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tetranscripts
TEtranscripts --format BAM --mode multi \
-t treatment_sample1.bam treatment_sample2.bam treatment_sample3.bam \
-c control_sample1.bam control_sample2.bam control_sample3.bam \
--GTF genic-GTF-file \
--TE TE-GTF-file \
--project sample_nosort_test
Tiara
Introduction
Tiara
is a deep-learning-based approach, powered by PyTorch, for identifying eukaryotic sequences in metagenomic data.
Versions
1.0.2
Commands
tiara
Module
You can load the modules by:
module load biocontainers
module load tiara
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Tiara on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=tiara
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tiara
tiara -t 24 -i archaea_fr.fasta -o archaea_out.txt
tiara -t 24 -i bacteria_fr.fasta -o bacteria_out.txt
tiara -t 24 -i eukarya_fr.fasta -o eukarya_out.txt
tiara -t 24 -i mitochondria_fr.fasta -o mitochondria_out.txt
tiara -t 24 -i plast_fr.fasta -o plast_out.txt
tiara -t 24 -i total.fasta -o mix_out.txt --tf all -p 0.65 0.60 --probabilities
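The first five calls above differ only in the input name, so they can be written as a loop. The sketch below does that and takes the thread count from the allocation instead of hard-coding 24. It assumes the same `*_fr.fasta` files as the example job, and `run_or_print` is a hypothetical helper that only prints the command when tiara is not on the PATH.

```shell
# Place after "ml biocontainers tiara" in the job script.
# SLURM_NTASKS is set by Slurm when tasks are requested with -n; fall back to 1 elsewhere.
THREADS="${SLURM_NTASKS:-1}"

# Hypothetical helper: run the command if its executable is on PATH, otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# One tiara run per input set: <name>_fr.fasta in, <name>_out.txt out.
for name in archaea bacteria eukarya mitochondria plast; do
    cmd=(tiara -t "$THREADS" -i "${name}_fr.fasta" -o "${name}_out.txt")
    run_or_print "${cmd[@]}"
done
```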
Tigmint
Introduction
Tigmint identifies and corrects misassemblies using linked (e.g. MGI’s stLFR, 10x Genomics Chromium) or long (e.g. Oxford Nanopore Technologies long reads) DNA sequencing reads. The reads are first aligned to the assembly, and the extents of the large DNA molecules are inferred from the alignments of the reads. The physical coverage of the large molecules is more consistent and less prone to coverage dropouts than that of the short read sequencing data. The sequences are cut at positions that have insufficient spanning molecules. Tigmint outputs a BED file of these cut points, and a FASTA file of the cut sequences. For more information, see the home page: https://github.com/bcgsc/tigmint
Versions
1.2.6
Commands
tigmint
tigmint-arcs-tsv
tigmint-cut
tigmint-make
tigmint_estimate_dist.py
tigmint_molecule.py
tigmint_molecule_paf.py
Module
You can load the modules by:
module load biocontainers
module load tigmint
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run tigmint on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tigmint
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tigmint
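The example job stops after loading the module. A minimal sketch of a run through the `tigmint-make` wrapper follows. It assumes a draft assembly named `myassembly.fa` and reads named `myreads.fq.gz` in the working directory (the wrapper takes the names without extensions), and `run_or_print` is a hypothetical helper that only prints the command when tigmint-make is not on the PATH.

```shell
# Place after "ml biocontainers tigmint" in the job script.
# SLURM_NTASKS is set by Slurm when tasks are requested with -n; fall back to 1 elsewhere.
THREADS="${SLURM_NTASKS:-1}"

# Hypothetical helper: run the command if its executable is on PATH, otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# draft= and reads= are file names without their .fa / .fq.gz extensions; t= sets the thread count.
cmd=(tigmint-make tigmint draft=myassembly reads=myreads "t=${THREADS}")
run_or_print "${cmd[@]}"
```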
Tobias
Introduction
Tobias
is a collection of command-line bioinformatics tools for performing footprinting analysis on ATAC-seq data.
Versions
0.13.3
Commands
TOBIAS
Module
You can load the modules by:
module load biocontainers
module load tobias
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Tobias on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=tobias
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tobias
TOBIAS DownloadData --bucket data-tobias-2020
mv data-tobias-2020/ test_data/
TOBIAS PlotAggregate --TFBS test_data/BATF_all.bed \
--signals test_data/Bcell_corrected.bw test_data/Tcell_corrected.bw \
--output BATFJUN_footprint_comparison_all.pdf \
--share_y both --plot_boundaries --signal-on-x
TOBIAS BINDetect --motifs test_data/motifs.jaspar \
--signals test_data/Bcell_footprints.bw test_data/Tcell_footprints.bw \
--genome test_data/genome.fa.gz \
--peaks test_data/merged_peaks_annotated.bed \
--peak_header test_data/merged_peaks_annotated_header.txt \
--outdir BINDetect_output --cond_names Bcell Tcell --cores 8
TOBIAS ATACorrect --bam test_data/Bcell.bam \
--genome test_data/genome.fa.gz \
--peaks test_data/merged_peaks.bed \
--blacklist test_data/blacklist.bed \
--outdir ATACorrect_test --cores 8
TOBIAS FootprintScores --signal test_data/Bcell_corrected.bw \
--regions test_data/merged_peaks.bed \
--output Bcell_footprints.bw --cores 8
Tombo
Introduction
Tombo
is a suite of tools primarily for the identification of modified nucleotides from nanopore sequencing data. Tombo also provides tools for the analysis and visualization of raw nanopore signal.
Versions
1.5.1
Commands
tombo
Module
You can load the modules by:
module load biocontainers
module load tombo
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Tombo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=tombo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tombo
tombo resquiggle path/to/fast5s/ genome.fasta --processes 4 --num-most-common-errors 5
tombo detect_modifications alternative_model --fast5-basedirs path/to/fast5s/ \
--statistics-file-basename native.e_coli_sample \
--alternate-bases dam dcm --processes 4
# plot raw signal at most significant dcm locations
tombo plot most_significant --fast5-basedirs path/to/fast5s/ \
--statistics-filename native.e_coli_sample.dcm.tombo.stats \
--plot-standard-model --plot-alternate-model dcm \
--pdf-filename sample.most_significant_dcm_sites.pdf
# produces wig file with estimated fraction of modified reads at each valid reference site
tombo text_output browser_files --statistics-filename native.e_coli_sample.dam.tombo.stats \
--file-types dampened_fraction --browser-file-basename native.e_coli_sample.dam
# also produce successfully processed reads coverage file for reference
tombo text_output browser_files --fast5-basedirs path/to/fast5s/ \
--file-types coverage --browser-file-basename native.e_coli_sample
TopHat
Introduction
TopHat
is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.
Versions
2.1.1-py27
Commands
tophat
tophat2
Module
You can load the modules by:
module load biocontainers
module load tophat
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run TopHat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tophat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tophat
tophat -r 20 test_ref reads_1.fq reads_2.fq
TPMCalculator
Introduction
TPMCalculator
quantifies mRNA abundance directly from the alignments by parsing BAM files.
Detailed usage can be found here: https://github.com/ncbi/TPMCalculator
Versions
0.0.3
0.0.4
Commands
TPMCalculator
Module
You can load the modules by:
module load biocontainers
module load tpmcalculator
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run tpmcalculator on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=tpmcalculator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tpmcalculator
TPMCalculator -g Homo_sapiens.GRCh38.105.chr.gtf -b SRR12095148Aligned.sortedByCoord.out.bam
Transabyss
Introduction
Transabyss
is a tool for de novo assembly of RNA-Seq data using ABySS.
Versions
2.0.1
Commands
transabyss
transabyss-merge
Module
You can load the modules by:
module load biocontainers
module load transabyss
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Transabyss on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=transabyss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers transabyss
transabyss --name SRR12095148 \
--pe SRR12095148_1.fastq SRR12095148_2.fastq \
--outdir SRR12095148_assembly --threads 12
TransDecoder
Introduction
TransDecoder
identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.
TransDecoder identifies likely coding sequences based on the following criteria:
a minimum length open reading frame (ORF) is found in a transcript sequence
a log-likelihood score similar to what is computed by the GeneID software is > 0.
the above coding score is greatest when the ORF is scored in the 1st reading frame as compared to scores in the other 2 forward reading frames.
if a candidate ORF is found fully encapsulated by the coordinates of another candidate ORF, the longer one is reported. However, a single transcript can report multiple ORFs (allowing for operons, chimeras, etc).
a PSSM is built/trained/used to refine the start codon prediction.
optionally, the putative peptide has a match to a Pfam domain above the noise cutoff score.
Detailed usage can be found here: https://github.com/TransDecoder/TransDecoder/wiki#running-transdecoder
Versions
5.5.0
Commands
TransDecoder.LongOrfs
TransDecoder.Predict
cdna_alignment_orf_to_genome_orf.pl
compute_base_probs.pl
exclude_similar_proteins.pl
fasta_prot_checker.pl
ffindex_resume.pl
gene_list_to_gff.pl
get_FL_accs.pl
get_longest_ORF_per_transcript.pl
get_top_longest_fasta_entries.pl
gff3_file_to_bed.pl
gff3_file_to_proteins.pl
gff3_gene_to_gtf_format.pl
gtf_genome_to_cdna_fasta.pl
gtf_to_alignment_gff3.pl
gtf_to_bed.pl
nr_ORFs_gff3.pl
pfam_runner.pl
refine_gff3_group_iso_strip_utrs.pl
refine_hexamer_scores.pl
remove_eclipsed_ORFs.pl
score_CDS_likelihood_all_6_frames.pl
select_best_ORFs_per_transcript.pl
seq_n_baseprobs_to_loglikelihood_vals.pl
start_codon_refinement.pl
train_start_PWM.pl
uri_unescape.pl
Module
You can load the modules by:
module load biocontainers
module load transdecoder
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run transdecoder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=transdecoder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers transdecoder
gtf_genome_to_cdna_fasta.pl transcripts.gtf test.genome.fasta > transcripts.fasta
gtf_to_alignment_gff3.pl transcripts.gtf > transcripts.gff3
TransDecoder.LongOrfs -t transcripts.fasta
TransDecoder.Predict -t transcripts.fasta
Transrate
Introduction
Transrate is software for de-novo transcriptome assembly quality analysis.
Versions
1.0.3
Commands
transrate
Module
You can load the modules by:
module load biocontainers
module load transrate
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run transrate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=transrate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers transrate
transrate --assembly mm10/Mus_musculus.GRCm38.cds.all.fa \
--left seq_1.fq.gz \
--right seq_2.fq.gz \
--threads 12
Transvar
Introduction
Transvar
is a multi-way annotator for genetic elements and genetic variations.
Versions
2.5.9
Commands
transvar
Module
You can load the modules by:
module load biocontainers
module load transvar
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Transvar on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=transvar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers transvar
# set up databases
transvar config --download_anno --refversion hg19
# in case you don't have a reference
transvar config --download_ref --refversion hg19
transvar panno -i 'PIK3CA:p.E545K' --ucsc --ccds
tRAX
Introduction
tRAX
(tRNA Analysis of eXpression) is a software package built for in-depth analyses of tRNA-derived small RNAs (tDRs), mature tRNAs, and inference of RNA modifications from high-throughput small RNA sequencing data.
Versions
1.0.0
Commands
TestRun.bash
quickdb.bash
maketrnadb.py
trimadapters.py
processamples.py
Module
You can load the modules by:
module load biocontainers
module load trax
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run tRAX on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trax
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trax
Treetime
Introduction
Treetime
is a tool for maximum likelihood dating and ancestral sequence inference.
Versions
0.8.6
0.9.4
Commands
treetime
Module
You can load the modules by:
module load biocontainers
module load treetime
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Treetime on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=treetime
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers treetime
treetime ancestral --aln input.fasta --tree input.nwk
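The job above covers ancestral sequence inference only. A sketch of the dating analysis mentioned in the introduction follows. It assumes a hypothetical `dates.csv` that maps each tip name to a sampling date, alongside the same alignment and tree, and `run_or_print` is a hypothetical helper that only prints the command when treetime is not on the PATH.

```shell
# Place after "ml biocontainers treetime" in the job script.
# Hypothetical helper: run the command if its executable is on PATH, otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# dates.csv (hypothetical) has a header line and one "name,date" row per tip in the tree.
cmd=(treetime --aln input.fasta --tree input.nwk --dates dates.csv --outdir treetime_dated)
run_or_print "${cmd[@]}"
```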
Trimal
Introduction
Trimal
is a tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment.
Versions
1.4.1
Commands
trimal
readal
statal
Module
You can load the modules by:
module load biocontainers
module load trimal
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trimal on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trimal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trimal
trimal -in input.fasta -out output1 -htmlout output1.html -gt 1
Trim-galore
Introduction
Trim-galore
is a wrapper tool that automates quality and adapter trimming of FastQ files.
Versions
0.6.7
Commands
trim_galore
Module
You can load the modules by:
module load biocontainers
module load trim-galore
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trim-galore on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=trim-galore
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trim-galore
trim_galore --paired --fastqc --length 20 -o sample1_trimmed Sample1_1.fq Sample1_2.fq
Trimmomatic
Introduction
Trimmomatic
is a flexible read trimming tool for Illumina NGS data.
Versions
0.39
Commands
trimmomatic
Module
You can load the modules by:
module load biocontainers
module load trimmomatic
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trimmomatic on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=trimmomatic
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trimmomatic
trimmomatic PE -threads 8 \
input_forward.fq.gz input_reverse.fq.gz \
output_forward_paired.fq.gz output_forward_unpaired.fq.gz \
output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz \
ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:True LEADING:3 TRAILING:3 MINLEN:36
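The job above requests 8 tasks with `-n 8` and repeats the same number in `-threads 8`. One way to keep the two in sync is to read the task count from the environment. This is a sketch under the assumption that Slurm exports `SLURM_NTASKS` inside the job, which it does when `-n` is given. `job_threads` is a hypothetical helper name.

```shell
#!/bin/bash
# job_threads is a hypothetical helper (an assumption for illustration).
# It prints the number of tasks Slurm granted to the job, falling back
# to 1 when SLURM_NTASKS is unset, empty, or not a plain integer.
job_threads() {
    local n="${SLURM_NTASKS:-1}"
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

# Usage inside a job script (kept as a comment so this sketch does not
# call trimmomatic itself):
#   trimmomatic PE -threads "$(job_threads)" ...
```

The same value could be passed to the thread options of other tools in this guide, for example `--CPU` for Trinity or `-t` for Unicycler.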
Trinity
Introduction
Trinity
assembles transcript sequences from Illumina RNA-Seq data.
Versions
2.12.0
2.13.2
2.14.0
2.15.0
Commands
Trinity
TrinityStats.pl
Trinity_gene_splice_modeler.py
ace2sam
align_and_estimate_abundance.pl
analyze_blastPlus_topHit_coverage.pl
analyze_diff_expr.pl
blast2sam.pl
bowtie
bowtie2
bowtie2-build
bowtie2-inspect
bowtie2sam.pl
contig_ExN50_statistic.pl
define_clusters_by_cutting_tree.pl
export2sam.pl
extract_supertranscript_from_reference.py
filter_low_expr_transcripts.pl
get_Trinity_gene_to_trans_map.pl
insilico_read_normalization.pl
interpolate_sam.pl
jellyfish
novo2sam.pl
retrieve_sequences_from_fasta.pl
run_DE_analysis.pl
sam2vcf.pl
samtools
samtools.pl
seq_cache_populate.pl
seqtk-trinity
sift_bam_max_cov.pl
soap2sam.pl
tabix
trimmomatic
wgsim
wgsim_eval.pl
zoom2sam.pl
Module
You can load the modules by:
module load biocontainers
module load trinity
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trinity on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 6
#SBATCH --job-name=trinity
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trinity
Trinity --seqType fq --left reads_1.fq --right reads_2.fq \
--CPU 6 --max_memory 20G
Trinotate
Introduction
Trinotate
is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms.
Versions
3.2.2
Commands
Trinotate
Build_Trinotate_Boilerplate_SQLite_db.pl
EMBL_dat_to_Trinotate_sqlite_resourceDB.pl
EMBL_swissprot_parser.pl
PFAM_dat_parser.pl
PFAMtoGoParser.pl
RnammerTranscriptome.pl
TrinotateSeqLoader.pl
Trinotate_BLAST_loader.pl
Trinotate_GO_to_SLIM.pl
Trinotate_GTF_loader.pl
Trinotate_GTF_or_GFF3_annot_prep.pl
Trinotate_PFAM_loader.pl
Trinotate_RNAMMER_loader.pl
Trinotate_SIGNALP_loader.pl
Trinotate_TMHMM_loader.pl
Trinotate_get_feature_name_encoding_attributes.pl
Trinotate_report_writer.pl
assign_eggnog_funccats.pl
autoTrinotate.pl
build_DE_cache_tables.pl
cleanMe.pl
cleanme.pl
count_table_fields.pl
create_clusters_tables.pl
extract_GO_assignments_from_Trinotate_xls.pl
extract_GO_for_BiNGO.pl
extract_specific_genes_from_all_matrices.pl
import_DE_results.pl
import_Trinotate_xls_as_annot.pl
import_expression_and_DE_results.pl
import_expression_matrix.pl
import_samples_n_expression_matrix.pl
import_samples_only.pl
import_transcript_annotations.pl
import_transcript_clusters.pl
import_transcript_names.pl
init_Trinotate_sqlite_db.pl
legacy_blast.pl
make_cXp_html.pl
obo_tab_to_sqlite_db.pl
obo_to_tab.pl
prep_nuc_prot_set_for_trinotate_loading.pl
print.pl
rnammer_supperscaffold_gff_to_indiv_transcripts.pl
runMe.pl
run_TrinotateWebserver.pl
run_cluster_functional_enrichment_analysis.pl
shrink_db.pl
sqlite.pl
superScaffoldGenerator.pl
test_Barplot.pl
test_GO_DAG.pl
test_GenomeBrowser.pl
test_Heatmap.pl
test_Lineplot.pl
test_Piechart.pl
test_Scatter2D.pl
test_Sunburst.pl
trinotate_report_summary.pl
update_blastdb.pl
update_seq_n_annotation_fields.pl
Module
You can load the modules by:
module load biocontainers
module load trinotate
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trinotate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trinotate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trinotate
sqlite_db="myTrinotate.sqlite"
Trinotate ${sqlite_db} init \
--gene_trans_map data/Trinity.fasta.gene_to_trans_map \
--transcript_fasta data/Trinity.fasta \
--transdecoder_pep \
data/Trinity.fasta.transdecoder.pep
Trinotate ${sqlite_db} LOAD_swissprot_blastp data/swissprot.blastp.outfmt6
Trinotate ${sqlite_db} LOAD_pfam data/TrinotatePFAM.out
Trnascan-se
Introduction
Trnascan-se
is a convenient, ready-for-use means to identify tRNA genes in one or more query sequences.
Versions
2.0.9
Commands
tRNAscan-SE
Module
You can load the modules by:
module load biocontainers
module load trnascan-se
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trnascan-se on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=trnascan-se
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trnascan-se
tRNAscan-SE --thread 12 -o tRNA.out \
-f tRNA.ss -m tRNA.stats genome.fasta
Trtools
Introduction
TRTools includes a variety of utilities for filtering, quality control and analysis of tandem repeats downstream of genotyping them from next-generation sequencing.
Versions
5.0.1
Commands
associaTR
compareSTR
dumpSTR
mergeSTR
qcSTR
statSTR
Module
You can load the modules by:
module load biocontainers
module load trtools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Warning
We noticed that xalt
module can cause the failure of certain commands including statSTR
. Please unload all loaded modules by module --force purge
before loading required modules.
To run trtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trtools htslib bcftools
mergeSTR --vcfs ceu_ex.vcf.gz,yri_ex.vcf.gz --out merged
bgzip merged.vcf
tabix -p vcf merged.vcf.gz
# Get the CEU and YRI sample lists
bcftools query -l yri_ex.vcf.gz > yri_samples.txt
bcftools query -l ceu_ex.vcf.gz > ceu_samples.txt
# Run statSTR on the region given by --region below
statSTR \
--vcf merged.vcf.gz \
--samples yri_samples.txt,ceu_samples.txt \
--sample-prefixes YRI,CEU \
--out stdout \
--mean --het --acount \
--use-length \
--region chr21:34351482-34363028
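The xalt warning above suggests confirming that no unwanted module is still loaded before running commands such as statSTR. A minimal sketch follows, assuming the module system exports a colon-separated `LOADEDMODULES` variable (Lmod and Environment Modules normally do, but this should be verified on your cluster). `module_is_loaded` is a hypothetical helper name.

```shell
#!/bin/bash
# module_is_loaded is a hypothetical helper (an assumption for illustration).
# It succeeds when the given module name appears in LOADEDMODULES, either
# as a bare "name" or as "name/version".
module_is_loaded() {
    local name="$1" entry
    local IFS=':'
    for entry in ${LOADEDMODULES:-}; do
        if [ "$entry" = "$name" ] || [ "${entry%%/*}" = "$name" ]; then
            return 0
        fi
    done
    return 1
}
```

In a job script, a check such as `if module_is_loaded xalt; then module --force purge; fi` could precede the `ml biocontainers trtools` line, although running `module --force purge` unconditionally, as the examples do, is simpler.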
Trust4
Introduction
Tcr Receptor Utilities for Solid Tissue (TRUST) is a computational tool to analyze TCR and BCR sequences using unselected RNA sequencing data, profiled from solid tissues, including tumors.
Versions
1.0.7
Commands
run-trust4
BuildDatabaseFa.pl
BuildImgtAnnot.pl
trust-airr.pl
trust-barcoderep.pl
trust-simplerep.pl
trust-smartseq.pl
Module
You can load the modules by:
module load biocontainers
module load trust4
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run trust4 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trust4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trust4
run-trust4 -b mapped.bam -f hg38_bcrtcr.fa --ref human_IMGT+C.fa
Trycycler
Introduction
Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes. That is, if you have multiple long-read assemblies for the same isolate, Trycycler can combine them into a single assembly that is better than any of your inputs.
Versions
0.5.0
0.5.3
0.5.4
Commands
trycycler
Module
You can load the modules by:
module load biocontainers
module load trycycler
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run trycycler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trycycler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trycycler
trycycler cluster --assemblies \
test/test_cluster/assembly_*.fasta \
--reads test/test_cluster/reads.fastq \
--out_dir trycycler_out
UCSC Executables
Introduction
UCSC Executables
is a collection of executables that perform functions ranging from sequence analysis and format conversion, to basic number crunching and statistics, to complex database generation and manipulation.
These executables have been downloaded from http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64.v369/ and made available on RCAC clusters.
Versions
369
Commands
addCols
ameme
autoDtd
autoSql
autoXml
ave
aveCols
axtChain
axtSort
axtSwap
axtToMaf
axtToPsl
bamToPsl
barChartMaxLimit
bedClip
bedCommonRegions
bedCoverage
bedExtendRanges
bedGeneParts
bedGraphPack
bedGraphToBigWig
bedIntersect
bedItemOverlapCount
bedJoinTabOffset
bedJoinTabOffset.py
bedMergeAdjacent
bedPartition
bedPileUps
bedRemoveOverlap
bedRestrictToPositions
bedSingleCover.pl
bedSort
bedToBigBed
bedToExons
bedToGenePred
bedToPsl
bedWeedOverlapping
bigBedInfo
bigBedNamedItems
bigBedSummary
bigBedToBed
bigGenePredToGenePred
bigHeat
bigMafToMaf
bigPslToPsl
bigWigAverageOverBed
bigWigCat
bigWigCluster
bigWigCorrelate
bigWigInfo
bigWigMerge
bigWigSummary
bigWigToBedGraph
bigWigToWig
binFromRange
blastToPsl
blastXmlToPsl
blat
calc
catDir
catUncomment
chainAntiRepeat
chainBridge
chainCleaner
chainFilter
chainMergeSort
chainNet
chainPreNet
chainScore
chainSort
chainSplit
chainStitchId
chainSwap
chainToAxt
chainToPsl
chainToPslBasic
checkAgpAndFa
checkCoverageGaps
checkHgFindSpec
checkTableCoords
chopFaLines
chromGraphFromBin
chromGraphToBin
chromToUcsc
clusterGenes
clusterMatrixToBarChartBed
colTransform
countChars
cpg_lh
crTreeIndexBed
crTreeSearchBed
dbSnoop
dbTrash
endsInLf
estOrient
expMatrixToBarchartBed
faAlign
faCmp
faCount
faFilter
faFilterN
faFrag
faNoise
faOneRecord
faPolyASizes
faRandomize
faRc
faSize
faSomeRecords
faSplit
faToFastq
faToTab
faToTwoBit
faToVcf
faTrans
fastqStatsAndSubsample
fastqToFa
featureBits
fetchChromSizes
findMotif
fixStepToBedGraph.pl
gapToLift
genePredCheck
genePredFilter
genePredHisto
genePredSingleCover
genePredToBed
genePredToBigGenePred
genePredToFakePsl
genePredToGtf
genePredToMafFrames
genePredToProt
gensub2
getRna
getRnaPred
gff3ToGenePred
gff3ToPsl
gmtime
gtfToGenePred
headRest
hgBbiDbLink
hgFakeAgp
hgFindSpec
hgGcPercent
hgGoldGapGl
hgLoadBed
hgLoadChain
hgLoadGap
hgLoadMaf
hgLoadMafSummary
hgLoadNet
hgLoadOut
hgLoadOutJoined
hgLoadSqlTab
hgLoadWiggle
hgSpeciesRna
hgTrackDb
hgWiggle
hgsql
hgsqldump
hgvsToVcf
hicInfo
htmlCheck
hubCheck
hubClone
hubPublicCheck
ixIxx
lastz-1.04.00
lastz_D-1.04.00
lavToAxt
lavToPsl
ldHgGene
liftOver
liftOverMerge
liftUp
linesToRa
localtime
mafAddIRows
mafAddQRows
mafCoverage
mafFetch
mafFilter
mafFrag
mafFrags
mafGene
mafMeFirst
mafNoAlign
mafOrder
mafRanges
mafSpeciesList
mafSpeciesSubset
mafSplit
mafSplitPos
mafToAxt
mafToBigMaf
mafToPsl
mafToSnpBed
mafsInRegion
makeTableList
maskOutFa
matrixClusterColumns
matrixMarketToTsv
matrixNormalize
mktime
mrnaToGene
netChainSubset
netClass
netFilter
netSplit
netSyntenic
netToAxt
netToBed
newProg
newPythonProg
nibFrag
nibSize
oligoMatch
overlapSelect
para
paraFetch
paraHub
paraHubStop
paraNode
paraNodeStart
paraNodeStatus
paraNodeStop
paraSync
paraTestJob
parasol
positionalTblCheck
pslCDnaFilter
pslCat
pslCheck
pslDropOverlap
pslFilter
pslHisto
pslLiftSubrangeBlat
pslMap
pslMapPostChain
pslMrnaCover
pslPairs
pslPartition
pslPosTarget
pslPretty
pslRc
pslRecalcMatch
pslRemoveFrameShifts
pslReps
pslScore
pslSelect
pslSomeRecords
pslSort
pslSortAcc
pslStats
pslSwap
pslToBed
pslToBigPsl
pslToChain
pslToPslx
pslxToFa
qaToQac
qacAgpLift
qacToQa
qacToWig
raSqlQuery
raToLines
raToTab
randomLines
rmFaDups
rowsToCols
sizeof
spacedToTab
splitFile
splitFileByColumn
sqlToXml
strexCalc
stringify
subChar
subColumn
tabQuery
tailLines
tdbQuery
tdbRename
tdbSort
textHistogram
tickToDate
toLower
toUpper
trackDbIndexBb
transMapPslToGenePred
trfBig
twoBitDup
twoBitInfo
twoBitMask
twoBitToFa
ucscApiClient
udr
vai.pl
validateFiles
validateManifest
varStepToBedGraph.pl
webSync
wigCorrelate
wigEncode
wigToBigWig
wordLine
xmlCat
xmlToSql
Module
You can load the modules by:
module load biocontainers
module load ucsc_genome_toolkit/369
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run UCSC executables on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=UCSC
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ucsc_genome_toolkit/369
blat genome.fasta input.fasta blat.out
fastqToFa input.fastq output.fasta
Umi_tools
Introduction
Umi_tools is a collection of tools for handling Unique Molecular Identifiers in NGS data sets.
Versions
1.1.4
Commands
umi_tools
Module
You can load the modules by:
module load biocontainers
module load umi_tools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run umi_tools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=umi_tools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers umi_tools
Unicycler
Introduction
Unicycler
is an assembly pipeline for bacterial genomes.
Versions
0.5.0
Commands
unicycler
Module
You can load the modules by:
module load biocontainers
module load unicycler
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Unicycler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=unicycler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers unicycler
unicycler -t 12 -1 SRR11234553_1.fastq -2 SRR11234553_2.fastq -o shortout
unicycler -t 12 -l SRR3982487.fastq -o longout
Usefulaf
Introduction
Usefulaf is an all-in-one Docker/Singularity image for single-cell processing with Alevin-fry. It includes all the tools you need to turn your FASTQ files into a count matrix and then load it into your favorite analysis environment.
Versions
0.9.2
Commands
simpleaf
R
Rscript
python
python3
Module
You can load the modules by:
module load biocontainers
module load usefulaf
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run usefulaf on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=usefulaf
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers usefulaf
Vadr
Introduction
VADR is a suite of tools for classifying and analyzing sequences homologous to a set of reference models of viral genomes or gene families. It has been mainly tested for analysis of Norovirus, Dengue, and SARS-CoV-2 virus sequences in preparation for submission to the GenBank database.
Versions
1.4.1
1.4.2
1.5
Commands
parse_blast.pl
v-annotate.pl
v-build.pl
v-test.pl
Module
You can load the modules by:
module load biocontainers
module load vadr
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vadr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vadr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vadr
v-annotate.pl noro.9.fa va-noro.9
Vardict-java
Introduction
VarDictJava is a variant discovery program written in Java and Perl. It is a Java port of the VarDict variant caller.
Versions
1.8.3
Commands
vardict-java
var2vcf_paired.pl
var2vcf_valid.pl
testsomatic.R
teststrandbias.R
Module
You can load the modules by:
module load biocontainers
module load vardict-java
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vardict-java on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vardict-java
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vardict-java
AF_THR="0.01" # minimum allele frequency
vardict-java -G genome.fasta \
-f $AF_THR -N genome \
-b input.bam \
-c 1 -S 2 -E 3 -g 4 output.bed \
| teststrandbias.R \
| var2vcf_valid.pl \
-N genome -E -f $AF_THR \
> vars.vcf
Varlociraptor
Introduction
Varlociraptor
implements a novel, unified fully uncertainty-aware approach to genomic variant calling in arbitrary scenarios.
Versions
4.11.4
Commands
varlociraptor
Module
You can load the modules by:
module load biocontainers
module load varlociraptor
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Varlociraptor on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=varlociraptor
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers varlociraptor
varlociraptor call variants tumor-normal --purity 0.75 --tumor
Varscan
Introduction
Varscan
is a tool used for variant detection in massively parallel sequencing data.
Versions
2.4.2
2.4.4
Commands
VarScan.v2.4.4.jar
Module
You can load the modules by:
module load biocontainers
module load varscan
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Varscan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=varscan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers varscan
Vartrix
Introduction
Vartrix
is a software tool for extracting single cell variant information from 10x Genomics single cell data.
Versions
1.1.22
Commands
vartrix
Module
You can load the modules by:
module load biocontainers
module load vartrix
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Vartrix on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vartrix
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vartrix
vartrix -v test/test.vcf -b test/test.bam \
-f test/test.fa -c test/barcodes.tsv \
-o output.matrix
Vatools
Introduction
VAtools is a Python package that includes several tools to annotate VCF files with data from other tools.
Versions
5.0.1
Commands
ref-transcript-mismatch-reporter
transform-split-values
vcf-expression-annotator
vcf-genotype-annotator
vcf-info-annotator
vcf-readcount-annotator
vep-annotation-reporter
Module
You can load the modules by:
module load biocontainers
module load vatools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vatools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vatools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vatools
vcf-readcount-annotator <input_vcf> <snv_bam_readcount_file> <DNA|RNA> \
-s <sample_name> -t snv -o <snv_annotated_vcf>
Vcf2maf
Introduction
Vcf2maf converts a VCF into a MAF. To do so, each variant must be mapped to only one of all possible gene transcripts/isoforms that it might affect. This selection of a single effect per variant is often subjective, so this project is an attempt to make the selection criteria smarter, reproducible, and more configurable, with default criteria that lean towards best practices.
Versions
1.6.21
Commands
maf2maf.pl
maf2vcf.pl
vcf2maf.pl
vcf2vcf.pl
Module
You can load the modules by:
module load biocontainers
module load vcf2maf
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Note
If users need to use vep
, please add --vep-path /opt/conda/bin
.
To run vcf2maf on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf2maf
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcf2maf
vcf2maf.pl --vep-path /opt/conda/bin \
--ref-fasta Homo_sapiens.GRCh37.dna.toplevel.fa.gz \
--input-vcf tests/test.vcf --output-maf test.vep.maf
Vcf2phylip
Introduction
vcf2phylip is a tool to convert SNPs in VCF format to PHYLIP, NEXUS, binary NEXUS, or FASTA alignments for phylogenetic analysis.
Versions
2.8
Commands
vcf2phylip.py
Module
You can load the modules by:
module load biocontainers
module load vcf2phylip
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vcf2phylip on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf2phylip
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcf2phylip
vcf2phylip.py --input myfile.vcf
Vcf2tsvpy
Introduction
Vcf2tsvpy is a small Python program that converts genomic variant data encoded in VCF format into a tab-separated values (TSV) file.
Versions
0.6.0
Commands
vcf2tsvpy
Module
You can load the modules by:
module load biocontainers
module load vcf2tsvpy
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vcf2tsvpy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf2tsvpy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcf2tsvpy
Vcf-kit
Introduction
VCF-kit is a command-line based collection of utilities for performing analysis on Variant Call Format (VCF) files.
Versions
0.2.6
0.2.9
Commands
vk
Module
You can load the modules by:
module load biocontainers
module load vcf-kit
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vcf-kit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf-kit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcf-kit
VCFtools
Introduction
VCFtools
is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.
Versions
0.1.16
Commands
vcftools
Module
You can load the modules by:
module load biocontainers
module load vcftools
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run VCFtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcftools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcftools
vcftools --vcf input_data.vcf --chr 1 \
--from-bp 1000000 --to-bp 2000000
Velocyto.py
Introduction
Velocyto.py
is a library for the analysis of RNA velocity.
Detailed information about velocyto.py can be found here: https://github.com/velocyto-team/velocyto.py.
Versions
0.17.17
Commands
python
python3
velocyto
Module
You can load the modules by:
module load biocontainers
module load velocyto.py/0.17.17-py39
Interactive job
To run Velocyto.py
interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers velocyto.py/0.17.17-py39
(base) UserID@bell-a008:~ $ python
Python 3.9.10 | packaged by conda-forge | (main, Feb 1 2022, 21:24:11)
[GCC 9.4.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import velocyto as vcy
>>> vlm = vcy.VelocytoLoom("YourData.loom")
>>> vlm.normalize("S", size=True, log=True)
>>> vlm.S_norm # contains the log-normalized values
Batch job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit an sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=Velocyto
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers velocyto.py/0.17.17-py39
velocyto run10x cellranger_count_1kpbmcs_out refdata-gex-GRCh38-2020-A/genes/genes.gtf
Velvet
Introduction
Velvet
is a sequence assembler for very short reads.
Versions
1.2.10
Commands
velveth
velvetg
Module
You can load the modules by:
module load biocontainers
module load velvet
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Velvet on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=velvet
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers velvet
velveth output_directory 21 -fasta -short solexa1.fa solexa2.fa solexa3.fa -long capillary.fa
velvetg output_directory -cov_cutoff 4
Veryfasttree
Introduction
VeryFastTree is a highly-tuned implementation of the FastTree-2 tool that takes advantage of parallelization and vectorization strategies to speed up the inference of phylogenies for huge alignments. VeryFastTree keeps the phases, methods and heuristics used by FastTree-2 to estimate the phylogenetic tree unchanged, so it produces trees with the same topological accuracy as FastTree-2. In addition, unlike the parallel version of FastTree-2, VeryFastTree is deterministic.
Versions
3.2.1
Commands
VeryFastTree
Module
You can load the modules by:
module load biocontainers
module load veryfasttree
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run veryfasttree on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=veryfasttree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers veryfasttree
Vg
Introduction
Variation graphs (vg) provides tools for working with genome variation graphs.
Quay.io: https://quay.io/repository/vgteam/vg?tab=info | Home page: https://github.com/vgteam/vg
Versions
1.40.0
Commands
vg
Module
You can load the modules by:
module load biocontainers
module load vg
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vg on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vg
vg construct -r test/small/x.fa -v test/small/x.vcf.gz >x.vg
# GFA output
vg view x.vg >x.gfa
# dot output suitable for graphviz
vg view -d x.vg >x.dot
# And if you have a GAM file
cp test/small/x-s1337-n1.gam x.gam
# json version of binary alignments
vg view -a x.gam >x.json
vg align -s CTACTGACAGCAGAAGTTTGCTGTGAAGATTAAATTAGGTGATGCTTG x.vg
Viennarna
Introduction
Viennarna
is a set of standalone programs and libraries used for prediction and analysis of RNA secondary structures.
Versions
2.5.0
Commands
RNA2Dfold
RNALalifold
RNALfold
RNAPKplex
RNAaliduplex
RNAalifold
RNAcofold
RNAdistance
RNAdos
RNAduplex
RNAeval
RNAfold
RNAforester
RNAheat
RNAinverse
RNAlocmin
RNAmultifold
RNApaln
RNAparconv
RNApdist
RNAplex
RNAplfold
RNAplot
RNApvmin
RNAsnoop
RNAsubopt
RNAup
Kinfold
b2ct
popt
Module
You can load the modules by:
module load biocontainers
module load viennarna
Example job
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Viennarna on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=viennarna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers viennarna
RNAfold < test.seq
RNAfold -p --MEA < test.seq
Vsearch
Introduction
Vsearch
is a versatile open source tool for metagenomics.
Versions
2.19.0
2.21.1
2.22.1
Commands
vsearch
Module
You can load the modules by:
module load biocontainers
module load vsearch
Example job
Warning
Using #!/bin/sh -l
as the shebang in a Slurm job script causes some biocontainer modules to fail. Please use #!/bin/bash
instead.
To run Vsearch on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vsearch
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vsearch
vsearch -sintax SRR8723605_merged.fasta -db rdp_16s_v16_sp.fa \
-tabbedout SRR8723605_out.txt -strand both -sintax_cutoff 0.5
Weblogo
Introduction
Weblogo
is a web-based application designed to make the generation of sequence logos as easy and painless as possible.
Versions
3.7.8
Commands
weblogo
Module
You can load the modules by:
module load biocontainers
module load weblogo
Example job
Warning
Using #!/bin/sh -l
as the shebang in a Slurm job script causes some biocontainer modules to fail. Please use #!/bin/bash
instead.
To run Weblogo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=weblogo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers weblogo
weblogo --resolution 600 --format PNG \
<seq.fasta >logo.png
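Weblogo builds a logo from a multiple sequence alignment, so every sequence in seq.fasta should have the same length. The sketch below checks this before the job runs; fasta_is_aligned is a hypothetical helper name, and it assumes plain FASTA input in which a record may span several lines.

```shell
# fasta_is_aligned FILE: succeed if every FASTA record in FILE has the
# same sequence length. Hypothetical helper for illustration only.
fasta_is_aligned() {
    awk '
        # A header line closes the previous record and starts a new one.
        /^>/ { if (seen) check(); seen = 1; len = 0; next }
        # Sequence lines: drop whitespace and add to the record length.
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END { if (seen) check(); exit (bad ? 1 : 0) }
        # Compare each record length against the first record.
        function check() {
            if (records++ == 0) expected = len
            else if (len != expected) bad = 1
        }
    ' "$1"
}
```

For example, `fasta_is_aligned seq.fasta || echo "seq.fasta is not an alignment" >&2` could be placed ahead of the weblogo command.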
Whatshap
Introduction
Whatshap is software for phasing genomic variants using DNA sequencing reads, an approach also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but also works well with short reads.
Versions
1.4
Commands
whatshap
Module
You can load the modules by:
module load biocontainers
module load whatshap
Example job
Warning
Using #!/bin/sh -l
as the shebang in a Slurm job script causes some biocontainer modules to fail. Please use #!/bin/bash
instead.
To run whatshap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=whatshap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers whatshap
whatshap phase --indels \
--reference=reference.fasta \
variants.vcf pacbio.bam
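The command above depends on three input files, and a missing one only shows up after the job has started. A minimal pre-flight sketch follows; missing_inputs is a hypothetical helper name. It is an assumption here that the BAM file has an index named pacbio.bam.bai, the name samtools index writes by default.

```shell
# missing_inputs FILE...: report every FILE that is absent or empty and
# return non-zero if any was found. Hypothetical helper for illustration.
missing_inputs() {
    status=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

In a job script this could be called as `missing_inputs reference.fasta variants.vcf pacbio.bam pacbio.bam.bai || exit 1` just before whatshap phase.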
Wiggletools
Introduction
The WiggleTools package allows genome-wide data files to be manipulated as numerical functions, equipped with all the standard functional analysis operators (sum, product, product by a scalar, comparators), and derived statistics (mean, median, variance, stddev, t-test, Wilcoxon’s rank sum test, etc.).
Versions
1.2.11
Commands
wiggletools
Module
You can load the modules by:
module load biocontainers
module load wiggletools
Example job
Warning
Using #!/bin/sh -l
as the shebang in a Slurm job script causes some biocontainer modules to fail. Please use #!/bin/bash
instead.
To run wiggletools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=wiggletools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers wiggletools
wiggletools test/fixedStep.wig
wiggletools test/fixedStep.bw
wiggletools test/bedfile.bg
wiggletools test/overlapping.bed
wiggletools test/bam.bam
wiggletools test/cram.cram
wiggletools test/vcf.vcf
wiggletools test/bcf.bcf
Winnowmap
Introduction
Winnowmap is a long-read mapping algorithm optimized for mapping ONT and PacBio reads to repetitive reference sequences.
Versions
2.03
Commands
winnowmap
Module
You can load the modules by:
module load biocontainers
module load winnowmap
Example job
Warning
Using #!/bin/sh -l
as the shebang in a Slurm job script causes some biocontainer modules to fail. Please use #!/bin/bash
instead.
To run winnowmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=winnowmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers winnowmap
winnowmap -W repetitive_k15.txt \
-ax map-pb Cm.contigs.fasta \
SRR3982487.fastq > output.sam
Wtdbg2
Introduction
Wtdbg2
is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT).
Versions
2.5
Commands
wtdbg-cns
wtdbg2
wtpoa-cns
Module
You can load the modules by:
module load biocontainers
module load wtdbg
Example job
Warning
Using #!/bin/sh -l
as the shebang in a Slurm job script causes some biocontainer modules to fail. Please use #!/bin/bash
instead.
To run Wtdbg2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=wtdbg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers wtdbg
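# Derive the consensus from a contig layout (dbg.ctg.lay.gz) written by an earlier wtdbg2 assembly run
wtpoa-cns -t 24 -i dbg.ctg.lay.gz -fo dbg.ctg.fa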
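In the job script above, the thread count passed to -t repeats the task count requested with #SBATCH -n. One way to keep the two in sync is to read the value Slurm exports as SLURM_NTASKS. This is a sketch under the assumption that the job was submitted with -n; job_threads is a hypothetical helper name.

```shell
# job_threads: print the task count Slurm granted (#SBATCH -n), or 1 when
# SLURM_NTASKS is unset, for example on a login node.
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

threads=$(job_threads)
# Run the consensus step only where the wtdbg module is loaded.
if command -v wtpoa-cns >/dev/null 2>&1; then
    wtpoa-cns -t "$threads" -i dbg.ctg.lay.gz -fo dbg.ctg.fa
else
    echo "wtpoa-cns not found; would run with -t $threads" >&2
fi
```

With this pattern, changing the -n value in the #SBATCH header is enough; the command line does not need to be edited.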