Usage¶
Samplesheet¶
advena takes a CSV samplesheet as input, with one genome per row:
| Column | Description |
|---|---|
| sample | Unique sample identifier used in all output filenames |
| genbank | Path to a GenBank file (.gb or .gbff) |
An example samplesheet is provided in assets/samplesheet.csv.
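As a minimal sketch, a samplesheet with two genomes could be created from the shell as follows. The header row is assumed to match the column names in the table above, and the sample names and file paths are hypothetical placeholders.

```shell
# Write a two-row samplesheet; sample names and GenBank paths are hypothetical.
# Assumption: the header row uses the column names "sample" and "genbank".
cat > samplesheet.csv <<'EOF'
sample,genbank
strain_A,/data/genomes/strain_A.gbff
strain_B,/data/genomes/strain_B.gb
EOF

# Sanity check: sample identifiers must be unique, so report any duplicates.
duplicates=$(tail -n +2 samplesheet.csv | cut -d, -f1 | sort | uniq -d)
if [ -n "$duplicates" ]; then
    echo "Duplicate sample identifiers: $duplicates" >&2
fi
```

Checking for duplicate identifiers up front is worthwhile because the sample column is used in all output filenames.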
Basic usage¶
Container engine profiles¶
Combine a container engine profile with an optional execution profile:
# Docker (local)
-profile docker
# Singularity on an HPC cluster
-profile singularity,hpc
# SLURM cluster with Singularity
-profile singularity,slurm
# Conda (local)
-profile conda
# Test dataset with Docker
-profile test,docker
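To show where these fragments fit, the sketch below assembles a complete command line from a container engine and an optional execution profile, then prints it instead of running it. The `build_cmd` helper is illustrative and not part of advena; the remaining options are copied from the examples on this page.

```shell
# build_cmd ENGINE [EXECUTION_PROFILE]
# Prints the nextflow command for the given profile combination (dry run only).
# build_cmd is a hypothetical helper, not something advena provides.
build_cmd() {
    engine=$1
    execution=${2:-}
    profile=$engine
    # Profiles are joined with a comma and no spaces, e.g. singularity,slurm
    if [ -n "$execution" ]; then
        profile="$engine,$execution"
    fi
    echo "nextflow run exterex/advena --input samplesheet.csv --outdir results -profile $profile"
}

build_cmd docker
build_cmd singularity slurm
```

Printing the command first makes it easy to confirm the profile string before submitting anything to a cluster.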
Specifying the ICEscreen installation¶
advena reads the ICEscreen Python scripts and databases at runtime. There are three ways to provide the paths:
- Environment variable (recommended): Set ICESCREEN_ROOT before running the pipeline.
export ICESCREEN_ROOT=/opt/icescreen
nextflow run exterex/advena --input samplesheet.csv --outdir results -profile docker
- Pipeline parameter: Pass --icescreen_root directly.
nextflow run exterex/advena \
--input samplesheet.csv \
--outdir results \
--icescreen_root /opt/icescreen \
-profile docker
- Database override: Pass --icescreen_db to set the database path explicitly.
nextflow run exterex/advena \
--input samplesheet.csv \
--outdir results \
--icescreen_db /data/icescreen_databases \
-profile docker
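Since a wrong path may only surface once tasks have started, it can help to check the directory before launching. The following is a minimal sketch of such a pre-flight check. `check_icescreen_root` is a hypothetical helper, and it only verifies that the path exists and is a directory, because the layout advena expects inside the installation is not described here.

```shell
# check_icescreen_root DIR
# Returns 0 if DIR is a non-empty string naming an existing directory, 1 otherwise.
# Hypothetical helper; it does not inspect the contents of the installation.
check_icescreen_root() {
    dir=${1:-}
    if [ -z "$dir" ]; then
        echo "ICEscreen root is not set" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "ICEscreen root is not a directory: $dir" >&2
        return 1
    fi
    return 0
}

# Example: validate the environment variable before launching the pipeline.
if check_icescreen_root "${ICESCREEN_ROOT:-}"; then
    echo "Using ICEscreen installation at $ICESCREEN_ROOT"
fi
```

The same function can be pointed at the value intended for --icescreen_root or --icescreen_db.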
Resuming a run¶
Nextflow caches completed tasks. If a run is interrupted, resume from where it left off:
nextflow run exterex/advena \
--input samplesheet.csv \
--outdir results \
-profile docker \
-resume
Adjusting BLAST parameters¶
The BLASTP search sensitivity can be tuned with --blastp_evalue and --blastp_max_target_seqs:
nextflow run exterex/advena \
--input samplesheet.csv \
--outdir results \
--blastp_evalue 1e-5 \
--blastp_max_target_seqs 5 \
-profile docker
Adjusting ME detection parameters¶
Mobile element segmentation thresholds can be adjusted for non-standard genomes: