clystere
clystere is a Nextflow pipeline for automated biosynthetic gene cluster (BGC) discovery and comparative analysis. It runs antiSMASH, GECCO, and deepBGC across a collection of bacterial or fungal genomes, merges overlapping predictions with comBGC, and optionally groups representative BGCs into gene cluster families (GCFs) with BiG-SCAPE or BiG-SLiCE.
Overview
graph LR
A[Genome assemblies<br/>samplesheet.csv] --> B[ANTISMASH<br/>per-genome]
A --> C[GECCO<br/>per-genome]
A --> D[deepBGC<br/>per-genome]
B --> E[TABULATE_REGIONS<br/>all_regions.tsv]
B --> F[COUNT_REGIONS<br/>region_counts.tsv]
B --> G[comBGC unification]
C --> G
D --> G
G --> H[BIGSCAPE dereplicate + cluster<br/>optional]
G --> I[BIGSLICE cluster<br/>optional]
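The unification step in the diagram can be pictured as an interval merge: predictions from different tools that overlap on the same contig collapse into one region that remembers which tools supported it. The sketch below is a generic illustration of that idea, not comBGC's actual implementation; the `Region` class, the `merge_overlapping` function, and the example coordinates are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Region:
    """A predicted BGC region (hypothetical structure, for illustration only)."""
    contig: str
    start: int
    end: int
    tools: set[str] = field(default_factory=set)


def merge_overlapping(regions: list[Region]) -> list[Region]:
    """Merge regions on the same contig whose coordinate ranges overlap."""
    merged: list[Region] = []
    # Sorting by contig and start lets us compare each region only with the last merged one.
    for r in sorted(regions, key=lambda x: (x.contig, x.start, x.end)):
        last = merged[-1] if merged else None
        if last is not None and last.contig == r.contig and r.start <= last.end:
            # Overlap: extend the existing region and record the supporting tool(s).
            last.end = max(last.end, r.end)
            last.tools |= r.tools
        else:
            merged.append(Region(r.contig, r.start, r.end, set(r.tools)))
    return merged


# Made-up predictions from the three annotation tools.
predictions = [
    Region("contig_1", 100, 500, {"antiSMASH"}),
    Region("contig_1", 450, 900, {"GECCO"}),
    Region("contig_1", 2000, 2600, {"deepBGC"}),
    Region("contig_2", 100, 500, {"GECCO"}),
]

unified = merge_overlapping(predictions)
for region in unified:
    print(region.contig, region.start, region.end, sorted(region.tools))
```

In this toy input the first two predictions on `contig_1` overlap and become a single region supported by two tools, while the other two stay separate because they either do not overlap or sit on a different contig.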
Features
- Parallel antiSMASH, GECCO, and deepBGC annotation of any number of genome assemblies or GenBank files
- comBGC-based unification of overlapping predictions before clustering
- Per-region tabulation and per-genome BGC count summary
- Optional BiG-SCAPE or BiG-SLiCE clustering (mutually exclusive)
- Optional BiG-SCAPE dereplication of redundant regions before clustering
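As a rough picture of how a per-genome count summary relates to a per-region table, the sketch below counts rows per genome in a small tab-separated table. The column names (`genome`, `region`, `product`) and the sample rows are assumptions made for illustration; the real layout of `all_regions.tsv` and `region_counts.tsv` may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical per-region table; the real column names may differ.
ALL_REGIONS_TSV = (
    "genome\tregion\tproduct\n"
    "genome_A\tregion_1\tNRPS\n"
    "genome_A\tregion_2\tterpene\n"
    "genome_B\tregion_1\tT1PKS\n"
)


def count_regions(handle, genome_column="genome"):
    """Count rows (one per BGC region) for each genome in a TSV file handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row[genome_column] for row in reader)


counts = count_regions(io.StringIO(ALL_REGIONS_TSV))
for genome, n_regions in sorted(counts.items()):
    print(f"{genome}\t{n_regions}")
```

A genome with no predicted regions has no rows in such a table, so it would be absent from the counts unless it is added explicitly from the samplesheet.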
Quick start
See Installation and Usage for full details.