Output

All results are written to the directory given by `--outdir` (default: `results`). The directory structure is:

results/
├── antismash/
│   ├── <sample>/           # one directory per input genome
│   │   ├── <sample>.json   # antiSMASH JSON (parsed by tabulation scripts)
│   │   ├── <sample>.gbk    # annotated GenBank output
│   │   └── ...
│   └── ...
├── gecco/
│   ├── <sample>/           # GECCO output directory per genome
│   │   ├── *.clusters.tsv
│   │   ├── *.features.tsv
│   │   └── *.region*.gbk   # generated when BiG-SCAPE or BiG-SLiCE is enabled
│   └── ...
├── deepbgc/
│   ├── <sample>/           # deepBGC output directory per genome
│   │   ├── *.bgc.tsv
│   │   ├── *.full.gbk
│   │   └── *.region*.gbk   # converted to antiSMASH-like region files
│   └── ...
├── combgc/
│   ├── <sample>/
│   │   ├── combgc_summary.tsv
│   │   └── combined_regions/  # representative non-redundant region GBKs used for clustering
│   └── ...
├── bigscape/               # only when --bigscape_run
│   └── output_files/
│       ├── *.network       # GCF network files (one per class + mix)
│       ├── *.tsv           # cluster annotation tables
│       └── ...
├── bigslice/               # only when --bigslice_run
│   ├── result/             # SQLite database and analysis outputs
│   └── ...
├── summary/
│   ├── all_regions.tsv     # per-BGC-region table
│   └── region_counts.tsv   # per-genome BGC count table
└── pipeline_info/
    ├── execution_timeline_<timestamp>.html
    ├── execution_report_<timestamp>.html
    ├── execution_trace_<timestamp>.txt
    └── pipeline_dag_<timestamp>.html

`summary/all_regions.tsv`

One row per biosynthetic region across all input genomes. Produced by `TABULATE_REGIONS`.

| Column | Description |
| --- | --- |
| `file` | Input genome identifier (stem of the antiSMASH JSON filename) |
| `record_id` | Sequence record name (e.g. NCBI accession) |
| `region` | Region number within the record |
| `start` | Start coordinate (bp, 0-based) |
| `end` | End coordinate (bp) |
| `contig_edge` | `True` if the region extends to the edge of the contig (potentially truncated) |
| `product` | BGC product class(es), separated by `/` for hybrids |
| `knownclusterblast_hit` | Top KnownClusterBlast hit name (only present when `--antismash_cb_knownclusters`) |
| `knownclusterblast_accession` | MIBiG accession of the top hit |
| `knownclusterblast_similarity` | Similarity category: `high` (>75 %), `medium` (>50 %), or `low` (>15 %) |
| `record_desc` | Full sequence record description |

!!! note
    The three `knownclusterblast_*` columns are only included when `--antismash_cb_knownclusters` is set.
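
The column layout above can be read with Python's standard `csv` module. The following is a minimal sketch, not part of the pipeline: the sample rows are invented, the `load_regions` helper is hypothetical, and the length calculation assumes `end` is an exclusive coordinate, which the table does not state.

```python
import csv
import io

# Tiny stand-in for summary/all_regions.tsv; the rows are invented for illustration.
# The knownclusterblast_* columns are left out, as when the flag is not set.
SAMPLE = (
    "file\trecord_id\tregion\tstart\tend\tcontig_edge\tproduct\trecord_desc\n"
    "genomeA\tctg1\t1\t0\t21000\tFalse\tNRPS/T1PKS\texample record 1\n"
    "genomeA\tctg2\t1\t500\t9800\tTrue\tterpene\texample record 2\n"
)


def load_regions(handle):
    """Parse the table into dicts with typed coordinates, edge flag, and product list."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["start"] = int(row["start"])
        row["end"] = int(row["end"])
        row["contig_edge"] = row["contig_edge"] == "True"
        # Hybrid regions list several product classes separated by "/".
        row["products"] = row["product"].split("/")
        rows.append(row)
    return rows


regions = load_regions(io.StringIO(SAMPLE))

# Keep only regions that do not touch a contig edge (less likely to be truncated).
complete = [r for r in regions if not r["contig_edge"]]
for r in complete:
    # Assumption: `end` is exclusive, so the length is end - start.
    print(r["file"], r["record_id"], r["region"], r["products"], r["end"] - r["start"])
```

To use it on real output, pass an open file handle for `results/summary/all_regions.tsv` instead of the `io.StringIO` sample.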

`summary/region_counts.tsv`

One row per genome assembly (or per contig when `--count_per_contig` is set). Produced by `COUNT_REGIONS`.

| Column | Description |
| --- | --- |
| `record` | Genome assembly name (or `genome\|contig` when `--count_per_contig`) |
| `total_count` | Total number of BGC regions regardless of class |
| `<bgc_type>` | Count of regions of that product type (one column per distinct type observed across all genomes) |
| `hybrid` | Count of multi-product regions (absent when `--split_hybrids`) |
| `description` | Sequence description; includes `[N total records]` suffix for multi-contig assemblies |
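
Because the set of `<bgc_type>` columns depends on the data, a reader has to discover them at run time. The sketch below shows one way to do that with the standard library: it treats every column other than `record`, `total_count`, and `description` as a count column (including `hybrid` when present). The sample rows and the `load_counts` helper are invented for illustration.

```python
import csv
import io

# Tiny stand-in for summary/region_counts.tsv; the rows are invented for illustration.
SAMPLE = (
    "record\ttotal_count\tNRPS\tterpene\thybrid\tdescription\n"
    "genomeA\t3\t1\t1\t1\texample assembly [2 total records]\n"
    "genomeB\t1\t0\t1\t0\texample assembly\n"
)

# Columns with a fixed meaning; everything else is treated as a per-type count.
FIXED = {"record", "total_count", "description"}


def load_counts(handle):
    """Return the per-type column names and a {record: {type: count}} mapping."""
    reader = csv.DictReader(handle, delimiter="\t")
    type_cols = [c for c in reader.fieldnames if c not in FIXED]
    table = {}
    for row in reader:
        table[row["record"]] = {c: int(row[c]) for c in type_cols}
    return type_cols, table


type_cols, table = load_counts(io.StringIO(SAMPLE))

# Sum each type column across all records.
totals = {c: sum(counts[c] for counts in table.values()) for c in type_cols}
print(totals)
```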

`antismash/<sample>/`

Raw antiSMASH output for each genome. The key file is `<sample>.json`, which is read by the tabulation scripts. Other files (HTML, GenBank, SVG plots) are present depending on the antiSMASH flags used.
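
As a rough illustration of what reading `<sample>.json` involves, the sketch below counts region features per record. It is not the pipeline's tabulation code. It assumes the JSON has a top-level `records` list whose entries carry an `id` and a `features` list, with BGC regions marked by `"type": "region"`; check this against your antiSMASH version. The sample document and the `count_regions` helper are invented.

```python
import json

# Invented, heavily trimmed stand-in for an antiSMASH JSON file.
# Assumption: records -> features -> type == "region" marks a BGC region.
SAMPLE_JSON = json.dumps(
    {
        "records": [
            {
                "id": "ctg1",
                "features": [
                    {"type": "CDS", "qualifiers": {}},
                    {"type": "region", "qualifiers": {"product": ["NRPS"]}},
                ],
            },
            {"id": "ctg2", "features": [{"type": "CDS", "qualifiers": {}}]},
        ]
    }
)


def count_regions(doc):
    """Return {record id: number of features whose type is 'region'}."""
    counts = {}
    for record in doc.get("records", []):
        features = record.get("features", [])
        counts[record["id"]] = sum(1 for f in features if f.get("type") == "region")
    return counts


doc = json.loads(SAMPLE_JSON)
print(count_regions(doc))
```

For a real file, replace `json.loads(SAMPLE_JSON)` with `json.load(handle)` on an opened `<sample>.json`.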

`gecco/<sample>/`

Raw GECCO output for each genome. When `--bigscape_run` or `--bigslice_run` is enabled, clystere also runs `gecco convert gbk --format bigslice` and publishes GECCO region GenBank files compatible with clustering tools.

`deepbgc/<sample>/`

Raw DeepBGC output for each genome (`*.bgc.tsv` and `*.full.gbk`). clystere also generates antiSMASH-like `*.regionNNN.gbk` files using BiG-SLiCE's conversion script to ensure compatibility with BiG-SCAPE and BiG-SLiCE.

`combgc/<sample>/`

Unified per-sample BGC selection generated from antiSMASH, GECCO, and DeepBGC predictions. The `combined_regions/` directory contains the representative, non-redundant region GenBank files used as clustering input.

`bigscape/output_files/`

Standard BiG-SCAPE output. The `.network` files are tab-separated edge lists suitable for import into Cytoscape or the Python `networkx` library. One network is generated per BGC class, plus a mix network when `--bigscape_mix` is enabled (the default).
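
A tab-separated edge list can also be explored without extra dependencies. The sketch below groups BGCs into connected components using only the standard library. It assumes the file has a header row and that the first two columns name the two BGCs of each edge; the sample header and rows are invented. Connected components are only a coarse grouping and are not necessarily the same as BiG-SCAPE's own family assignments.

```python
import csv
import io
from collections import defaultdict

# Invented stand-in for a .network file: header row, then one edge per line.
SAMPLE = (
    "Clustername 1\tClustername 2\tRaw distance\n"
    "bgcA\tbgcB\t0.12\n"
    "bgcB\tbgcC\t0.25\n"
    "bgcD\tbgcE\t0.05\n"
)


def read_edges(handle):
    """Return (node, node) pairs from the first two columns, skipping the header."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # assumption: the first row is a header
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def connected_components(edges):
    """Group nodes into connected components with an iterative depth-first search."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for node in sorted(adjacency):
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        components.append(sorted(component))
    return components


edges = read_edges(io.StringIO(SAMPLE))
print(connected_components(edges))
```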

`bigslice/`

Standard BiG-SLiCE output generated from the unified `combgc/*/combined_regions/` folders. The directory includes the processed analysis results and SQLite-backed data used for downstream inspection.
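
Since the results are SQLite-backed, a first inspection step is simply listing the tables a database contains. The sketch below does this with Python's built-in `sqlite3` module and makes no assumption about the BiG-SLiCE schema; the in-memory database and its table names are invented so that the example runs anywhere.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in an SQLite database, sorted alphabetically."""
    cursor = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in cursor]


# Stand-in database so the sketch is runnable; these table names are invented.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE example_a (id INTEGER PRIMARY KEY)")
demo.execute("CREATE TABLE example_b (id INTEGER PRIMARY KEY)")
print(list_tables(demo))
```

For real output, open whichever database file is present under `bigslice/result/` in read-only mode, for example with `sqlite3.connect("file:<path>?mode=ro", uri=True)`, so that inspection cannot modify the results.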

`pipeline_info/`

Nextflow execution metadata generated per run:

| File | Description |
| --- | --- |
| `execution_timeline_*.html` | Gantt chart of process execution times |
| `execution_report_*.html` | Resource usage report (CPU, memory, I/O per process) |
| `execution_trace_*.txt` | Raw per-task resource trace (parseable TSV) |
| `pipeline_dag_*.html` | Directed acyclic graph of the pipeline |
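
Because the trace file is a TSV, it can be summarized with a few lines of Python. The sketch below counts tasks per status and lists failed task names. It assumes the trace contains `name` and `status` columns, as in Nextflow's default trace fields; the sample rows, including `EXAMPLE_TASK`, are invented, and the `status_summary` helper is hypothetical.

```python
import csv
import io
from collections import Counter

# Invented stand-in for execution_trace_<timestamp>.txt with a reduced column set.
SAMPLE = (
    "task_id\tname\tstatus\texit\n"
    "1\tTABULATE_REGIONS\tCOMPLETED\t0\n"
    "2\tCOUNT_REGIONS\tCOMPLETED\t0\n"
    "3\tEXAMPLE_TASK\tFAILED\t1\n"
)


def status_summary(handle):
    """Return a Counter of task statuses and the names of tasks marked FAILED."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    counts = Counter(row["status"] for row in rows)
    failed = [row["name"] for row in rows if row["status"] == "FAILED"]
    return counts, failed


counts, failed = status_summary(io.StringIO(SAMPLE))
print(dict(counts), failed)
```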
