Usage

Quickstart

gunc run -i genome.fa -r /path/to/db

This runs GUNC on genome.fa using the database at /path/to/db and writes the output to the current working directory.

Main Commands

  • gunc run The main functionality of GUNC: runs chimerism detection.

  • gunc plot Produces an interactive plot from the output of gunc run.

  • gunc merge_checkm Produces a merged file combining the outputs of GUNC and CheckM.

  • gunc download_db Downloads the GUNC database (required for gunc run).

Any of the above commands can be run with -h to get command-specific help.


GUNC accepts either a proGenomes-based or a GTDB-based reference database via the --db_file option. Both can be downloaded using the gunc download_db command (see below). Note that using GTDB leads to higher resource requirements and longer run times; in accuracy benchmarks, GTDB and the default proGenomes-derived GUNC database performed very similarly.

GUNC RUN

Run chimerism detection.

Required Flags

  • --db_file Path to the GUNC database file. Can be set as environment variable GUNC_DB.
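
As a minimal sketch of the environment-variable route, the helper below runs GUNC on one FASTA file without passing --db_file. The function name gunc_run_with_env is hypothetical and not part of GUNC; the sketch only relies on the GUNC_DB variable and the --input_fasta flag described in this section.

```shell
# Hypothetical helper (not part of GUNC): run chimerism detection on one
# FASTA file, taking the database path from GUNC_DB instead of --db_file.
gunc_run_with_env() {
    fasta=$1
    # Stop with a clear message if the variable is unset or empty.
    if [ -z "${GUNC_DB:-}" ]; then
        echo "GUNC_DB is not set; export it or pass --db_file" >&2
        return 1
    fi
    # Make sure the variable is visible to the gunc process.
    export GUNC_DB
    gunc run --input_fasta "$fasta"
}
```

With GUNC_DB set to /path/to/db, calling gunc_run_with_env genome.fa should behave like the quickstart command above.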

One of --input_dir, --input_file or --input_fasta is required. If --gene_calls is not set, gene calling is done using Prodigal with the option "-p meta".

  • --input_dir Input dir with files in FASTA format.

  • --input_file Input file containing paths to FASTA format files.

  • --file_suffix Suffix of the input files. Only needed if it is not the default, .fa.

  • --input_fasta Input file in FASTA format.
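
To illustrate the --input_file route, here is a small sketch that collects genome paths into a list file and passes it to GUNC. Both function names are hypothetical, and writing one path per line is an assumption about the list format rather than something this page states. The sketch also assumes GUNC_DB is already set; otherwise add --db_file to the gunc call.

```shell
# Hypothetical helper (not part of GUNC): write the paths of all files in a
# directory that end in the given suffix to a list file.
# Assumption: the list holds one path per line.
make_fasta_list() {
    dir=$1
    suffix=$2
    list=$3
    find "$dir" -type f -name "*$suffix" | sort > "$list"
}

# Hypothetical helper: run GUNC once over every genome named in the list.
gunc_run_list() {
    gunc run --input_file "$1"
}
```

For example, make_fasta_list genomes .fna genomes.txt followed by gunc_run_list genomes.txt. When all genomes sit in one directory, gunc run --input_dir genomes --file_suffix .fna avoids the intermediate list.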

Optional Flags

  • --gene_calls Input is gene calls in FASTA amino-acid (faa) format.

  • --use_species_level Allow species level to be picked as maxCSS. Default: False

  • --min_mapped_genes Don't calculate the GUNC score if the number of mapped genes is below this value. Default: 11

  • --threads Number of CPU threads.

  • --temp_dir Directory to store temporary files. Default: Current working directory.

  • --out_dir Directory in which to put output. Default: Current working directory.

  • --sensitive Run with high sensitivity. (Uses a different cutoff to determine an abundant lineage)

  • --detailed_output Output scores for every tax_level.

  • --contig_taxonomy_output Output taxonomic assignment for each contig.
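
Several of these flags are commonly combined. The sketch below wraps one such combination in a hypothetical helper, gunc_run_detailed, which is not part of GUNC. The thread count of 4 is just this sketch's fallback, creating the output directory beforehand is an assumption about what gunc run expects, and GUNC_DB is assumed to be set.

```shell
# Hypothetical helper (not part of GUNC): run GUNC on one genome and write
# per-level scores and per-contig taxonomy to a dedicated directory.
gunc_run_detailed() {
    fasta=$1
    out_dir=$2
    threads=${3:-4}   # fallback chosen for this sketch only
    # Assumption: the output directory should exist before gunc is called.
    mkdir -p "$out_dir"
    gunc run \
        --input_fasta "$fasta" \
        --out_dir "$out_dir" \
        --threads "$threads" \
        --detailed_output \
        --contig_taxonomy_output
}
```

For example, gunc_run_detailed genome.fa results 8 keeps all output for genome.fa under results and uses 8 threads.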


GUNC PLOT

Create interactive plot to visualise chimerism.

Required Flags

  • --diamond_file GUNC diamond output file. (One of the files in the diamond_output directory produced by gunc run)

Optional Flags

  • --gunc_gene_count_file GUNC gene_counts.json file. (Not needed if the --diamond_file is in the file structure made by gunc run)

  • --out_dir Output directory. Default: Current working directory.

  • --tax_levels Tax levels to display (comma-separated). (default: kingdom,phylum,family,genus,contig)

  • --remove_minor_clade_level Tax level at which to remove minor clades. (default: kingdom)

  • --contig_display_num Number of contigs to visualise. (default: 1000, 0 plots all contigs)

  • --contig_display_list Comma-separated list of contig names to plot.
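
As an illustration of the plotting flags, the sketch below restricts the plot to a subset of the default tax levels. The helper name gunc_plot_levels is hypothetical, the diamond file path is whatever gunc run produced for the genome of interest, and creating the output directory first is an assumption.

```shell
# Hypothetical helper (not part of GUNC): plot one diamond output file,
# showing only the requested tax levels.
gunc_plot_levels() {
    diamond_file=$1
    out_dir=$2
    levels=${3:-phylum,genus,contig}   # a subset of the documented default
    # Assumption: the output directory should exist before gunc is called.
    mkdir -p "$out_dir"
    gunc plot \
        --diamond_file "$diamond_file" \
        --tax_levels "$levels" \
        --out_dir "$out_dir"
}
```

If the diamond file has been moved out of the directory structure created by gunc run, --gunc_gene_count_file would need to be added to the call.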


GUNC MERGE_CHECKM

Merge the outputs of GUNC and CheckM. Both should have been run on the same input files. checkm qa should be run with the -f qa.tsv -o 2 --tab_table parameters. If it is run without -o 2, the extra columns will be empty.

Required Flags

  • --gunc_file Path of gunc_scores.tsv file.

  • --checkm_file CheckM output (qa.tsv) file (run checkm qa with -o 2 --tab_table parameters).

Optional Flags

  • --out_dir Output directory. Default: Current working directory.
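
The two steps can be chained as in the sketch below. The helper name gunc_merge_with_checkm is hypothetical. The two positional arguments passed to checkm qa (marker file and analysis directory) follow CheckM's own usage and are not described on this page, so treat them as an assumption about your CheckM setup.

```shell
# Hypothetical helper (not part of GUNC): export the CheckM table in the
# format GUNC expects, then merge it with the GUNC scores.
gunc_merge_with_checkm() {
    marker_file=$1   # CheckM marker file (assumption: from an earlier CheckM run)
    checkm_dir=$2    # CheckM analysis directory (same assumption)
    gunc_file=$3     # GUNC scores file written by gunc run
    out_dir=$4
    mkdir -p "$out_dir"
    # -o 2 adds the extended columns; --tab_table makes the output tab-separated.
    checkm qa -o 2 --tab_table -f "$out_dir/qa.tsv" "$marker_file" "$checkm_dir" &&
    gunc merge_checkm \
        --gunc_file "$gunc_file" \
        --checkm_file "$out_dir/qa.tsv" \
        --out_dir "$out_dir"
}
```

The && ensures the merge only runs if the CheckM export succeeded.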


GUNC DOWNLOAD_DB

Download the GUNC database (required to run gunc run).

Required Flags

  • positional argument Directory into which the database is downloaded.

Optional Flags

  • --db Which db to download (progenomes or gtdb). Default: progenomes
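
A short sketch of downloading either database, using only the positional argument and the --db flag listed above. The helper name gunc_fetch_db is hypothetical, and creating the target directory first is an assumption about what the command expects.

```shell
# Hypothetical helper (not part of GUNC): download the chosen reference
# database into a directory, rejecting names this page does not list.
gunc_fetch_db() {
    dest=$1
    db=${2:-progenomes}   # matches the documented default
    case $db in
        progenomes|gtdb) ;;
        *)
            echo "unknown database: $db (expected progenomes or gtdb)" >&2
            return 1
            ;;
    esac
    # Assumption: the target directory should exist before downloading.
    mkdir -p "$dest"
    gunc download_db "$dest" --db "$db"
}
```

After the download, point --db_file (or GUNC_DB) at the database file inside that directory.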


Special Flags

  • --version Print version number and exit.

  • --help Print help message and exit.