taxprofiler: Parameters

Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples and libraries/runs.

required

type: string

pattern: ^\S+\.csv$

Path to comma-separated file containing information about databases and profiling parameters for each taxonomic profiler

required

type: string

pattern: ^\S+\.csv$
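
The pattern shared by the two CSV path parameters above can be checked before launching a run. The following is a minimal sketch, assuming a hypothetical helper named `is_valid_csv_path`; only the regular expression itself comes from the parameter entries.

```python
import re

# Pattern copied from the two CSV path entries above.
CSV_PATH_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_csv_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in '.csv'.

    Hypothetical helper for illustration; it is not part of the pipeline.
    """
    return CSV_PATH_PATTERN.search(path) is not None


if __name__ == "__main__":
    for candidate in ("samplesheet.csv", "my samples.csv", "samplesheet.tsv"):
        print(candidate, is_valid_csv_path(candidate))
```

Note that the pattern rejects any path containing whitespace, so a file stored under a directory name with spaces would need to be moved or renamed.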

Specify to save decompressed user-supplied TAR archives of databases

type: boolean

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
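
The email pattern above is fairly strict, so it may help to test an address against it first. This is a rough sketch, assuming a hypothetical helper named `is_accepted_email`; the regular expression is copied from the entry above.

```python
import re

# Pattern copied from the email parameter entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_accepted_email(address: str) -> bool:
    """Return True if the address matches the schema pattern.

    Hypothetical helper for illustration; it is not part of the pipeline.
    """
    return EMAIL_PATTERN.search(address) is not None


if __name__ == "__main__":
    for candidate in (
        "user@example.com",
        "first.last+tag@example.org",  # '+' is outside the allowed characters
        "user@example.technology",     # final label is longer than 5 letters
    ):
        print(candidate, is_accepted_email(candidate))
```

Two consequences of the pattern are worth knowing: addresses with a `+` in the local part are rejected, and so are domains whose final label has more than five letters.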

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Common options across both long and short read preprocessing QC steps

Specify to skip sequencing quality control of raw sequencing reads

type: boolean

Specify the tool used for quality control of raw sequencing reads

type: string

Save reads from samples that went through the adapter clipping, pair-merging, and length filtering steps for both short and long reads

type: boolean

Save only the final reads from all read processing steps (that are sent to classification/profiling) in results directory.

type: boolean

Options for adapter clipping, quality trimming, pair-merging, and complexity filtering

Turns on short read quality control steps (adapter clipping, complexity filtering etc.)

type: boolean

Specify which tool to use for short-read QC

type: string

Skip adapter trimming

type: boolean

Specify adapter 1 nucleotide sequence

type: string

Specify adapter 2 nucleotide sequence

type: string

Specify a list of all possible adapters to trim. Overrides --shortread_qc_adapter1/2. Formats: .txt (AdapterRemoval) or .fasta (fastp).

type: string

Turn on merging of read pairs for paired-end data

type: boolean

Include unmerged reads from paired-end merging in the downstream analysis

type: boolean

Specify the minimum length of reads to be retained

type: integer

default: 15

Perform deduplication of the input reads (fastp only)

type: boolean

Turns on nucleotide sequence complexity filtering

type: boolean

Specify which tool to use for complexity filtering

type: string

Specify the minimum sequence entropy level for complexity filtering

type: number

default: 0.3

Specify the window size for BBDuk complexity filtering

type: integer

default: 50

Turn on masking rather than discarding of low complexity reads for BBDuk

type: boolean

Specify the minimum complexity filter threshold of fastp

type: integer

default: 30

Specify the complexity filter mode for PRINSEQ++

type: string

Specify the minimum dust score for PRINSEQ++ complexity filtering

type: number

default: 0.5

Save reads from samples that went through the complexity filtering step

type: boolean

Options for adapter clipping, quality trimming, and length filtering

Turns on long read quality control steps (adapter clipping, length filtering etc.)

type: boolean

Specify which tool to use for adapter trimming.

type: string

Skip long-read trimming

type: boolean

Specify which tool to use for long-read filtering

type: string

Skip long-read length and quality filtering

type: boolean

Specify the minimum length of reads to be retained

type: integer

default: 1000

Specify the percent of high-quality bases to be retained

type: integer

default: 90

Filtlong only: specify the number of high-quality bases in the library to be retained

type: integer

default: 500000000

Nanoq only: specify the minimum average read quality filter (Q)

type: integer

default: 7

Estimate metagenome sequencing complexity coverage

Turn on short-read metagenome sequencing redundancy estimation with Nonpareil. Warning: only use for shallow short-read sequencing datasets.

type: boolean

Specify mode for identifying redundant reads

type: string

Options for pre-profiling host read removal

Turn on short-read host removal

type: boolean

Turn on long-read host removal

type: boolean

Specify path to a single reference FASTA of the host genome(s)

type: string

Specify path to the directory containing pre-made Bowtie2 indexes of the host removal reference

type: string

Specify path to a pre-made Minimap2 index file (.mmi) of the host removal reference

type: string

Save mapping index of input reference when not already supplied by user

type: boolean

Save mapped and unmapped reads in BAM format from host removal

type: boolean

Save reads from samples that went through the host-removal step

type: boolean

Options for per-sample run-merging

Turn on run merging

type: boolean

Save reads from samples that went through the run-merging step

type: boolean

Turn on profiling with Centrifuge. Requires database to be present in the CSV file passed to --databases

type: boolean

Turn on saving of Centrifuge-aligned reads

type: boolean

Turn on profiling with DIAMOND. For unmerged paired-read libraries, only read1 will be used. Requires database to be present in the CSV file passed to --databases

type: boolean

Specify output format from DIAMOND profiling.

type: string

Turn on saving of DIAMOND-aligned reads. Will override --diamond_output_format and no taxon tables will be generated

type: boolean

Turn on profiling with Kaiju. Requires database to be present in the CSV file passed to --databases

type: boolean

Turn on expanding of virus hits to individual viruses rather than aggregating at a taxonomic level.

type: boolean

Specify taxonomic rank to be displayed in Kaiju taxon table

type: string

Turn on profiling with Kraken2. Requires database to be present in the CSV file passed to --databases

type: boolean

Turn on saving of Kraken2-aligned reads

type: boolean

Turn on saving of Kraken2 per-read taxonomic assignment file

type: boolean

Turn on saving of minimizer information in the Kraken2 report, which increases it to an eight-column layout.

type: boolean

Turn on profiling with KrakenUniq. Requires one or more KrakenUniq databases to be present in the CSV file passed to --databases.

type: boolean

Turn on saving of KrakenUniq (un-)classified reads as FASTA.

type: boolean

Specify a RAM chunk size to use for all KrakenUniq databases when loading them into memory in chunks. For per-database values, specify them in --databases.

type: string

Turn on saving of KrakenUniq per-read taxonomic assignment file.

type: boolean

Specify the number of samples for each KrakenUniq run.

type: integer

default: 20

Turn on Bracken (and the required Kraken2 prerequisite step).

type: boolean

Turn on saving of the intermediate Kraken2 files used as input to Bracken into the Kraken2 results folder

type: boolean

Turn on profiling with MALT. Requires database to be present in the CSV file passed to --databases

type: boolean

Specify which MALT alignment mode to use

type: string

Turn on saving of MALT-aligned reads

type: boolean

Turn on generation of MEGAN summary file from MALT results

type: boolean

Turn on profiling with MetaPhlAn. Requires database to be present in the CSV file passed to --databases

type: boolean

Turn on saving of MetaPhlAn reads aligned against marker genes in SAM format

type: boolean

Turn on profiling with mOTUs. Requires database to be present in the CSV file passed to --databases

type: boolean

Turn on printing relative abundance instead of counts.

type: boolean

Turn on saving of the mgc read counts.

type: boolean

Turn on removing NCBI taxonomic IDs.

type: boolean

Turn on classification with KMCP.

type: boolean

Turn on saving the output of KMCP search

type: boolean

Turn on profiling with ganon. Requires database to be present in the CSV file passed to --databases.

type: boolean

Turn on saving of ganon per-read taxonomic assignment file(s).

type: boolean

Specify the type of ganon report to save.

type: string

Specify the taxonomic rank the ganon report file should display.

type: string

default: default

Specify a percentile within which hits will be reported in ganon report output.

type: integer

Specify a minimum number of reads a hit must have to be retained in the ganon report.

type: integer

Specify a maximum number of reads a hit may have to be retained in the ganon report.

type: integer

Turn on standardisation of taxon tables across profilers

type: boolean

Turn on generation of BIOM output (currently only applies to mOTUs)

type: boolean

Turn on generation of Krona plots for supported profilers

type: boolean

Specify path to krona taxonomy directories (required for MALT krona plots)

type: string

The desired output format.

type: string

The path to a directory containing taxdump files.

type: string

Add the taxon name to the output. Requires --taxpasta_taxonomy_dir.

type: boolean

Add the taxon rank to the output. Requires --taxpasta_taxonomy_dir.

type: boolean

Add the taxon’s entire name lineage to the output. Requires --taxpasta_taxonomy_dir.

type: boolean

Add the taxon’s entire ID lineage to the output. Requires --taxpasta_taxonomy_dir.

type: boolean

Add the taxon’s entire rank lineage to the output. Requires --taxpasta_taxonomy_dir.

type: boolean
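
The five options above all depend on a taxonomy directory being supplied. As a rough sketch of that dependency, the check below reports which enabled options are missing it. Only `taxpasta_taxonomy_dir` is named in the descriptions above; the other option names and the helper `missing_taxonomy_dir` are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Assumed option names for the five "Add the taxon ..." switches above.
# Only taxpasta_taxonomy_dir appears in the descriptions themselves.
TAXONOMY_DEPENDENT_OPTIONS = (
    "taxpasta_add_name",
    "taxpasta_add_rank",
    "taxpasta_add_lineage",
    "taxpasta_add_idlineage",
    "taxpasta_add_ranklineage",
)


def missing_taxonomy_dir(params: Dict[str, Any]) -> List[str]:
    """Return the enabled options that need a taxonomy directory when none is set."""
    if params.get("taxpasta_taxonomy_dir"):
        return []
    return [name for name in TAXONOMY_DEPENDENT_OPTIONS if params.get(name)]


if __name__ == "__main__":
    params = {"taxpasta_add_name": True, "taxpasta_add_rank": True}
    print(missing_taxonomy_dir(params))
```

An empty list means the combination of options is consistent; any names returned are options that were switched on without the directory they require.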

Ignore individual profiles that cause errors.

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
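
The size pattern above accepts several spellings, including the default `25.MB`. The sketch below, which assumes a hypothetical helper named `is_valid_size`, shows which forms pass; the regular expression is copied from the entry above.

```python
import re

# Pattern copied from the file size limit entry above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value: str) -> bool:
    """Return True if the value matches the schema's size pattern.

    Hypothetical helper for illustration; it is not part of the pipeline.
    """
    return SIZE_PATTERN.search(value) is not None


if __name__ == "__main__":
    for candidate in ("25.MB", "25 MB", "1.5GB", "25.mb", "25"):
        print(repr(candidate), is_valid_size(candidate))
```

The unit is case-sensitive and the trailing `B` is mandatory, so `25.mb` and a bare `25` are both rejected.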

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Base URL or local path to location of pipeline test dataset files

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string
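
To illustrate the default suffix format `yyyy-MM-dd_HH-mm-ss`, the sketch below produces an equivalent string with Python's `strftime`. The helper name `trace_report_suffix` is hypothetical, and the mapping from the date pattern to `strftime` directives is an assumption about how the format reads, not the pipeline's own code.

```python
from datetime import datetime


def trace_report_suffix(moment: datetime) -> str:
    """Format a timestamp as year-month-day_hour-minute-second.

    Assumed strftime equivalent of the 'yyyy-MM-dd_HH-mm-ss' pattern.
    """
    return moment.strftime("%Y-%m-%d_%H-%M-%S")


if __name__ == "__main__":
    print(trace_report_suffix(datetime(2024, 3, 5, 14, 7, 9)))
    # prints "2024-03-05_14-07-09"
```

Because every field is zero-padded and ordered from largest to smallest unit, suffixes in this format sort chronologically when file names are sorted alphabetically.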

nf-core/taxprofiler