nf-core/taxprofiler
Highly parallelised multi-taxonomic profiling of shotgun short- and long-read metagenomic data
Define where the pipeline should find input data and save output data.
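The samplesheet and database sheet paths described in this section both carry the pattern `^\S+\.csv$`. The following is a minimal sketch of what that pattern accepts and rejects. It uses Python's `re` module as an approximation of the pipeline's own schema validation, and the helper name `is_valid_sheet_path` is hypothetical.

```python
import re

# Pattern copied from the parameter reference. Python's `re` is used here as
# an approximation of the pipeline's own validation, which is an assumption.
SHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_sheet_path(path: str) -> bool:
    """Return True if `path` contains no whitespace and ends in `.csv`."""
    return SHEET_PATTERN.fullmatch(path) is not None


if __name__ == "__main__":
    for candidate in ["samplesheet.csv", "my samples.csv", "databases.tsv"]:
        print(candidate, is_valid_sheet_path(candidate))
```

Under this pattern a path containing a space is rejected even when it ends in `.csv`, so renaming such files before launching a run avoids a validation failure.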

Path to comma-separated file containing information about the samples and libraries/runs.
Type: string
Pattern: ^\S+\.csv$

Path to comma-separated file containing information about databases and profiling parameters for each taxonomic profiler.
Type: string
Pattern: ^\S+\.csv$

Specify to save decompressed user-supplied TAR archives of databases.
Type: boolean

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
Type: string

Email address for completion summary.
Type: string
Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

MultiQC report title. Printed as page header, used for filename if not otherwise specified.
Type: string

Common options across both long and short read preprocessing QC steps

Specify to skip sequencing quality control of raw sequencing reads.
Type: boolean

Specify the tool used for quality control of raw sequencing reads.
Type: string

Save reads from samples that went through the adapter clipping, pair-merging, and length filtering steps for both short and long reads.
Type: boolean

Save only the final reads from all read processing steps (that are sent to classification/profiling) in the results directory.
Type: boolean

Options for adapter clipping, quality trimming, pair-merging, and complexity filtering

Turn on short-read quality control steps (adapter clipping, complexity filtering etc.).
Type: boolean

Specify which tool to use for short-read QC.
Type: string

Skip adapter trimming.
Type: boolean

Specify adapter 1 nucleotide sequence.
Type: string

Specify adapter 2 nucleotide sequence.
Type: string

Specify a list of all possible adapters to trim. Overrides --shortread_qc_adapter1/2. Formats: .txt (AdapterRemoval) or .fasta (fastp).
Type: string

Turn on merging of read pairs for paired-end data.
Type: boolean

Include unmerged reads from paired-end merging in the downstream analysis.
Type: boolean

Specify the minimum length of reads to be retained.
Type: integer
Default: 15

Perform deduplication of the input reads (fastp only).
Type: boolean

Turn on nucleotide sequence complexity filtering.
Type: boolean

Specify which tool to use for complexity filtering.
Type: string

Specify the minimum sequence entropy level for complexity filtering.
Type: number
Default: 0.3

Specify the window size for BBDuk complexity filtering.
Type: integer
Default: 50

Turn on masking rather than discarding of low-complexity reads for BBDuk.
Type: boolean

Specify the minimum complexity filter threshold of fastp.
Type: integer
Default: 30

Specify the complexity filter mode for PRINSEQ++.
Type: string

Specify the minimum dust score for PRINSEQ++ complexity filtering.
Type: number
Default: 0.5

Save reads from samples that went through the complexity filtering step.
Type: boolean

Options for adapter clipping, quality trimming, and length filtering

Turn on long-read quality control steps (adapter clipping, length filtering etc.).
Type: boolean

Specify which tool to use for adapter trimming.
Type: string

Skip long-read trimming.
Type: boolean

Specify which tool to use for long-read filtering.
Type: string

Skip long-read length and quality filtering.
Type: boolean

Specify the minimum length of reads to be retained.
Type: integer
Default: 1000

Specify the percent of high-quality bases to be retained.
Type: integer
Default: 90

Filtlong only: specify the number of high-quality bases in the library to be retained.
Type: integer
Default: 500000000

Nanoq only: specify the minimum average read quality filter (Q).
Type: integer
Default: 7

Estimate metagenome sequencing complexity coverage

Turn on short-read metagenome sequencing redundancy estimation with Nonpareil. Warning: only use for shallow short-read sequencing datasets.
Type: boolean

Specify mode for identifying redundant reads.
Type: string

Options for pre-profiling host read removal

Turn on short-read host removal.
Type: boolean

Turn on long-read host removal.
Type: boolean

Specify path to a single reference FASTA of the host genome(s).
Type: string

Specify path to the directory containing pre-made Bowtie2 indexes of the host removal reference.
Type: string

Specify path to a pre-made Minimap2 index file (.mmi) of the host removal reference.
Type: string

Save mapping index of input reference when not already supplied by user.
Type: boolean

Save mapped and unmapped reads in BAM format from host removal.
Type: boolean

Save reads from samples that went through the host-removal step.
Type: boolean

Options for per-sample run-merging

Turn on run merging.
Type: boolean

Save reads from samples that went through the run-merging step.
Type: boolean

Turn on profiling with Centrifuge. Requires the database to be present in the CSV file passed to --databases.
Type: boolean

Turn on saving of Centrifuge-aligned reads.
Type: boolean

Turn on profiling with DIAMOND. For unmerged paired-read libraries, only read1 will be used. Requires the database to be present in the CSV file passed to --databases.
Type: boolean

Specify output format from DIAMOND profiling.
Type: string

Turn on saving of DIAMOND-aligned reads. Will override --diamond_output_format and no taxon tables will be generated.
Type: boolean

Turn on profiling with Kaiju. Requires the database to be present in the CSV file passed to --databases.
Type: boolean

Turn on expanding of virus hits to individual viruses rather than aggregating at a taxonomic level.
Type: boolean

Specify taxonomic rank to be displayed in Kaiju taxon table.
Type: string

Turn on profiling with Kraken2. Requires the database to be present in the CSV file passed to --databases.
Type: boolean

Turn on saving of Kraken2-aligned reads.
Type: boolean

Turn on saving of Kraken2 per-read taxonomic assignment file.
Type: boolean

Turn on saving minimizer information in the Kraken2 report, thus increasing it to an eight-column layout.
Type: boolean

Turn on profiling with KrakenUniq. Requires one or more KrakenUniq databases to be present in the CSV file passed to --databases.
Type: boolean

Turn on saving of KrakenUniq (un-)classified reads as FASTA.
Type: boolean

Specify a RAM chunk size to use for all KrakenUniq databases when loading them into memory in chunks. Specify in --databases for per-database values.
Type: string

Turn on saving of KrakenUniq per-read taxonomic assignment file.
Type: boolean

Specify the number of samples for each KrakenUniq run.
Type: integer
Default: 20

Turn on Bracken (and the required Kraken2 prerequisite step).
Type: boolean

Turn on saving of the intermediate Kraken2 files used as input to Bracken into the Kraken2 results folder.
Type: boolean

Turn on profiling with MALT. Requires the database to be present in the CSV file passed to --databases.
Type: boolean

Specify which MALT alignment mode to use.
Type: string

Turn on saving of MALT-aligned reads.
Type: boolean

Turn on generation of MEGAN summary file from MALT results.
Type: boolean

Turn on profiling with MetaPhlAn. Requires the database to be present in the CSV file passed to --databases.
Type: boolean

Turn on saving of MetaPhlAn reads aligned against marker genes in SAM format.
Type: boolean

Turn on profiling with mOTUs. Requires the database to be present in the CSV file passed to --databases.
Type: boolean

Turn on printing relative abundance instead of counts.
Type: boolean

Turn on saving the mgc reads count.
Type: boolean

Turn on removing NCBI taxonomic IDs.
Type: boolean

Turn on classification with KMCP.
Type: boolean

Turn on saving the output of KMCP search.
Type: boolean

Turn on profiling with ganon. Requires the database to be present in the CSV file passed to --databases.
Type: boolean

Turn on saving of ganon per-read taxonomic assignment file(s).
Type: boolean

Specify the type of ganon report to save.
Type: string

Specify the taxonomic report the ganon report file should display.
Type: string
Default: default

Specify a percentile within which hits will be reported in ganon report output.
Type: integer

Specify the minimum number of reads a hit must have to be retained in the ganon report.
Type: integer

Specify the maximum number of reads a hit may have to be retained in the ganon report.
Type: integer

Turn on standardisation of taxon tables across profilers.
Type: boolean

Turn on generation of BIOM output (currently only applies to mOTUs).
Type: boolean

Turn on generation of Krona plots for supported profilers.
Type: boolean

Specify path to Krona taxonomy directories (required for MALT Krona plots).
Type: string

The desired output format.
Type: string

The path to a directory containing taxdump files.
Type: string

Add the taxon name to the output. Requires --taxpasta_taxonomy_dir.
Type: boolean

Add the taxon rank to the output. Requires --taxpasta_taxonomy_dir.
Type: boolean

Add the taxon’s entire name lineage to the output. Requires --taxpasta_taxonomy_dir.
Type: boolean

Add the taxon’s entire ID lineage to the output. Requires --taxpasta_taxonomy_dir.
Type: boolean

Add the taxon’s entire rank lineage to the output. Requires --taxpasta_taxonomy_dir.
Type: boolean

Ignore individual profiles that cause errors.
Type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.
Type: string
Default: master

Base directory for Institutional configs.
Type: string
Default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.
Type: string

Institutional config description.
Type: string

Institutional config contact information.
Type: string

Institutional config URL link.
Type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.
Type: boolean

Method used to save pipeline results to output directory.
Type: string

Email address for completion summary, only when pipeline fails.
Type: string
Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.
Type: boolean

File size limit when attaching MultiQC reports to summary emails.
Type: string
Default: 25.MB
Pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Do not use coloured log outputs.
Type: boolean

Incoming hook URL for messaging service.
Type: string

Custom config file to supply to MultiQC.
Type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file.
Type: string

Custom MultiQC YAML file containing HTML including a methods description.
Type: string

Whether to validate parameters against the schema at runtime.
Type: boolean
Default: true

Base URL or local path to location of pipeline test dataset files.
Type: string
Default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.
Type: string
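
Two of the string patterns in this reference are easy to misread: the one for e-mail addresses and the one for the MultiQC attachment size limit. The sketch below checks a few candidate values against both patterns, copied verbatim from the reference. It assumes Python's `re` module behaves like the pipeline's own validator for these expressions, and the helper names `is_valid_email` and `is_valid_size` are hypothetical.

```python
import re

# Patterns copied verbatim from the parameter reference. Applying them with
# Python's `re` is an assumption about how the pipeline validates values.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_email(value: str) -> bool:
    """Return True if `value` matches the e-mail pattern."""
    return EMAIL_PATTERN.fullmatch(value) is not None


def is_valid_size(value: str) -> bool:
    """Return True if `value` matches the attachment size pattern."""
    return SIZE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for size in ["25.MB", "25 MB", "1.5GB", "25", "25.mb"]:
        print(size, is_valid_size(size))
    for email in ["user@example.org", "user@localhost"]:
        print(email, is_valid_email(email))
```

The size pattern requires an upper-case unit ending in `B`, so the documented default `25.MB` passes while `25` and `25.mb` do not. The e-mail pattern limits the final domain label to between two and five letters, so addresses with longer top-level domains would be rejected by it.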