nf-core/proteinannotator
The best protein annotation pipeline in the world. Protein fasta → ??? → Annotations
Define where the pipeline should find input data and save output data.
Path to comma-separated file containing information about the samples in the experiment.
Type: string
Pattern: ^\S+\.csv$
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
Type: string
Email address for completion summary.
Type: string
Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
Type: string

Options for best-practices domain annotation tool from EBI, InterProScan
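The patterns listed for the samplesheet path and the completion-summary email address in the input/output options can be checked before a run is launched. The following is a minimal Python sketch, assuming the patterns are applied as anchored regular expressions; the function names are hypothetical and are not part of the pipeline.

```python
import re

# Patterns copied verbatim from the input/output parameter descriptions.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_samplesheet_path(path: str) -> bool:
    """Hypothetical helper: True if the path has no whitespace and ends in .csv."""
    return SAMPLESHEET_PATTERN.match(path) is not None


def is_valid_email(address: str) -> bool:
    """Hypothetical helper: True if the address matches the listed email pattern."""
    return EMAIL_PATTERN.match(address) is not None


if __name__ == "__main__":
    for candidate in ["samplesheet.csv", "my sheet.csv", "samplesheet.tsv"]:
        print(candidate, is_valid_samplesheet_path(candidate))
    for candidate in ["user@example.org", "user@example"]:
        print(candidate, is_valid_email(candidate))
```

Because the samplesheet pattern uses `\S+`, a path containing spaces is rejected even if it ends in `.csv`.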
Run InterProScan
Type: boolean
Tar.gz archive exactly as provided by InterProScan at https://www.ebi.ac.uk/interpro/download/InterProScan/
Type: string
Default: https://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.74-105.0/interproscan-5.74-105.0-64-bit.tar.gz
Path to the InterProScan database: the /data subfolder of the uncompressed download from https://www.ebi.ac.uk/interpro/download/InterProScan/
Type: string
Version number of the InterProScan database, e.g. “5.73-104.0”
Type: string

Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
Type: string
Default: master
Base directory for Institutional configs.
Type: string
Default: https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
Type: string
Institutional config description.
Type: string
Institutional config contact information.
Type: string
Institutional config URL link.
Type: string

Less common options for the pipeline, typically set in a config file.
Display version and exit.
Type: boolean
Method used to save pipeline results to output directory.
Type: string
Email address for completion summary, only when pipeline fails.
Type: string
Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
Type: boolean
File size limit when attaching MultiQC reports to summary emails.
Type: string
Default: 25.MB
Pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
Type: boolean
Incoming hook URL for messaging service.
Type: string
Custom config file to supply to MultiQC.
Type: string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file.
Type: string
Custom MultiQC yaml file containing HTML including a methods description.
Type: string
Whether to validate parameters against the schema at runtime.
Type: boolean
Default: true
Base URL or local path to location of pipeline test dataset files.
Type: string
Default: https://raw.githubusercontent.com/nf-core/test-datasets/
Base URL or local path to location of modules test dataset files.
Type: string
Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.
Type: string
Display the help message.
Type: boolean, string
Display the full detailed help message.
Type: boolean
Display hidden parameters in the help message (only works when --help or --help_full are provided).
Type: boolean

Use these parameters to control the flow of the quality check subworkflow execution.
Skip all default QC steps for sequences (gap trimming, length filtering, validation, duplicate removal).
Type: boolean
The minimum allowed sequence length
Type: integer
Default: 30
The maximum allowed sequence length
Type: integer
Default: 5000
Remove duplicate input amino acid sequences, based on the sequence.
Type: boolean

Use these parameters to control the flow of the domain annotation execution.
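As a rough illustration of the length filtering and duplicate removal options in the preceding quality check section, the sketch below applies both to a list of sequences in Python. It is not the pipeline's implementation. It assumes that the length bounds are inclusive, that duplicates are detected by exact string comparison, and that the first occurrence is the one retained. The function name and its arguments are hypothetical; only the default bounds (30 and 5000) come from the parameter list. Gap trimming and validation are not shown.

```python
def filter_sequences(records, min_len=30, max_len=5000, remove_duplicates=True):
    """Hypothetical sketch of length filtering and duplicate removal.

    records: iterable of (identifier, sequence) pairs.
    Returns the retained pairs in their original order.
    """
    seen = set()
    kept = []
    for identifier, sequence in records:
        # Assumption: both bounds are inclusive.
        if not (min_len <= len(sequence) <= max_len):
            continue
        if remove_duplicates:
            # Assumption: duplicates are exact matches on the sequence string,
            # and the first occurrence is kept.
            if sequence in seen:
                continue
            seen.add(sequence)
        kept.append((identifier, sequence))
    return kept


if __name__ == "__main__":
    example = [
        ("a", "M" * 30),    # kept: exactly the minimum length
        ("b", "M" * 29),    # dropped: too short
        ("c", "M" * 30),    # dropped: duplicate of "a"
        ("d", "A" * 5001),  # dropped: too long
        ("e", "A" * 5000),  # kept: exactly the maximum length
    ]
    print([identifier for identifier, _ in filter_sequences(example)])
```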
Skip the domain annotation with the Pfam database.
Type: boolean
Path to an already installed Pfam HMM database (.hmm.gz).
Type: string
InterPro hosted link to the latest Pfam HMM database file.
Type: string
Default: https://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.hmm.gz
hmmsearch e-value cutoff threshold for reported results
Type: number
Default: 0.001

Use these parameters to control the flow of the secondary structure prediction execution.
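To make the e-value cutoff from the preceding domain annotation section concrete, the short Python sketch below keeps only hits at or below the threshold. It assumes the cutoff is inclusive and that hits are available as (name, e-value) pairs. The function name and data layout are hypothetical, and how the pipeline actually passes the value to hmmsearch is not stated in this document.

```python
def filter_hits_by_evalue(hits, evalue_cutoff=0.001):
    """Hypothetical sketch: keep hits whose e-value is at or below the cutoff.

    hits: iterable of (name, evalue) pairs.
    The default cutoff of 0.001 is taken from the parameter list.
    """
    # Assumption: a hit exactly at the cutoff is still reported.
    return [(name, evalue) for name, evalue in hits if evalue <= evalue_cutoff]


if __name__ == "__main__":
    example_hits = [("domain_1", 1e-10), ("domain_2", 0.001), ("domain_3", 0.5)]
    print(filter_hits_by_evalue(example_hits))
```

Lowering the cutoff makes the reported set stricter; raising it admits weaker matches.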
Skip the secondary structure prediction.
Type: boolean
Choose the output format (i.e., ‘ss2’, ‘fas’, ‘horiz’) for the s4pred per amino acid probability predictions (i.e., α-helix, β-strand, coil).
Type: string