© Author(s) 2022. This work is distributed
under "MIT Copyright (c) 2022 Christiane Hassenrueck"

Amplicon sequence processing workflow (paired-end mixed-orientation libraries generated at LGC Genomics)

Hassenrück, Christiane

Abstract. This workflow, written in Snakemake, covers the following steps in the analysis of amplicon sequencing data (also called metabarcoding): primer clipping, quality trimming and filtering, denoising, merging of paired-end reads, chimera removal, taxonomic classification, and generation of a sample-by-OTU table. The workflow is optimized for sequences generated from mixed-orientation libraries at LGC Genomics. The input should no longer contain sequencing adapters or sample identifiers (indices), i.e. it should be the AdapterClipped data provided by LGC. OTUs (operational taxonomic units) are defined as amplicon sequence variants. The workflow follows the DADA2 pipeline (Callahan et al. 2016) for amplicon sequence data analysis. Other programs used in the workflow are cutadapt, R, blastn and GNU parallel. When using the workflow, please make sure to cite all dependencies. More information is available on the wiki for the workflow on and included in this data package. This DOI represents the first release of the workflow. The latest version is available on
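
To illustrate what a sample-by-OTU table looks like when OTUs are defined as amplicon sequence variants, the following is a minimal Python sketch. It is not code from the workflow and not DADA2's implementation: the function name `build_asv_table`, the toy sequences and the sample names are hypothetical, and the input is assumed to be sequences that have already been denoised, merged and chimera-filtered.

```python
from collections import Counter


def build_asv_table(sample_seqs):
    """Build a sample-by-ASV count table (hypothetical helper).

    sample_seqs maps a sample name to a list of sequences, one entry
    per read, assumed to be already denoised and chimera-free.
    Returns (asvs, table): asvs is the list of unique sequences and
    table[sample] is a list of counts in the same order as asvs.
    """
    # Count each exact sequence variant within each sample.
    per_sample = {name: Counter(seqs) for name, seqs in sample_seqs.items()}

    # Sum counts over all samples to get the total abundance per variant.
    total = Counter()
    for counts in per_sample.values():
        total.update(counts)

    # Order variants by decreasing total abundance, ties broken alphabetically.
    asvs = sorted(total, key=lambda seq: (-total[seq], seq))

    # A Counter returns 0 for variants that are absent from a sample.
    table = {name: [counts[seq] for seq in asvs]
             for name, counts in per_sample.items()}
    return asvs, table


# Toy data (made up for illustration only).
demo_seqs = {
    "sampleA": ["ACGT", "ACGT", "TTGA"],
    "sampleB": ["ACGT", "GGCC", "GGCC", "GGCC"],
}
asvs, table = build_asv_table(demo_seqs)
for name, row in table.items():
    print(name, dict(zip(asvs, row)))
```

Each unique sequence forms its own column, so no similarity clustering is involved; this is the sense in which OTUs are defined as amplicon sequence variants.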

