A damage-aware NGS workflow for conservative species identification from ultra-degraded DNA.
Morelli Stefania S, Romano Sara S, Cosenza Giulia G, Abate Sergio S et al.
Species identification from highly degraded DNA remains a major challenge across ecology, conservation genetics, wildlife forensics, and museum science, where samples are often scarce, contaminated, and embedded in complex matrices. Under these conditions, standard reference-based and metagenomic classifiers are prone to false-positive assignments, particularly when ultra-fragmented DNA and conserved genomic regions are not explicitly accounted for. Here, we present a damage-aware next-generation sequencing (NGS) workflow for conservative species identification from minute quantities of highly degraded DNA, designed to minimize misclassification in low-input and damage-rich datasets. The workflow integrates micro-sampling, half-uracil-DNA-glycosylase (half-UDG) library preparation, PCR duplicate removal, multi-genome mapping against a curated reference panel, and a post-mapping read-ubiquity classifier that distinguishes species-specific reads from those shared across conserved loci. Using collagen-rich substrates as a proof of concept, we demonstrated accurate species attribution from samples as small as 1 mm², including mixtures and mineral-containing matrices. The workflow reliably identifies dominant biological sources, reduces false-positive assignments driven by conserved genomic regions, and remains robust to common physical and chemical treatments such as swelling, heating, and plaster addition. Overall, this study provides a proof-of-concept framework for conservative species identification in challenging degraded DNA contexts. The workflow may be adaptable to a broader range of degraded DNA contexts, including wildlife monitoring, regulatory enforcement, forensic investigations, and the analysis of processed biological materials, although further validation across diverse matrices will be required.
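The read-ubiquity step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the classifier's inputs, thresholds, or output format. The sketch assumes that multi-genome mapping has already been reduced to (read ID, reference species) pairs, and it treats a read as species-specific only when it maps to exactly one reference in the panel. The function name `classify_reads` and the example read IDs and species are hypothetical.

```python
from collections import Counter, defaultdict


def classify_reads(alignments):
    """Split mapped reads into species-specific and shared.

    alignments: iterable of (read_id, species) pairs, one pair per
    read/reference combination that produced a mapping (assumed input
    format, not taken from the paper).

    Returns (specific, shared), where `specific` counts the reads that
    map to exactly one species and `shared` is the number of reads that
    map to two or more species (e.g. reads from conserved loci).
    """
    # Collect the set of reference species each read maps to.
    hits = defaultdict(set)
    for read_id, species in alignments:
        hits[read_id].add(species)

    specific = Counter()
    shared = 0
    for species_set in hits.values():
        if len(species_set) == 1:
            # Ubiquity of 1: the read supports a single species.
            specific[next(iter(species_set))] += 1
        else:
            # Ubiquity > 1: uninformative for species attribution.
            shared += 1
    return specific, shared


# Hypothetical example: r1 maps to both references and is set aside.
example = [
    ("r1", "Bos taurus"), ("r1", "Ovis aries"),
    ("r2", "Bos taurus"),
    ("r3", "Bos taurus"),
    ("r4", "Ovis aries"),
]
specific_counts, shared_count = classify_reads(example)
```

Counting only the reads in `specific_counts` toward an assignment is what makes the call conservative in this sketch: reads that also map to other references in the panel cannot inflate the support for any one species.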