(109d) Block Optical DNA Sequencing and Content Scoring for Rapid Genetic Biomarker Identification

Korshoj, L., University of Colorado Boulder
Nagpal, P., University of Colorado Boulder
A single, inexpensive diagnostic test capable of rapidly identifying a wide range of genetic biomarkers would prove invaluable in precision medicine. Immediate applications of such a technology would be addressing the growing threat of antibiotic resistance and screening for genetic diseases such as cancer. Current antibiotic resistance diagnostics and profiling assays are often performed only after initial antibiotics fail, and most rely on cell culturing, PCR amplification, and microarray analyses. For cancers or other genetic diseases, diagnostics continue to rely on the identification of proteins, peptides, or gene expression levels. Many of these assays, whether for antibiotic resistance or genetic diseases, are not only time consuming but also limited to detecting one or a few well-characterized biomarkers. In a push for improved precision diagnostics, we have developed an optical technique for high-throughput, label-free detection of A-G-C-T content in DNA k-mers, called block optical sequencing (BOS) [1]. We demonstrate that these optical measurements of DNA k-mer content can be applied for rapid, broad-spectrum genetic biomarker identification through our robust algorithmic platform called block optical content scoring (BOCS).

The BOS technique uses surface-enhanced Raman spectroscopy for label-free identification of DNA nucleobases, combining multiplexed three-dimensional plasmonic nanofocusing with characterization of molecular vibrations within the fingerprinting region of ~400-1400 cm⁻¹. While nanometer-scale mode volumes prevent resolution of single nucleobases, our block optical technique can identify the relative A-G-C-T content in DNA k-mers (where k ≈ 10 nucleotides). Our robust bioinformatics algorithm, BOCS, uses content-based sequence alignment for probabilistic mapping of k-mer content measurements to gene sequences within a biomarker database, resulting in a probability ranking of genes by content score. Simulations of the BOCS algorithm reveal high accuracy for identification of single antibiotic resistance genes from pathogens such as methicillin-resistant Staphylococcus aureus (MRSA), even in the presence of significant sequencing errors (100% accuracy with no sequencing errors, and > 90% accuracy with sequencing errors at 20%) and at well below full coverage of the genes. Extension of BOCS to cancer and other genetic diseases met or exceeded the results for resistance genes.
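
The idea of content-based mapping can be illustrated with a minimal Python sketch. This is not the BOCS algorithm itself: it assumes a much simpler scheme in which each measured block is reduced to its (A, G, C, T) counts and a gene is scored by the fraction of measured blocks whose content matches some k-length window of that gene. The function names, the exact-match scoring rule, and the two toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter

K = 10  # block length; the abstract gives k of roughly 10 nucleotides


def content(kmer):
    """Return the (A, G, C, T) counts of a k-mer, discarding base order."""
    counts = Counter(kmer)
    return tuple(counts[base] for base in "AGCT")


def gene_contents(seq, k=K):
    """Set of content tuples over every k-length window of a gene sequence."""
    return {content(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def rank_genes(measured, database, k=K):
    """Rank genes by the fraction of measured content blocks found in each gene.

    Assumption: exact content matches only; a probabilistic scheme that
    tolerates measurement errors would replace this simple hit count.
    """
    scores = {}
    for name, seq in database.items():
        windows = gene_contents(seq, k)
        hits = sum(1 for block in measured if block in windows)
        scores[name] = hits / len(measured) if measured else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy database; the sequences are made up, not real genes.
database = {
    "gene_a": "ATGAAAAAGATAAAAATTGTTCCACTTATT",
    "gene_b": "ATGCGCGGCCGCTTGGCCGGCACGCTGGCG",
}

# Simulate error-free content measurements of three blocks taken from gene_a.
measured = [content(database["gene_a"][i:i + K]) for i in (0, 7, 15)]
ranking = rank_genes(measured, database)
```

In this toy case `ranking` places `gene_a` first, because all three simulated blocks match windows of that sequence. Only three of its 21 possible windows were measured, which loosely mirrors the abstract's point that identification can work well below full coverage.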

These results pave the way for a novel, high-throughput sequencing method with inherent lossy genomic data compression, using k-mer identification from multiplexed optical data acquisition. Combined with the BOCS algorithm, this method makes possible a test capable of rapid diagnosis and profiling of genetic biomarkers ranging from antibiotic resistance to cancer and other genetic diseases.

[1] Korshoj, Sagar, Hanson, Chowdhury, Otoupal, Chatterjee, Nagpal, Small 14 (4), 1703165 (2018).