Poster #43 - Serin Jo

  • vitod24
  • Oct 20
  • 2 min read

Design of a Lightweight Algorithm for Robust Mass Spectrometry Peak Detection


Jo, Serin, Roslyn High School, Roslyn Heights, NY, USA
Li, Guangyuan, Ph.D., Perlmutter Cancer Center, NYU Langone Health, New York, NY, USA


Mass spectrometry (MS)-based proteomics remains the predominant approach for identifying and quantifying proteins and peptides across a wide range of applications. Preprocessing of raw MS data is an important step for accurate downstream analysis. One challenge in preprocessing is accurate peak detection, also known as centroiding, which converts raw, noisy profile data into centroided peaks and thereby facilitates the subsequent peptide-spectrum matching steps. Unfortunately, centroiding still depends largely on proprietary "black box" software packages that are costly and closed source and that limit reproducibility between workflows. To systematically identify the best approach, I coded six novel centroiding approaches, designed by combining different parts of simpler solutions. Using immunopeptidome data obtained from an Orbitrap mass spectrometer, I implemented signal-to-noise ratio (SNR) thresholding, area under the curve (AUC) thresholding using Simpson's rule, local maxima detection, Gaussian fitting (by correlation and by convolution), and Mexican hat wavelet convolution. Each method was benchmarked against ground-truth centroiding performed by proprietary Bruker software and against the original peptide sequences of the samples. The traditional thresholding methods (SNR, AUC) achieved moderate performance (F1 = 0.60), while local maxima detection performed somewhat better (F1 = 0.70). Gaussian convolution improved noise handling (F1 = 0.65), whereas Gaussian correlation struggled with peak asymmetry (F1 = 0.30). The best-performing method was Mexican hat convolution (F1 = 0.74), which outperformed all other strategies by combining noise suppression with accurate peak localization. These findings suggest that wavelet-based convolution offers a lightweight and modular solution for peak detection, especially in complex datasets where overlapping peaks and baseline drift are present.
Together, this study yields an open-source, lightweight peak-picking algorithm whose performance is comparable to that of proprietary solutions. I anticipate that this lightweight implementation can be widely integrated into downstream MS workflows, where it can simplify tedious preprocessing steps.
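As a rough illustration of the wavelet-based idea described above, the following is a minimal sketch of Mexican hat convolution followed by local maxima picking, with an F1 score computed by matching detected peaks to reference peaks within a tolerance. It is not the implementation from the poster. The function names, the kernel width `sigma` (in samples), the `min_response` threshold, the matching tolerance `tol`, and the synthetic spectrum are all assumptions chosen for demonstration.

```python
import math


def ricker_kernel(sigma):
    """Sample a Mexican hat (Ricker) wavelet at integer offsets out to 4 sigma."""
    half = int(4 * sigma)
    return [(1.0 - (i / sigma) ** 2) * math.exp(-0.5 * (i / sigma) ** 2)
            for i in range(-half, half + 1)]


def wavelet_response(intensity, kernel):
    """Convolve intensity with the (symmetric) kernel.

    Edge samples where the kernel does not fully overlap are left at 0.
    """
    half = len(kernel) // 2
    response = [0.0] * len(intensity)
    for i in range(half, len(intensity) - half):
        response[i] = sum(intensity[i + j - half] * k
                          for j, k in enumerate(kernel))
    return response


def pick_peaks(mz, intensity, sigma, min_response):
    """Return m/z values at local maxima of the response above min_response."""
    response = wavelet_response(intensity, ricker_kernel(sigma))
    return [mz[i] for i in range(1, len(response) - 1)
            if response[i] > min_response
            and response[i - 1] < response[i] >= response[i + 1]]


def f1_score(detected, reference, tol):
    """Greedy one-to-one matching of detected to reference peaks within tol."""
    unmatched = list(reference)
    true_pos = 0
    for d in detected:
        hit = next((r for r in unmatched if abs(r - d) <= tol), None)
        if hit is not None:
            unmatched.remove(hit)
            true_pos += 1
    false_pos = len(detected) - true_pos
    false_neg = len(unmatched)
    total = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / total if total else 0.0


def gaussian(i, center, height, width):
    """Gaussian peak shape used only to build the synthetic example."""
    return height * math.exp(-0.5 * ((i - center) / width) ** 2)


# Synthetic profile spectrum (assumed): flat baseline, a small ripple
# standing in for noise, and two Gaussian peaks at samples 60 and 140.
mz = [500.0 + 0.001 * i for i in range(200)]
intensity = [5.0 + 2.0 * math.sin(1.7 * i)
             + gaussian(i, 60, 100.0, 2.0)
             + gaussian(i, 140, 40.0, 2.0)
             for i in range(200)]

detected = pick_peaks(mz, intensity, sigma=2.0, min_response=10.0)
score = f1_score(detected, reference=[500.060, 500.140], tol=0.0005)
print(detected, score)
```

Because the Mexican hat kernel integrates to approximately zero, a flat baseline contributes almost nothing to the response, which is one plausible reason a wavelet approach tolerates baseline offsets better than plain thresholding. Real profile data would need a width and threshold tuned to the instrument's resolution.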

 
 
 
