Robustness and Reliability of Boosted Decision Tree Signal Classification for Model-Independent Analysis of Dark Photon Production

Session Number

PHYS 15

Advisor(s)

Dr. Peter Dong, Illinois Mathematics and Science Academy

Discipline

Physical Science

Start Date

17-4-2025 10:15 AM

End Date

17-4-2025 10:30 AM

Abstract

The ongoing search for the dark photon conducted by the IMSA-CMS research collaboration relies on a boosted decision tree (BDT) for signal classification. The BDT must be robust for the search to remain unbiased and model-independent. We analyze the performance of a BDT classifier on empirical data collected by the CMS Experiment at the Large Hadron Collider and on simulated data produced by Monte Carlo generation. The analysis includes an efficiency study designed to select optimal training parameters, a cross-validation study that evaluates our BDT against several theoretical dark photon models, a study of BDT input variable consistency in reconstructed lepton jets, and detailed model selection among several promising BDT architectures. We also introduce a novel training methodology, Cross-Sectional Adaptive Transfer Learning (CATL), that uses event cross-sections during training to assign weights that prioritize background categories with larger event yields. Based on the principles of transfer learning, CATL outperforms standard BDTs on signal efficiency while still achieving modest improvements in background rejection.
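The abstract does not say how CATL turns cross-sections into training weights. As a rough illustration of the general idea only, the sketch below uses the standard Monte Carlo normalization, in which each simulated event is weighted by cross-section times luminosity divided by the number of generated events. With that weighting, a background's total weight tracks its expected event yield. The function names, sample names, and numbers are all hypothetical and are not taken from the IMSA-CMS analysis.

```python
def per_event_weights(samples, luminosity):
    """Return one training weight per simulated event for each sample.

    samples: dict mapping a sample name to (cross_section, n_generated),
             with cross_section in pb (assumed units).
    luminosity: integrated luminosity in pb^-1 (assumed units).
    """
    weights = {}
    for name, (cross_section, n_generated) in samples.items():
        if n_generated <= 0:
            raise ValueError(f"sample {name!r} has no generated events")
        # Standard normalization: expected yield spread over generated events.
        weights[name] = cross_section * luminosity / n_generated
    return weights


def expected_yield_fractions(samples):
    """Return each sample's share of the total expected background yield."""
    total = sum(cross_section for cross_section, _ in samples.values())
    return {name: cross_section / total
            for name, (cross_section, _) in samples.items()}


# Hypothetical backgrounds: (cross-section in pb, generated events).
backgrounds = {
    "background_a": (6000.0, 1_000_000),
    "background_b": (800.0, 500_000),
}

weights = per_event_weights(backgrounds, luminosity=1.0)
fractions = expected_yield_fractions(backgrounds)

for name in backgrounds:
    print(f"{name}: weight per event = {weights[name]:.4g}, "
          f"yield fraction = {fractions[name]:.3f}")
```

Weights of this kind could be passed to a BDT library through its per-sample weight argument, so that the larger-yield background dominates the training loss. Whether CATL does this, or something more elaborate, is not stated in the abstract.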
