Robustness and Reliability of Boosted Decision Tree Signal Classification for Model-Independent Analysis of Dark Photon Production
Session Number
PHYS 15
Advisor(s)
Dr. Peter Dong, Illinois Mathematics and Science Academy
Discipline
Physical Science
Start Date
17-4-2025 10:15 AM
End Date
17-4-2025 10:30 AM
Abstract
The ongoing search for the dark photon conducted under the IMSA-CMS research collaboration relies on a boosted decision tree (BDT) for signal classification. BDT robustness is necessary to ensure an unbiased and model-independent search. We analyze the performance of a BDT classifier on empirical data collected by the CMS Experiment at the Large Hadron Collider and on simulated data produced by Monte Carlo generation. The analysis includes an efficiency study designed to select optimal training parameters, a cross-validation study that evaluates our BDT against several theoretical dark photon models, a study of BDT input variable consistency in reconstructed lepton jets, and detailed model selection from several promising BDT architectures. Moreover, we introduce a novel training methodology known as Cross-Sectional Adaptive Transfer Learning (CATL) that uses event cross-sections during training to assign weights that prioritize background categories with larger event yields. CATL, which builds on the principles of transfer learning, outperforms standard BDTs on signal efficiency while still achieving modest improvements in background rejection.
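The cross-section weighting idea mentioned above can be illustrated with a minimal sketch. This is not the CATL implementation, and the transfer-learning step is not shown. It only assumes the standard Monte Carlo normalization, in which each simulated event is weighted by cross-section times integrated luminosity divided by the number of generated events, so that background categories with larger expected yields carry more total weight in training. The function name `cross_section_weights`, the sample names, and all numbers are hypothetical.

```python
def cross_section_weights(samples, luminosity_pb_inv):
    """Return one per-event training weight for each background sample.

    samples maps a sample name to (cross_section_pb, n_generated).
    luminosity_pb_inv is the integrated luminosity in inverse picobarns.
    """
    weights = {}
    for name, (xsec_pb, n_generated) in samples.items():
        # Expected yield = cross-section x integrated luminosity.
        expected_yield = xsec_pb * luminosity_pb_inv
        # Dividing by the generated count gives each simulated event
        # its share of that expected yield.
        weights[name] = expected_yield / n_generated
    return weights


# Hypothetical values for illustration only; not taken from the analysis.
backgrounds = {
    "background_a": (6000.0, 1_000_000),
    "background_b": (80.0, 500_000),
}
weights = cross_section_weights(backgrounds, luminosity_pb_inv=1000.0)
```

Weights of this kind are typically passed to a BDT trainer as per-event sample weights, so a misclassified event from the higher-yield category costs more during training than one from a rare category.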