Evaluating Training Methods for Energy-Based Models*
Session Number
1
Advisor(s)
Dr. Yixuan Sun, Argonne National Laboratory
Location
A113
Discipline
Computer Science
Start Date
15-4-2026 10:15 AM
End Date
15-4-2026 11:00 AM
Abstract
Energy-Based Models (EBMs) are a generative machine learning framework that can be applied to many types of data. They work by learning the energy function in a Boltzmann distribution, which measures the compatibility between an input and the target data distribution. Training EBMs by maximum likelihood is difficult because it requires the partition function, which integrates over all possible inputs and is therefore computationally expensive and often intractable. In this study, we explore training methods for EBMs that circumvent explicitly calculating the partition function. The three methods examined are Contrastive Divergence (CD), Denoising Score Matching (DSM), and Noise Contrastive Estimation (NCE). Each method was used to train an EBM on two datasets: an 8-Gaussian distribution and the half-moon dataset. We evaluate each model's performance by generating new samples through Langevin dynamics and comparing them with dataset samples both visually and with the Wasserstein distance.
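The sampling-and-evaluation loop described above can be sketched in a few lines. This is not the study's implementation: it is a minimal numpy/scipy sketch assuming a toy quadratic energy E(x) = ||x||^2 / 2 (whose Boltzmann density is a standard Gaussian) in place of a trained energy network, unadjusted Langevin dynamics for sampling, and a per-coordinate 1-D Wasserstein distance for evaluation. All function names here are illustrative.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def energy(x):
    """Toy energy E(x) = ||x||^2 / 2; its Boltzmann density exp(-E)/Z is a standard Gaussian."""
    return 0.5 * np.sum(x ** 2, axis=-1)

def energy_grad(x):
    # Analytic gradient of the toy energy; a trained EBM would use autodiff here.
    return x

def langevin_sample(n_samples=500, dim=2, steps=300, step_size=0.1, seed=0):
    """Unadjusted Langevin dynamics: x <- x - (eps/2) * grad E(x) + sqrt(eps) * noise."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_samples, dim)) * 3.0  # broad initialization away from the mode
    for _ in range(steps):
        x = (x - 0.5 * step_size * energy_grad(x)
             + np.sqrt(step_size) * rng.normal(size=x.shape))
    return x

samples = langevin_sample()

# Evaluation in the spirit of the study: compare generated samples against
# reference draws from the target using a 1-D Wasserstein distance per coordinate.
reference = np.random.default_rng(1).normal(size=500)
w1 = wasserstein_distance(samples[:, 0], reference)
```

For the toy energy, the sampler's stationary distribution is (up to discretization bias from the step size) a standard Gaussian, so `w1` should be small; the study applies the same generate-then-compare procedure with energies learned by CD, DSM, and NCE on the 8-Gaussian and half-moon datasets.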