Session 3

A Simulation-Based Sample Size Determination Package in R for Prediction Models

Louis Chen, Illinois Math and Science Academy

Session Number

Advisor(s)

Zheyang Wu, Worcester Polytechnic Institute

Location

A129

Discipline

Computer Science

Start Date

15-4-2026 2:15 PM

End Date

15-4-2026 3:00 PM

Abstract

In predictive studies, it is important to know how much data a given model needs in order to predict with reliable accuracy. Larger datasets generally allow more accurate prediction but cost more to collect, so knowing the minimum sample size that achieves a target level of predictive accuracy makes a study more cost-effective. Existing approaches include general rules of thumb (e.g., 10 samples per parameter in the predictive model) and other criteria-based methods for determining sample size, but these do not always translate directly to the predictive performance metrics that investigators care about (e.g., AUC, MSE). This project develops an R package that estimates the minimum sample size needed to meet a user-specified predictive performance criterion using a general simulation-based framework. In this framework, users specify models, metrics, and data generation rules, and the package identifies the smallest sample size that satisfies the predefined accuracy criteria. In principle, it supports complex models (e.g., decision trees, neural networks) and arbitrary metrics, for realistic studies where analytical sample size calculation is difficult. We created a workflow, implemented the code as an R package, tested it on real datasets, and compared it with other methods to verify its effectiveness.
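The simulation-based idea described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the package's R interface, which the abstract does not describe. Every name here (`generate_data`, `fit_model`, `mse`, `expected_metric`, `minimum_sample_size`) is hypothetical, and the data generation rule (a single predictor with unit-variance noise), the model (least squares), and the search strategy (a scan over candidate sizes) are assumptions chosen to keep the example short.

```python
import random
from statistics import mean


def generate_data(n, rng):
    """Assumed data generation rule: y = 2x + 1 + N(0, 1) noise."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_model(xs, ys):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted model on (xs, ys)."""
    slope, intercept = model
    return mean((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def expected_metric(n, n_sims, n_test, rng):
    """Average test MSE over repeated simulated training sets of size n."""
    scores = []
    for _ in range(n_sims):
        model = fit_model(*generate_data(n, rng))
        scores.append(mse(model, *generate_data(n_test, rng)))
    return mean(scores)


def minimum_sample_size(candidates, target_mse, n_sims=200, n_test=500, seed=1):
    """Smallest candidate n whose simulated average test MSE meets the target.

    Returns None if no candidate satisfies the criterion.
    """
    rng = random.Random(seed)
    for n in sorted(candidates):
        if expected_metric(n, n_sims, n_test, rng) <= target_mse:
            return n
    return None
```

A call such as `minimum_sample_size([10, 20, 50, 100], target_mse=1.07)` scans the candidate sizes in increasing order and stops at the first one whose simulated average error meets the target. Because the assumed noise variance is 1, no sample size can reach a target below 1.0, and the function then returns `None`. Swapping in a different model, metric, or data generation rule only requires replacing the corresponding function.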
