#### Session Number

Project ID: MATH 03

#### Advisor(s)

Dr. Evan Glazer, Illinois Mathematics and Science Academy

#### Discipline

Mathematics

#### Start Date

20-4-2022 9:10 AM

#### End Date

20-4-2022 9:25 AM

#### Abstract

Motivated by its use as a kernel of nonlinear regression algorithms, this work examines the row-wise weighted total least squares regression problem in search of a consistent and accurate estimator. Specifically, the estimator should have time complexity linear in the number of observations and space complexity constant in that number, because in many modern applications the number of observations is many orders of magnitude larger than the number of input and output features. Further, to accommodate large data sets, the algorithm should update an intermediate representation from each observation, which allows the computation to be parallelized. The proposed method approximates the noncentral second moment of the underlying data by a precision-weighted mean, which requires only linear time in the number of observations. Initial findings show the proposed algorithm to be less accurate than existing methods intended to solve other variants of the total least squares problem. Directions for continued iteration and further investigation are proposed as next steps toward a better algorithm.
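The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a constant-memory, mergeable summary of this kind might look. It is not the author's method. Everything specific here is an assumption made for illustration: the hypothetical class name `MomentAccumulator`, a single scalar noise variance per row (so the precision weight is its reciprocal), a model with no intercept, and the final step of reading the coefficients off the eigenvector with the smallest eigenvalue of the weighted second moment.

```python
import numpy as np


class MomentAccumulator:
    """Hypothetical running summary whose size depends only on the feature count."""

    def __init__(self, dim):
        self.weight_sum = 0.0                    # sum of precision weights seen so far
        self.moment_sum = np.zeros((dim, dim))   # sum of w_i * z_i z_i^T

    def update(self, z, noise_var):
        """Fold one observation z = (x, y) into the summary in O(dim^2) time."""
        w = 1.0 / noise_var                      # assumed scalar precision for this row
        self.weight_sum += w
        self.moment_sum += w * np.outer(z, z)

    def merge(self, other):
        """Combine summaries built on disjoint chunks (enables parallel passes)."""
        self.weight_sum += other.weight_sum
        self.moment_sum += other.moment_sum
        return self

    def solve(self):
        """Estimate beta in y = x . beta from the precision-weighted second moment."""
        moment = self.moment_sum / self.weight_sum
        _, eigvecs = np.linalg.eigh(moment)      # eigenvalues come back in ascending order
        normal = eigvecs[:, 0]                   # direction of least weighted spread
        return -normal[:-1] / normal[-1]         # normal . (x, y) = 0  =>  y = x . beta
```

Because `update` touches only a fixed-size matrix and `merge` is a plain sum, one pass over the data costs time linear in the number of observations, and separate chunks can be summarized independently before being combined. How the precision weights should be derived from full per-row covariances, and whether a noise correction is applied to the moment, are left open here.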

Approximating the Row-Wise Total Least Squares Linear Regression Solution
