Quasi-Newton methods for minimizing a quadratic function
Anders Forsgren
KTH Royal Institute of Technology
Abstract:
We discuss quasi-Newton methods for minimizing a strictly convex quadratic function that may be subject to errors in the evaluation of the gradients. In exact arithmetic, the methods all behave identically, generating minimizers of Krylov subspaces of increasing dimension and thereby terminating finitely; in that setting, the method of conjugate gradients produces the same iterates at a lower computational cost.
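For illustration only (not taken from the talk itself), here is a minimal Python sketch of the method of conjugate gradients applied to q(x) = 0.5 x'Hx + c'x with H symmetric positive definite; the names H, c and cg are ours. In exact arithmetic the k-th iterate minimizes q over the k-th Krylov subspace, so the method terminates in at most n iterations.

    import numpy as np

    def cg(H, c, tol=1e-10):
        # Illustrative sketch: minimize q(x) = 0.5*x^T H x + c^T x,
        # i.e., solve H x = -c, for symmetric positive definite H.
        n = len(c)
        x = np.zeros(n)
        g = H @ x + c            # gradient of q at x
        p = -g                   # first direction: steepest descent
        for _ in range(n):       # at most n steps in exact arithmetic
            if np.linalg.norm(g) <= tol:
                break
            Hp = H @ p
            alpha = -(g @ p) / (p @ Hp)     # exact line search along p
            x = x + alpha * p
            g_new = H @ x + c
            beta = (g_new @ g_new) / (g @ g)
            p = -g_new + beta * p           # conjugate direction update
            g = g_new
        return x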
In a setting of small errors, e.g., finite-precision arithmetic, the performance of the method of conjugate gradients is expected to deteriorate, whereas a BFGS quasi-Newton method is empirically known to behave very well. We discuss the behavior of limited-memory quasi-Newton methods, which balance the good performance of a BFGS method against the low computational cost of the method of conjugate gradients.
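As a sketch of the kind of limited-memory computation involved (our own illustration, assuming the standard L-BFGS two-loop recursion rather than the specific method of the talk), the search direction is obtained by applying an inverse-Hessian approximation, built from the most recent curvature pairs (s_k, y_k), to the negative gradient. All arguments are assumed to be NumPy arrays.

    def lbfgs_direction(g, s_list, y_list):
        # Two-loop recursion: apply the limited-memory BFGS inverse-Hessian
        # approximation, built from the stored pairs (s_k, y_k), to -g.
        q = g.copy()
        rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
        alphas = []
        for s, y, rho in reversed(list(zip(s_list, y_list, rhos))):
            a = rho * (s @ q)
            alphas.append(a)
            q = q - a * y
        # Common choice of initial scaling H_0 = gamma * I
        gamma = (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1]) if s_list else 1.0
        r = gamma * q
        for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
            b = rho * (y @ r)
            r = r + (a - b) * s
        return -r   # search direction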
We also discuss large-error scenarios, in which the expected behavior is less clear. In particular, we are interested in the behavior of quasi-Newton matrices that differ from the identity by a low-rank matrix, as in a memoryless BFGS method. Our numerical results indicate that for large errors, a memoryless quasi-Newton method often outperforms a BFGS method. We also consider a more advanced model for generating search directions, based on solving a chance-constrained optimization problem. Our results indicate that such a model often gives a slight advantage in final accuracy, although at a significantly higher computational cost.
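A memoryless BFGS direction can be sketched as follows (our own illustration, not the specific model discussed in the talk): the inverse-Hessian approximation is the identity updated with only the most recent pair (s, y), so it differs from the identity by a rank-two matrix and can be applied to the gradient without storing any matrix.

    def memoryless_bfgs_direction(g, s, y):
        # Apply H = (I - rho*s*y^T)(I - rho*y*s^T) + rho*s*s^T to g
        # without forming H, where rho = 1/(y^T s); H is the identity
        # plus a rank-two correction from the most recent pair (s, y).
        rho = 1.0 / (y @ s)
        q = g - rho * (s @ g) * y
        Hg = q - rho * (y @ q) * s + rho * (s @ g) * s
        return -Hg   # search direction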
The talk is based on joint work with Tove Odland, David Ek, Gianpiero Canessa and Shen Peng.
Tuesday, October 26, 2021
11:00 AM, Zoom ID 970 1854 2148