Learning curves for the multi-class teacher–student perceptron

Cornacchia, Elisabetta and Mignacco, Francesca and Veiga, Rodrigo and Gerbelot, Cédric and Loureiro, Bruno and Zdeborová, Lenka (2023) Learning curves for the multi-class teacher–student perceptron. Machine Learning: Science and Technology, 4 (1). 015019. ISSN 2632-2153

Official URL: https://doi.org/10.1088/2632-2153/acb428

Abstract

One of the most classical results in high-dimensional learning theory provides a closed-form expression for the generalisation error of binary classification with a single-layer teacher–student perceptron on i.i.d. Gaussian inputs. Both Bayes-optimal (BO) estimation and empirical risk minimisation (ERM) have been extensively analysed in this setting. At the same time, a considerable part of modern machine learning practice concerns multi-class classification, yet an analogous analysis for the multi-class teacher–student perceptron has been missing. In this manuscript we fill this gap by deriving and evaluating asymptotic expressions for the BO and ERM generalisation errors in the high-dimensional regime. For a Gaussian teacher, we investigate the performance of ERM with both cross-entropy and square losses, and explore the role of ridge regularisation in approaching Bayes-optimality. In particular, we observe that regularised cross-entropy minimisation yields close-to-optimal accuracy. For a Rademacher teacher, by contrast, we show that a first-order phase transition arises in the BO performance.
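As a rough illustration of the setting described in the abstract, the following minimal sketch simulates a multi-class teacher–student perceptron at finite size and measures the test error of ridge-regularised square-loss ERM. It is not the authors' code, and it does not compute the asymptotic expressions derived in the paper. Several choices are assumptions made for illustration only: labels are taken as the argmax of the teacher's linear outputs, targets are one-hot encoded, and the function name `simulate_ridge_square_loss` and its parameters (`d`, `k`, `alpha`, `lam`, `n_test`, `seed`) are hypothetical.

```python
import numpy as np


def simulate_ridge_square_loss(d=100, k=3, alpha=5.0, lam=0.1, n_test=5000, seed=0):
    """Finite-size simulation of a k-class teacher-student perceptron.

    Assumptions (illustrative, not taken from the abstract):
      - the label is the argmax of the teacher's k linear outputs,
      - the student minimises a ridge-regularised square loss on one-hot targets,
      - alpha = n / d is the sample complexity.
    Returns the fraction of misclassified test points.
    """
    rng = np.random.default_rng(seed)
    n = int(alpha * d)

    # Gaussian teacher: k weight vectors with i.i.d. standard normal entries.
    w_teacher = rng.standard_normal((k, d))

    # i.i.d. Gaussian inputs for training and testing.
    x_train = rng.standard_normal((n, d))
    x_test = rng.standard_normal((n_test, d))

    # Labels assigned by the teacher (argmax rule is an assumption).
    y_train = np.argmax(x_train @ w_teacher.T, axis=1)
    y_test = np.argmax(x_test @ w_teacher.T, axis=1)

    # One-hot targets for the square loss.
    targets = np.eye(k)[y_train]

    # Closed-form ridge solution: (X^T X + lam * I) W = X^T Y.
    gram = x_train.T @ x_train + lam * np.eye(d)
    w_student = np.linalg.solve(gram, x_train.T @ targets)  # shape (d, k)

    # Student prediction: class with the largest linear output.
    y_pred = np.argmax(x_test @ w_student, axis=1)
    return float(np.mean(y_pred != y_test))


if __name__ == "__main__":
    # Sweep the sample complexity to trace an empirical learning curve.
    for alpha in (1.0, 2.0, 5.0, 10.0):
        err = simulate_ridge_square_loss(alpha=alpha)
        print(f"alpha = {alpha:5.1f}  test error = {err:.3f}")
```

Sweeping `alpha` traces an empirical learning curve of the kind the paper characterises analytically, and varying `lam` gives a crude view of how ridge regularisation affects the error. Replacing the closed-form solve with a cross-entropy minimiser would correspond to the other ERM loss mentioned in the abstract.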

Item Type:	Article
Subjects:	Archive Digital > Multidisciplinary
Depositing User:	Unnamed user with email support@archivedigit.com
Date Deposited:	15 Jul 2023 06:28
Last Modified:	06 Nov 2023 05:10
URI:	http://eprints.ditdo.in/id/eprint/1342