Information-theoretic generalization bounds for meta-learning and applications

Sharu Theresa Jose; Osvaldo Simeone

doi:10.3390/e23010126

Corresponding author: Sharu Theresa Jose

Computer Science

Research output: Contribution to journal › Article › peer-review

Abstract

Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving the sample efficiency for new, previously unobserved, tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model-agnostic meta-learning (MAML), or joint within-task training and test sets, like Reptile. Extending existing work on conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and the corresponding data set to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.
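
For orientation, the conventional-learning result that the abstract describes as being extended is presumably of the following well-known form (Xu and Raginsky, 2017). The notation here is an assumption made for illustration and is not taken from this record: S is a training set of n i.i.d. samples, W is the output of the learning algorithm, L_P and L_S are the population and empirical losses, and the loss is assumed to be sigma-sub-Gaussian.

```latex
% Assumed notation (not from the record): S = training set of n i.i.d. samples,
% W = learner output, L_P = population loss, L_S = empirical loss on S,
% loss assumed \sigma-sub-Gaussian.
\left| \mathbb{E}\big[ L_P(W) - L_S(W) \big] \right|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)}
```

According to the abstract, the meta-learning bounds play the same role for the meta-generalization gap, with the MI taken between the meta-learner's output and the meta-training data, plus a per-task MI term when the within-task training and test sets are joint. The exact constants and normalization are given in the paper and are not reproduced here.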

Original language	English
Article number	126
Number of pages	28
Journal	Entropy
Volume	23
Issue number	1
DOIs	https://doi.org/10.3390/e23010126
Publication status	Published - 19 Jan 2021

Keywords

meta-learning
generalization bounds
mutual information
noisy iterative algorithms

Access to Document

10.3390/e23010126 (Licence: Creative Commons: Attribution (CC BY))

JoseS2021Information: Final published version, 775 KB (Licence: Creative Commons: Attribution (CC BY))

Cite this

@article{df23fed158a54d37bbc29dc4c6aec7dc,
  title = "Information-theoretic generalization bounds for meta-learning and applications",
  abstract = "Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving the sample efficiency for new, previously unobserved, tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model agnostic meta-learning (MAML), or joint within-task training and test sets, like reptile. Extending the existing work for conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and corresponding data set to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.",
  keywords = "meta-learning, generalization bounds, mutual information, noisy iterative algorithms",
  author = "Jose, {Sharu Theresa} and Osvaldo Simeone",
  year = "2021",
  month = jan,
  day = "19",
  doi = "10.3390/e23010126",
  language = "English",
  volume = "23",
  journal = "Entropy",
  issn = "1099-4300",
  publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
  number = "1",
}

TY  - JOUR
T1  - Information-theoretic generalization bounds for meta-learning and applications
AU  - Jose, Sharu Theresa
AU  - Simeone, Osvaldo
PY  - 2021/1/19
Y1  - 2021/1/19
N2  - Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving the sample efficiency for new, previously unobserved, tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model agnostic meta-learning (MAML), or joint within-task training and test sets, like reptile. Extending the existing work for conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and corresponding data set to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.
AB  - Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving the sample efficiency for new, previously unobserved, tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model agnostic meta-learning (MAML), or joint within-task training and test sets, like reptile. Extending the existing work for conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and corresponding data set to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.
KW  - meta-learning
KW  - generalization bounds
KW  - mutual information
KW  - noisy iterative algorithms
U2  - 10.3390/e23010126
DO  - 10.3390/e23010126
M3  - Article
SN  - 1099-4300
VL  - 23
JO  - Entropy
JF  - Entropy
IS  - 1
M1  - 126
ER  -
