Noise-efficient learning of differentially private partitioning machine ensembles

Zhanliang Huang*, Yunwen Lei, Ata Kaban

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Differentially private decision tree algorithms have been popular since the introduction of differential privacy. While many private tree-based algorithms have been proposed for supervised learning tasks, such as classification, very few extend naturally to the semi-supervised setting. In this paper, we present a framework that takes advantage of unlabelled data to reduce the noise requirement in differentially private decision forests and improves their predictive performance. The main ingredients in our approach consist of a median splitting criterion that creates balanced leaves, a geometric privacy budget allocation technique, and a random sampling technique to compute the private splitting-point accurately. While similar ideas existed in isolation, their combination is new, and has several advantages: (1) The semi-supervised mode of operation comes for free. (2) Our framework is applicable in two different privacy settings: when label-privacy is required, and when privacy of the features is also required. (3) Empirical evidence on 18 UCI data sets and 3 synthetic data sets demonstrates that our algorithm achieves high utility performance compared to the current state of the art in both supervised and semi-supervised classification problems.
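Two of the ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the growth ratio, the direction of the allocation (more budget at deeper levels), and the use of the exponential mechanism with rank-based utility are all assumptions chosen for illustration.

```python
import math
import random


def geometric_budget(total_eps, depth, ratio=2.0):
    """Split total_eps over `depth` tree levels in geometric proportion.

    Assumption: level i receives a share proportional to ratio**i, so
    deeper levels get more budget. The shares always sum to total_eps.
    """
    weights = [ratio ** i for i in range(depth)]
    norm = sum(weights)
    return [total_eps * w / norm for w in weights]


def private_median(values, lo, hi, eps, rng=random):
    """Sample an approximate median of `values` within [lo, hi].

    Assumption: exponential mechanism with utility -|rank - n/2| and
    sensitivity taken to be 1; this is a generic textbook construction.
    """
    if hi <= lo:
        return lo
    xs = sorted(min(max(v, lo), hi) for v in values)  # clip to the domain
    pts = [lo] + xs + [hi]
    n = len(xs)
    log_w = []
    for i in range(n + 1):
        length = pts[i + 1] - pts[i]
        if length <= 0:
            log_w.append(float("-inf"))  # empty interval, never selected
        else:
            utility = -abs(i - n / 2)
            log_w.append(math.log(length) + eps * utility / 2)
    top = max(log_w)
    weights = [math.exp(w - top) for w in log_w]  # stabilised weights
    i = rng.choices(range(n + 1), weights=weights)[0]
    return rng.uniform(pts[i], pts[i + 1])


if __name__ == "__main__":
    per_level = geometric_budget(1.0, depth=4)
    split = private_median([1, 2, 3, 4, 5, 6, 7, 8, 9], 0.0, 10.0, per_level[0])
    print(per_level, split)
```

A median split sends roughly half of the points to each child, which is what keeps the leaves balanced; the per-level budgets returned by `geometric_budget` would be spent on one such split per level.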
Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases
Subtitle of host publication: European Conference, ECML PKDD 2022, Grenoble, France, September 19–23, 2022, Proceedings, Part IV
Editors: Massih-Reza Amini, Stéphane Canu, Asja Fischer, Tias Guns, Petra Kralj Novak, Grigorios Tsoumakas
Place of Publication: Cham
Publisher: Springer
Pages: 587–603
Number of pages: 17
Edition: 1
ISBN (Electronic): 9783031264122
ISBN (Print): 9783031264115
Publication status: Published - 17 Mar 2023
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - Grenoble, France
Duration: 19 Sept 2022 – 23 Sept 2022
https://2022.ecmlpkdd.org/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 13716
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
Abbreviated title: ECML PKDD 2022
Country/Territory: France
City: Grenoble
Period: 19/09/22 – 23/09/22

Keywords

  • Differential privacy
  • Noise reduction
  • Ensembles
