Scalable Privacy-Preserving Distributed Extremely Randomized Trees For Structured Data With Multiple Colluding Parties
Amin Aminifar, Fazle Rabbi, Yngve Lamo
SPS
Length: 00:14:54
Today, in many real-world applications of machine learning, the data is stored across multiple sources rather than in one central repository. In many such scenarios, the raw data cannot be transferred to a central site for analysis, either because of privacy concerns and legal obligations (e.g., for medical data) or because of communication and computation overhead (e.g., for large-scale data). New machine learning approaches have therefore been proposed for learning from distributed data in such settings. In this paper, we extend the distributed Extremely Randomized Trees (ERT) approach with respect to privacy and scalability. First, we extend distributed ERT to be resilient, in a scalable fashion, to a given number of colluding parties. Then, we extend distributed ERT to improve its scalability without any major loss in classification performance. We refer to our proposed approach as k-PPD-ERT, or Privacy-Preserving Distributed Extremely Randomized Trees with $k$ colluding parties.
Chairs:
Zekeriya Erkin