By SecureWorld News Team
Mon | Apr 2, 2018 | 7:34 AM PDT

The cloud has made a lot of things possible that simply were not possible before.

But at least some of those things make privacy watchdogs squirm.

One example of this is Machine Learning as a Service (MLaaS). You have raw data and you want to turn it into predictive analytics. Here comes the cloud to the rescue:

"Major cloud operators offer machine learning (ML) as a service, enabling customers who have the data but not ML expertise or infrastructure to train predictive models on this data. Existing ML-as-a-service platforms require users to reveal all training data to the service operator," says a new research paper by the University of Texas at Austin and Cornell University.

Revealing all your training data means dropping the shroud of privacy around information that could be crucial to your competitive advantage. Or to your customers' privacy.

Are you okay handing this over to a third party who, realistically, may be handing it over to yet another party?

Many companies and practitioners in InfoSec are not. 

New model for privacy and machine learning

Now researchers have developed a model for what they call "Privacy-preserving Machine Learning as a Service."

"We present Chiron, a system that enables data holders to train ML models on an outsourced service without revealing their training data. The service provider is free to choose the type of the model to train, how to configure and train it, and what transformations, if any, to apply to the inputs into the model. These choices can adaptively depend on the user’s data and ML task. The user obtains API access to the trained model but no other information about it."

The Chiron model maintains privacy by combining a sandbox with an ML toolchain, so the service's training code can work on the data without being able to leak it.
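To make the user-facing contract the researchers describe a little more concrete, here is a minimal, hypothetical sketch of what interacting with such a service could look like from the data holder's side: submit protected training data, then query the resulting model only through an API, never seeing the model itself. The endpoint paths, function names, and payload fields below are invented for illustration; the article does not spell out Chiron's actual interface or its protection mechanism beyond the sandbox-plus-toolchain design, so treat this purely as a sketch of the idea.

```python
import requests  # assumed HTTP client; the real Chiron interface is not specified in the article

# Hypothetical service URL for illustration only.
SERVICE_URL = "https://mlaas.example.com"


def submit_training_job(protected_training_data: bytes, job_config: dict) -> str:
    """Hand the service a payload the operator cannot read in plaintext.

    In a Chiron-style design, the data would only be processed inside an
    isolated, sandboxed environment on the provider's machines.
    """
    response = requests.post(
        f"{SERVICE_URL}/jobs",
        files={"data": protected_training_data},
        data={"config": str(job_config)},
        timeout=60,
    )
    response.raise_for_status()
    # The caller gets back an opaque handle; the trained model is never exposed.
    return response.json()["job_id"]


def predict(job_id: str, features: list[float]) -> dict:
    """Query the trained model through the only interface the user gets: an API."""
    response = requests.post(
        f"{SERVICE_URL}/jobs/{job_id}/predict",
        json={"features": features},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()  # e.g. a predicted label and score
```

The point of the sketch is the asymmetry: the provider chooses and configures the model, the user only ever sees prediction results, and the raw training data never sits in the operator's hands in readable form.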

Researchers trust no one to protect privacy

The University of Texas at Austin and Cornell University researchers who developed this new model say they approached the problem as if no part of the MLaaS chain were secure.

"The attacker could be the machine’s owner and operator, a curious or even malicious administrator, or an invader who has taken control of the OS and/or hypervisor. The attacker may own a virtual machine (VM) physically co-located with the VM being attacked or she could even be a malicious OS developer and add functionality that directly records user input. Therefore, Chiron aims to prevent the untrusted code used during training from exfiltrating secrets about the training data to the underlying platform."

You can read the full research paper on the privacy-preserving model, which runs 12 pages plus citations.

It details the specific parameters and costs of operating this way, and spells out exactly what the research team's threat model covers.

If this model plays out in the real world, it could be a case of having your cake and eating it too.

And we could all use more of that, right?

For more on this topic, read Rebecca Herold's article, "10 Big Data Analytics Privacy Problems."
