Machine learning is gradually becoming a valuable tool that augments research and discovery in many industries, including healthcare. To be effective, machine learning models require massive amounts of unbiased, diverse, and easily accessible data.
Too often, however, datasets remain confined to silos within their respective healthcare organizations because of privacy concerns, preventing the insights that collaboration could unlock. In healthcare, the prospect of sharing data for machine learning is made even more challenging by strict regulations on patient privacy.
Understanding Federated Learning
Federated learning is designed to enable distributed machine learning while addressing issues of data governance and privacy. With federated learning, model development is decentralized: each participating healthcare institution trains its own copy of the model on its own proprietary data. The individual models are then aggregated to produce a global model. Federated learning thus allows healthcare institutions to safely train shared models on their combined private data without ever exchanging that data with one another.
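The aggregation step described above is often implemented as federated averaging (FedAvg): each site trains locally, then a coordinator computes a dataset-size-weighted average of the parameters. A minimal sketch in Python, using a toy linear model and synthetic data; the function names and all hyperparameters here are illustrative assumptions, not any particular project's actual implementation:

```python
import numpy as np

def local_train(weights, data, labels, lr=0.1, epochs=5):
    """One institution's local training: a few gradient-descent steps
    on a toy linear model, using only that institution's private data."""
    w = weights.copy()
    for _ in range(epochs):
        preds = data @ w
        grad = data.T @ (preds - labels) / len(labels)
        w -= lr * grad
    return w

def federated_average(local_weights, sizes):
    """Aggregate local models into a global model, weighting each
    institution's parameters by its dataset size (FedAvg)."""
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(local_weights, sizes))

# Toy demo: three "hospitals", each holding data that never leaves its site.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
datasets = []
for n in (40, 60, 100):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + rng.normal(scale=0.01, size=n)
    datasets.append((X, y))

# Each round: every site trains a copy of the global model locally,
# then only the resulting parameters are shared and averaged.
global_w = np.zeros(2)
for _ in range(20):
    local_models = [local_train(global_w, X, y) for X, y in datasets]
    global_w = federated_average(local_models, [len(y) for _, y in datasets])
```

After enough rounds, the global model approaches what centralized training on the pooled data would produce, yet each institution's raw `(X, y)` pairs never leave its site; only weight vectors are exchanged.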
Research has shown that this approach can produce models comparable to those trained on a single, centrally hosted set of raw data. Federated learning therefore holds promise for developing machine learning models that unlock granular insights, such as on effective treatments, while preserving patient privacy and regulatory compliance.
Current Uses of Federated Learning in Healthcare
Federated learning is already being used in healthcare for a wide range of applications. In medical imaging and analysis, federated learning is being applied for whole-brain and tumor segmentation. Federated learning has also enabled models in the ABIDE project to operate on sensitive fMRI imaging data for the identification of disease biomarkers.
Several efforts are underway to amplify this approach by connecting multiple healthcare institutions. In France, the HealthChain project is building a large federated learning network spanning four hospitals, with the goal of predicting treatment responses for melanoma and breast cancer patients. By analyzing dermoscopy images and histology slides, federated models can provide oncologists with additional information to determine the most effective course of treatment for individual patients.
Federated learning is also being applied in industrial healthcare research, often bringing competing companies into collaboration. The Melloddy project is establishing a federated learning framework across datasets from 10 different pharmaceutical companies. The goal is to create a shared predictive model that can infer how proteins will bind to chemical compounds. This promises to optimize drug discovery at each participating company without sacrificing extremely valuable in-house data.
As a valuable tool enabling data privacy protection, federated learning has massive implications for patients. If a federated learning framework could be established on a broad international scale, it would support high-quality clinical decisions regardless of location. Patients located in developing countries and remote areas would have access to the same high level of healthcare decision-making as patients in world-renowned hospitals. Federated learning could also help doctors address rare diseases and combat emerging viruses before they become global pandemics.
Concurrently, the ability to ensure patient privacy lowers the barrier to participation in clinical trials. While federated learning holds much promise for healthcare applications, it is also important to understand its limitations.
Limitations of Federated Learning
While federated learning has many benefits, it does not address every privacy issue. Some privacy risk is inherent in the use of machine learning algorithms, which may involve trade-offs between privacy and performance. And while federated learning offers a level of privacy that standard machine learning workflows do not, the need for complete protection of personally identifiable information means that residual privacy risk must still be carefully evaluated.
With federated learning, participants never provide direct access to their raw data; they exchange only model parameters for aggregation. However, machine learning models can memorize details of their training data, and shared parameters can leak information about it. Participants therefore still need to apply additional privacy-enhancing measures.
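One common such measure is to clip and noise each model update before it is shared, in the style of differential privacy, so that no single training record can dominate the shared parameters. A hypothetical sketch; the `privatize` helper and the `clip_norm` and `noise_scale` values are illustrative assumptions, not TripleBlind's or any specific project's method:

```python
import numpy as np

def privatize(weights, clip_norm=1.0, noise_scale=0.1, rng=None):
    """Clip the update's norm, then add Gaussian noise before it is
    shared for aggregation; both steps limit what the parameters can
    reveal about any individual training record."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(weights)
    clipped = weights * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(scale=noise_scale, size=weights.shape)

# A local update is privatized before it leaves the institution.
local_update = np.array([0.8, -0.3, 1.5])
shared_update = privatize(local_update, rng=np.random.default_rng(42))
```

Stronger noise yields stronger privacy but degrades model accuracy, which is exactly the privacy-performance trade-off noted above.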
When organizations enter into a federated learning framework, they must agree to the scope and goals of the project, as well as the technologies to use. If a participant were to vary from these agreements, the entire project could be compromised.
For obvious reasons, organizations usually aim to partner with others they deem trustworthy. Sometimes, however, organizations join large-scale federated learning systems. These large initiatives can be more vulnerable to bad actors who may intentionally degrade performance or try to extract sensitive information from other participants. Organizations joining such systems must therefore have a security strategy in place to mitigate those risks.
The TripleBlind Solution Facilitates Collaboration between Healthcare Organizations
TripleBlind’s novel privacy-enhancing technology can supplement the privacy afforded by federated learning, ensuring compliance with HIPAA and the secure operationalization of data.
The TripleBlind solution addresses many of the inefficiencies and inherent vulnerabilities of federated learning and can also be used as a stand-alone privacy technology for training machine learning models. For example, TripleBlind's distributed learning algorithm incorporates a specialized privacy function that mitigates membership inference attacks while reducing client-side computational requirements by more than 60%. TripleBlind's innovations build on well-understood principles, including federated learning and multi-party computation, but radically improve the practical use of privacy-preserving technologies through faster processing and scalability. TripleBlind requires fewer computational resources and prevents a partial network from inadvertently passing private data into a composite model. In short, TripleBlind helps healthcare organizations more easily unlock the intellectual property value of their data through collaboration.
To learn more about our scalable privacy-enhancing technology, schedule a demo today.