
Data Privacy Regulations: The Top 15 Acronyms to Know

Every industry is laden with acronyms, but if you work in finance or healthcare, you’ve probably noticed a new flood of acronym-filled privacy regulations in recent years! 

To help you navigate the “alphabet soup” of privacy acts, we’ve created this list of data privacy regulation acronyms with brief explanations of what they mean for you.

Since the US does not currently have an all-encompassing law to handle data privacy for all situations, there’s a patchwork of different laws and standards depending on sectors and circumstances. These are usually represented by acronyms and can get confusing rather quickly. In addition, we have regulations like the GDPR that aren’t actually US laws but EU law, adding international considerations to the mix.

A growing number of states (such as California and Colorado) have passed state laws to protect their own residents. Since it’s difficult for a typical organization to accurately determine whether a consumer (or website visitor) is a resident of one of these states, most organizations find it’s in their best interest to err on the side of caution and treat all consumers with the same protections. This way, as federal laws catch up with state and international laws, you have a significant head start to stay compliant.

To understand what this looks like, let’s start our acronym definitions with California’s privacy laws:

CCPA — California Consumer Privacy Act

The CCPA gives California residents data privacy rights and protections, including (1) knowledge of the personal information collected about them and how it’s shared, (2) the right to delete such information, (3) the right to opt out of the sale of such information, and (4) the right to non-discrimination as a result of exercising these rights. California’s privacy regulations are considered some of the strongest in the US, as they give residents some ability to sue a company for certain types of data breaches.

The California Consumer Privacy Act (CCPA) took effect on Jan. 1, 2020. For a more detailed explanation, see our CCPA compliance guide.

CPRA (or “CCPA 2.0”) — California Privacy Rights Act

The CPRA significantly amends and expands the CCPA, so it is sometimes referred to as “CCPA 2.0.” It adds two additional consumer rights to the CCPA, which are the right to correct inaccurate personal information, and the right to limit the usage and disclosure of sensitive personal information (SPI). This privacy protection includes everything from social security numbers and ethnicity to genetic data and geolocation.

The CPRA took effect in December 2020, with some provisions not becoming fully active until Jan. 1, 2023.

CPA / ColoPA — Colorado Privacy Act

Going into effect in July 2023, the CPA gives Colorado residents increased privacy rights, such as the right to access, correct, and delete their personal data. It also lets people opt out of the processing of their data for targeted advertising, the sale of their personal data, and profiling. It has a lot in common with California’s CCPA and Virginia’s VCDPA, but in some ways it is stricter because it requires companies to obtain prior, affirmative consent from the consumer in order to process sensitive data.

COPPA — The Children’s Online Privacy Protection Act

COPPA is a 1998 Federal law (with an updated rule effective in 2013) that gives parents control over what information websites can collect from children under the age of 13, and sets rules for commercial websites and online services directed to such children, including mobile apps. This act carries many of the same protections you see with state laws, but puts those data controls in the hands of parents.

ECPA — Electronic Communications Privacy Act

The ECPA is a 1986 law that updated the Federal Wiretap Act of 1968, which had addressed interception of telephone conversations, but did not apply to computers and other electronic/digital communications. It now protects wire, oral, and electronic communications while those communications (1) are being made, (2) are in transit, and (3) when they are stored on computers.

FCRA — Fair Credit Reporting Act

The FCRA is a 1970 act for financial data regulation. It protects information collected by consumer reporting agencies such as credit bureaus, medical information companies and tenant screening services. The FCRA also limits who is allowed to see a credit report, what the credit bureaus are allowed to collect, and how that information is obtained. Information in a consumer report cannot be provided to anyone who does not have a purpose specified within the regulation.

FERPA — Family Educational Rights and Privacy Act

FERPA is a Federal law that protects the privacy of student education records. It only applies to schools that receive funds under certain programs of the US Department of Education.

FTC Act — Federal Trade Commission Act

The FTC Act gives the Federal Trade Commission wide jurisdiction over commercial entities to prevent unfair or deceptive trade practices. For instance, it empowers the FTC to go after an app or website that violates its own privacy policy. The FTC can also investigate violations of marketing language when it relates to privacy (for instance, selling user data despite claiming you don’t). There is also a growing effort to get the FTC to expand this power to better address abusive data practices.

GDPR — General Data Protection Regulation

Despite its widespread impact on US businesses and consumers, the GDPR is not an American regulation, but a European one. It is one of the most comprehensive data protection regulations in the world, covering personal data protections, data usage restrictions, privacy standards, and much more.

With applications for nations inside and outside of the EU, any organization seeking to work with European companies or users must comply with requirements set forth by the GDPR. To ensure your organization is up to this standard, you can review our GDPR compliance guide.

GLBA — Gramm-Leach-Bliley Act

The GLBA requires companies offering consumer financial products (e.g., loan services or investment-advice services) to explain how they share data, and it enables customers to opt out of this sharing. The law doesn’t restrict how companies use the data they collect, however, as long as they disclose such usage beforehand. The GLBA also takes some steps to encourage the security of personal data.

HITECH — Health Information Technology for Economic and Clinical Health Act

The HITECH Act of 2009 provides financial incentives for organizations to adopt electronic health records (EHRs) and aims to improve privacy and security protections for healthcare data. Part of this regulation seeks to address the privacy and security concerns associated with the transmission of such health data. As a result, there are several provisions to strengthen the enforcement of the HIPAA rules (see HIPAA below), such as increased penalties for HIPAA violations. Check out our HIPAA & HITECH compliance guide for more information.

HIPAA — Health Insurance Portability and Accountability Act

HIPAA is a healthcare data privacy law from 1996. As a Federal law, it created national standards to protect sensitive patient health information from being disclosed without a patient’s consent or knowledge. It includes a Privacy Rule to give patients control over who can see their medical records (and how providers can use the data), and a Security Rule to maximize safeguards for transmitting patient health information. Additionally, HIPAA includes a Breach Notification Rule to improve transparency when there is a breach. 

To make sure your organization is complying with HIPAA standards, you can read our HIPAA & HITECH compliance guide.

NY SHIELD Act — Stop Hacks and Improve Electronic Data Security Act

The SHIELD Act is a 2019 law that created more data security requirements for companies that collect information on New York residents. While smaller in scope than other state privacy laws, SHIELD protects New York residents’ private information by requiring organizations to “develop, implement, and maintain reasonable safeguards to protect the security, confidentiality, and integrity of the private information.”

UCPA — Utah Consumer Privacy Act 

Drawing heavily from California, Virginia, and Colorado, the UCPA applies to a smaller subset of businesses (those making over $25 million in annual revenue) and offers broader exceptions. It gives consumers rights such as deleting their data, getting a copy of their data, and opting out of having their personal data processed (or used for advertising). In contrast to the VCDPA and CPA, the UCPA does not include the right to opt out of profiling.

Financial institutions governed by the Gramm-Leach-Bliley Act and information in the Fair Credit Reporting Act aren’t subject to the UCPA, which goes into effect at the end of 2023.

VCDPA (or simply CDPA) — Virginia Consumer Data Protection Act

Similar to California and Colorado regulations and the GDPR, the VCDPA gives Virginia consumers certain rights over their data and sets rules for certain companies on the data they collect, including what they collect, how it’s treated, how it’s protected, and whom they can share it with. The CDPA requires companies to help consumers exercise these rights, such as by obtaining opt-in consent before processing sensitive data, and providing a clear privacy notice that gives consumers a means to opt out of targeted advertising.

The VCDPA goes into effect Jan. 1, 2023.


Note: at the time of this writing, other bills are being introduced around the country, in states such as Connecticut, Hawaii, Massachusetts, Minnesota, Oklahoma, and Wisconsin. Most state laws seem to be following California’s standards.

As regulations increase on new technologies, industry hype takes a dive. This trend is no different for healthcare data, analytics, or AI. Learn how to strategically navigate talent and governance barriers in healthcare in our #1 most popular whitepaper from Gartner.


Balancing Data Confidentiality with Utility

In recent years, we’ve seen a massive flood of data in data-driven fields like finance and healthcare. This has led to innovations in business models, AI, and data collaboration. However, data confidentiality can complicate how useful this data can actually be.

So, how do organizations balance the business utility of their data with the need for privacy and data protection? 

In this webinar, Justin Lam and Chris Barnett discuss ways enterprises can move both their data confidentiality and utility forward together. Justin Lam is a data security research analyst at 451 Research, a part of S&P Global Market Intelligence. Chris Barnett is TripleBlind’s VP of Marketing and Partnerships and oversees all go-to-market strategies for TripleBlind AI.

The state of data usability and privacy in business

In many cases, enterprises and their partners are seeking to analyze or monetize non-public or personally identifiable information. However, this tends to be at odds with a growing body of privacy regulations and intellectual property concerns.

The trend: more data-driven companies

With the benefits of faster preparation, analysis, and decision making, more companies are adopting a data-driven approach in today’s market. Decisions can be made with greater confidence, and mistakes can be addressed faster when the data doesn’t support them.

Bar Graph: The most significant benefits of being more data-driven


Market intelligence research by 451 found that 64% of respondents said “most” to “nearly all” of their strategic decisions were data-driven. That said, some organizations may not need to become more data-driven (some business models simply don’t require it), while others are in a position where they can’t make the shift (due to current processes, budgets, low levels of data, etc.).

Second-order trend: more overlap of internal teams

With this central role data now plays in fields like finance and healthcare, internal processes must adapt as well. 

One adaptation is that data security teams, dev & product teams, and GRC (governance, risk management, and compliance) are working together more frequently, and more meaningfully. It’s become increasingly important that these teams can understand each other’s functions and how to account for each other’s priorities in their own work.

This introduces an important set of questions, such as:

  • Can security teams account for the development of AI/ML models?
  • Can researchers and developers proactively incorporate security into their technology stacks?
  • Can they translate the technical controls they’ve put in place (such as data encryption and data classification) to satisfy legal requirements?
  • Can legal teams understand the technical controls in place and translate them?

Current challenges to data utility

While many organizations understand the benefits of being more data-centric, there are still some major obstacles.

The main concerns we hear from our customers center around:

  • Budget limitations
  • Integrating existing technology
  • Security concerns
  • Privacy/governance
  • Not enough skilled personnel

Of these, privacy and governance are the most frequently mentioned. Managing legal frameworks can drastically limit data utility if you don’t have the right strategy or technology in place.

A 451 study found that among companies that identify as data-driven, “data utility” (quality and consistency) was one of the top barriers to better data practices (the second was security, and the third was privacy).

Bar Chart: Barriers to becoming more data driven, Drivers vs. Drifters


The future of these emerging trends

Innovation and experimentation will continue, but many projects will continue to get abandoned due to issues like those mentioned above.

There are a number of reasons for this abandonment.

First, many businesses launch AI/ML projects without fully understanding such obstacles, and 39% of these projects get abandoned for this reason. More than half of organizations unable to get access to the required data will ultimately abandon all relevant projects.

Additionally, environmental, social, and governance (ESG) aspects of AI/ML are becoming increasingly relevant. This is especially due to the environmental impact of AI/ML (such as the processing required to train models), immoral or unethical usage (from privacy issues to discrimination), and nascent government regulation in the field, where lawmakers and businesses alike are still trying to figure out the best way forward.

ESG issues around AI/ML have hit the mainstream in the last few years (especially in 2022), prompting greater concerns about their impact (451’s research found 74% of people are now “somewhat” to “very” concerned about the privacy of their data online).

Businesses are prioritizing AI/ML initiatives in greater numbers, forcing more companies to get involved if they want to stay competitive. They have far more data than before, so they want to know how to monetize it for themselves, their stakeholders, and their customers. 451’s survey found that 82% of businesses say data marketplaces (to buy and sell data) will likely be among their top 5 priorities in the coming three years.

What’s next: how can organizations respond?

To balance data utility with confidentiality, businesses can turn to privacy-enhancing (or privacy-preserving) technologies, known as PETs. There are four key areas where TripleBlind’s PET can help: data aggregation, invention of new uses, distribution of models and data, and model verification.


Data aggregation:

Data is often siloed, typically for legal reasons to protect privacy. Instead of physically aggregating data, TripleBlind allows you to “logically” aggregate it, which reduces storage and transmission costs and mitigates regulatory burdens. You can then leverage all siloed data sets, without the hassles and expenses of doing so via traditional methods.
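The aggregation idea can be illustrated generically. In the sketch below (plain Python with hypothetical data; this is a conceptual illustration of logical aggregation, not TripleBlind’s actual API), each data holder computes only summary statistics locally, and the raw records never leave their silo:

```python
# Generic illustration of "logical" aggregation: each silo answers a query
# locally and only aggregate statistics leave the silo -- the raw records
# are never pooled in one place.

def local_stats(records):
    """Each data holder computes sufficient statistics on its own data."""
    return {"n": len(records), "sum": sum(records)}

def combined_mean(stats_list):
    """Combine per-silo statistics into a global mean without raw data."""
    total_n = sum(s["n"] for s in stats_list)
    total_sum = sum(s["sum"] for s in stats_list)
    return total_sum / total_n

# Hypothetical readings held by two separate organizations
hospital_a = [120, 135, 128]
hospital_b = [140, 118, 125, 132]

stats = [local_stats(hospital_a), local_stats(hospital_b)]
print(combined_mean(stats))  # global mean computed from aggregates only
```

The same pattern extends to other decomposable statistics (counts, sums of squares for variance), which is what makes cross-silo analysis possible without physically moving the underlying data.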

Invention of uses:

With so much new data, companies can now find novel ways to get value from that data. PET helps improve the efficiency of this process, as well as its extensibility, allowing companies to get more use out of the data by securely leveraging a more valuable pool of data sets. TripleBlind also provides exploratory data analysis and GUI-based search to discover interesting data sets.

Distribution of models/data

Enterprises want to be able to deploy AI models and other algorithms for use by others, without exposing the underlying IP. Thankfully, sharing data doesn’t require you to sacrifice privacy, and PET makes it much easier: you can use TripleBlind’s router to deliver algorithm results to any global location, with technology in place to protect critical IP and patient data. This allows you to skip the impasse and legal hurdles of sharing models and data, without all the risks.

For monetization and monitoring purposes, TripleBlind still provides you with a “backend” audit trail to track usage and bill customers.

Verification of models

Typically, models are trained on specific populations or datasets, and it’s difficult to determine a model’s relevance to the broader population due to issues like data bias. For instance, a model built on blood test data from an exclusively American sample group generally can’t be extrapolated to an Indian population, so it would need verification to determine its limits and biases.
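One common way to check whether training data still resembles the population a model is being applied to is a distribution-drift statistic. The sketch below (pure Python with hypothetical numbers; a generic illustration, not TripleBlind’s actual tooling) computes the population stability index (PSI) between a training sample and two candidate populations:

```python
import math

def psi(expected, actual, bins=4):
    """Population Stability Index between two numeric samples.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 major drift."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]
    edges[-1] = float("inf")  # catch values above the training range

    def fractions(sample):
        counts = [0] * bins
        for x in sample:
            for i in range(bins):
                if edges[i] <= x < edges[i + 1]:
                    counts[i] += 1
                    break
        # small floor avoids log(0) for empty bins
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

train = [100, 110, 120, 130, 140, 150, 160, 170]   # hypothetical training values
new   = [100, 112, 118, 131, 142, 149, 158, 169]   # similar population
shift = [160, 170, 180, 190, 200, 210, 220, 230]   # shifted population

print(round(psi(train, new), 3))    # small: distributions match
print(round(psi(train, shift), 3))  # large: model needs re-verification
```

A large PSI on a key feature is a signal that the model should be re-validated before being applied to the new population.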

TripleBlind’s statistical tools let you detect model drift or bias, so you can validate your models and use them on different populations and countries.

For more information on TripleBlind, check out our whitepaper on the underutilization of data.

Or learn about our recent survey on what CDOs are thinking about data privacy.


TripleBlind Appoints Sean Cardenas as Vice President of Sales

Technology Veteran To Lead Customer Acquisition Strategy As Demand for Data Privacy Grows

KANSAS CITY, Mo., Oct. 27, 2022 – TripleBlind, the leader in privacy-enhancing computation (PEC), today announced the addition of Sean Cardenas to the executive team as Vice President of Sales. Cardenas will be responsible for leading and scaling the sales function as the company moves into its next stage of growth. With nearly three decades of industry experience, including having scaled multiple organizations through IPO, Cardenas will be pivotal to accelerating revenue, driving customer acquisition, expanding into new markets and geographies, and delivering on business strategies. 

“Data continues to be one of the most valuable and untapped assets within an organization. When we met Sean, he immediately understood our vision and mission and is a believer in the importance of maintaining privacy while garnering important insights from data,” said Riddhiman Das, CEO and co-founder of TripleBlind. “We’re excited to welcome Sean to the team and support him as he works to get our technology in front of organizations across the globe.”

Cardenas joins TripleBlind from Incopro, where he led global sales and was responsible for scaling North American operations. He previously held senior sales roles at Nutanix, Plethora and Data Domain among other technology businesses from start-ups to global corporations. 

“TripleBlind’s technology is impacting lives every day, from healthcare organizations providing better diagnostics, to financial institutions helping protect customer data,” said Sean Cardenas, Vice President of Sales of TripleBlind. “The TripleBlind team is impressive, their mission is meaningful and I’m excited to bring my expertise to the company as we work to help organizations get the most out of their data while maintaining the highest level of privacy.”



  • TripleBlind Product Introduction video 
  • TripleBlind’s Chief Data Officer Survey 
  • Follow TripleBlind on LinkedIn
  • Follow TripleBlind on Twitter

TripleBlind is growing quickly and looking for new team members. To learn more, please visit: https://tripleblind.com/careers/


About TripleBlind

TripleBlind has created the most complete and scalable solution for privacy enhancing computation by combining data and algorithms while preserving privacy neutrality and ensuring compliance with all known data privacy and data residency standards, such as HIPAA and GDPR. The TripleBlind solution is software-only and delivered via a simple API. It solves for a broad range of use cases, with current focus on healthcare and financial services. The company is backed by Accenture, General Catalyst and Mayo Clinic. To learn more, visit https://tripleblind.com/ or contact us via email here: contact@tripleblind.com.


Media Contact

Stephanie Schlegel
Offleash for TripleBlind



TripleBlind Announces Partnership with SEQSTER to Accelerate Data-Driven Patient Outcomes

Pharmaceutical companies and healthcare organizations will now have access to plug-and-play compatibility for clinical trials and beyond.

KANSAS CITY, Mo., Oct. 20, 2022 — TripleBlind, the leader in privacy-enhancing computation (PEC), today announced a technology and go-to-market partnership with SEQSTER PDM Inc. (“SEQSTER”), the leader in patient-centric healthcare data technology, to develop a privacy preserving end-to-end data solution. This new partnership will enable healthcare organizations and pharmaceutical companies to gain important insights from data and improve patient outcomes.

“SEQSTER has solved one of the biggest problems in healthcare: patient real-time access to health data, so why shouldn’t we do the same for healthcare organizations?” said Riddhiman Das, Co-founder and CEO of TripleBlind. “Together, TripleBlind and SEQSTER are now able to provide researchers and data scientists with access to incredibly valuable data that will spur innovation in the healthcare industry and, ultimately, save lives.”

Historically, healthcare data has been siloed and things like clinical trials have required manual data entry. As the healthcare industry continues to move towards digital transformation, health organizations increasingly need the ability to aggregate multiple, disparate data sources in real time to gain actionable insights for research and development. TripleBlind and SEQSTER’s privacy preserving end-to-end data platform reduces the costs associated with digital evolution in the healthcare industry, while more effectively and efficiently protecting the privacy of data.

SEQSTER automates that process and combines the data in one operating system that gives pharmaceutical researchers the ability to run studies that are more effective and efficient. With TripleBlind, the data is quickly and easily analyzed, resulting in actionable yet anonymized data that can impact everything from identifying off-label use cases to better patient interactions to improved processing of claims and beyond. This is all done with full regulatory compliance, including HIPAA.

“TripleBlind is the only company that is able to maintain the level of privacy needed to be HIPAA and GDPR compliant, while providing the ability to learn from data and this is exactly what our customers have been looking for,” said Ardy Arianpour, CEO of SEQSTER. 

“We’ve tackled the biggest issues in healthcare – data accessibility and data interoperability – yet there’s still a major opportunity to more efficiently use data from clinical trials to improve patient-reported outcomes. To say that I’m excited about our new partnership with TripleBlind is an understatement. The combination of these two technologies will be impactful for researchers, patients, and pharma enterprises at scale.”

To learn more about the partnership or begin using TripleBlind and SEQSTER, please visit: https://tripleblind.com/ or https://www.seqster.com/



TripleBlind is an exhibitor at HLTH 2022 from November 13-16, 2022 in Las Vegas, NV. To learn more, visit TripleBlind at booth #1120. 





About Seqster

Seqster is the leading healthcare technology company that breaks down health data silos at scale. Its enterprise operating system aggregates disparate health data sources into a single, 360-degree view of a patient in real time, solving a multitude of challenges for life sciences, patient engagement and data interoperability.

Seqster has nationwide coverage of EHRs from hospitals and medical groups, genomic DNA, wearables, pharmacy and social determinants of health data. Through its customizable white-label approach, Seqster provides accelerated access to de-identified, tokenized, real-time data and comprehensive curated data to address critical needs across the healthcare continuum.

Seqster is privately held and headquartered in San Diego. To learn more about the Seqster Operating System for Patient Registries, Clinical Studies and the Digital Front Door, please contact us at info@seqster.com or visit www.seqster.com.


Media Contact

Stephanie Schlegel
Offleash for TripleBlind




Data Centric Approach to Data Privacy – Data Centric vs. Model vs. Application

Artificial intelligence technology largely evolved with a focus on rules for the creation of models and the solutions they can provide. Data was assumed to be available for data scientists to use as needed. But AI models are only as good as the data used in them. Simply put, when it comes to machine learning, it’s “garbage in, garbage out.”

Rather than focusing on the model itself, an emerging approach called “data-centric AI” puts more emphasis on optimizing the data to make the technology more adaptable and scalable, while still retaining the ability to produce powerful results.

What is the Definition of a Data-Centric Approach to AI?

Famously advocated for by renowned computer scientist Andrew Ng, data-centric AI uses analytics and machine learning techniques to ensure that the data used to train a model is high quality, comprehensive, and refined for its purpose. The goal of a data-centric approach is to reach a high level of performance through good data processing first and foremost, minimizing the time spent refining the model itself.

Put very simply, a data-centric approach to AI includes the following steps:

  • Ensuring the appropriate data labels are used
  • Eliminating noise
  • Augmenting data
  • Engineering model features
  • Error analysis
  • Review from subject matter experts to ensure the accuracy of training data
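A couple of the steps above (label auditing and noise elimination) can be sketched in a few lines. The records, field names, and confidence threshold below are hypothetical, chosen only to illustrate the idea:

```python
# Minimal data-centric cleaning pass: audit labels against an approved
# set, then drop low-confidence annotations before any model training.

def audit_labels(records, allowed_labels):
    """Flag records whose label is not in the approved label set."""
    return [r for r in records if r["label"] not in allowed_labels]

def drop_noise(records, min_confidence=0.5):
    """Remove records whose annotation confidence is too low to trust."""
    return [r for r in records if r["confidence"] >= min_confidence]

records = [
    {"text": "chest x-ray, clear", "label": "normal",   "confidence": 0.95},
    {"text": "possible fracture",  "label": "abnrml",   "confidence": 0.90},  # typo label
    {"text": "blurry scan",        "label": "abnormal", "confidence": 0.20},  # noisy annotation
]

allowed = {"normal", "abnormal"}
bad_labels = audit_labels(records, allowed)          # flagged for expert review
clean = drop_noise([r for r in records if r not in bad_labels])

print(len(bad_labels), len(clean))  # 1 1
```

In practice the flagged records would go back to subject matter experts for correction rather than being discarded, which is exactly the review step the list above ends with.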

As organizations are realizing the value of adding data-centric AI to their operations, they are increasingly shifting their processes to prioritize access to higher quality, less biased data sets. And as available data sets continue to grow in size and complexity, organizations will prioritize data-centric AI even more heavily. 

Data-Centric vs. Model-Centric

Unlike data-centric AI, model-centric AI is focused on developing and refining models to boost system performance. This long-standing approach to AI tends to treat data as a static asset, and the development of models as the main driver of results that must be improved.

While useful, a model-centric approach to AI has challenges. For one, it often leads to the creation of different specialized models that each focus on distinct tasks. This ‘model creep’ can force organizations to manage many different AI systems and datasets. This balkanized approach to AI can also lead to higher data collection costs for disparate tasks and challenges.

Model-centric AI is also not well-suited to shifting conditions or new variables, as dealing with changes can result in significant redeployment delays. There are also challenges related to standardization, as different teams may use different methods while developing AI solutions. In these situations, adapting or scaling AI systems can be a massive undertaking.

Advocates for data-centric AI say these legacy challenges can be addressed by developing systems that are focused on improving data and structuring it in ways that support adaptability, versatility, scalability, and standardization.

Data-Centric vs. Application-Centric

The emergence of a data-centric approach to AI mirrors calls for a more data-centric approach to enterprise architecture. For decades, enterprise architecture has been driven by applications. The result has been a patchwork of approaches to handling data and a severe lack of interoperability.

An emerging approach to enterprise architecture is to be data-centric rather than app-centric. As with artificial intelligence, people are seeing the value in putting data first, rather than treating data as a means to an end.

Real World Approaches to Data-Centric AI: Manufacturing

Machine vision technologies powered by AI offer an effective way for manufacturers to identify defective parts in finished products.

One of the biggest challenges in developing this type of system for manufacturers is creating a consistent approach to data management. Without good data on the types of flaws or defects to look for, an AI system can’t properly perform inspections. This problem is more challenging than it seems because human experts can disagree on the ways to label image data, confounding an AI system.
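That disagreement can be quantified before training ever starts. A minimal sketch (pure Python with hypothetical labels) computes the percent agreement between two inspectors labeling the same set of part images; low agreement signals that the labeling guidelines need tightening before the data is used:

```python
def percent_agreement(labels_a, labels_b):
    """Fraction of items two annotators labeled identically."""
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

# Two inspectors labeling the same 8 part images (hypothetical data)
inspector_1 = ["ok", "scratch", "ok", "dent", "ok", "scratch", "ok", "ok"]
inspector_2 = ["ok", "scratch", "ok", "ok",   "ok", "dent",    "ok", "ok"]

print(percent_agreement(inspector_1, inspector_2))  # 0.75
```

The disagreements (items 4 and 6 here) are exactly the cases worth escalating to a senior expert so the training labels end up consistent.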

Further complicating matters is the fact that change is a constant in manufacturing. Supply chain issues may force a manufacturer to use slightly different parts. New product lines might be unveiled. Environmental changes such as different lighting or humidity can change throughout the day or the seasons. All of these changes can potentially confound an AI-powered machine vision system.

Since model-centric AI is focused on rules and solutions, this approach is not well suited to the amount of change commonly seen in manufacturing. The result of sticking with a machine vision model in the face of change can be high rejection rates, as the system has difficulty discriminating between acceptable variation and actual defects. Addressing this problem may require significant human intervention, leading to higher costs and production slowdowns. Developers may have to spend weeks or months consulting with quality control professionals and refining machine vision models based on those consultations.

Through a data-centric AI approach, quality control experts and developers wouldn’t have to wait until problems arise before collaborating. Collaboration during the development phase would help to clearly define data, build a model around that data, assess results, and optimize the model. The resulting data-centric model would minimize back-and-forth further down the road.

A data-centric approach would also be better positioned for scaling as the manufacturer adds production lines and new facilities. Standardized methods for collecting and processing data would facilitate future training, refining, and updating of new models.

Why You Should Use TripleBlind with Your Data-Centric Approach to AI

To take a data-centric approach to AI, you obviously need access to sufficient amounts of unbiased, accessible data – which has traditionally been quite hard to come by. Until now! 

Through the innovative TripleBlind Solution, our clients can access more data through secure collaborations. Our technology offers true scalability and faster processing compared to other competing technologies. TripleBlind also supports all data and algorithm types, while protecting both types of proprietary assets: Organizations can safely collaborate without worrying about loss of either their data or their algorithms.

If you would like to learn more about how our technology supports data-centric AI, check out our Blind AI Tools or download our Whitepaper. We remove common barriers to using high-quality data for artificial intelligence, solving key challenges AI professionals face with data access, bias, and prep. Through a combination of privacy-enhancing techniques, the TripleBlind Solution allows for training of new models on remote data –– without compromising the privacy or fidelity of sensitive data. Let us show you how by booking a live demo today.



The Challenges of Data-Centric AI & Why You Should Still Shift to it

Is it time for a paradigm-shift in artificial intelligence development?

Model-centric AI used to be the pinnacle of machine learning, with a focus on modifying model architecture to increase meaningful output. Like designing a more powerful car engine or optimizing rotor blades on a wind turbine, model-centric AI prioritizes structural changes to increase efficiency and improve data outcomes.

However, modifications to model architecture can only go so far. Any AI solution requires two key components: clear code (the model) and clean data. Much like how you wouldn’t put the lowest-grade fuel into a high-performance Ferrari, machine learning models require high-quality data to execute operations quickly and accurately. Without it, no AI solution can truly reach its maximum potential.

After years of intense focus on improving artificial intelligence models, the AI industry is rapidly changing lanes. If a machine learning model is only as effective as the data used to train it, then it’s time for a shift to data-centric AI. However, change can be difficult, and given the relative newness of data-centric AI, there is understandably some doubt about jumping on the bandwagon.

Model-centric AI, with its focus on developing and improving models, has been the prevailing approach to AI to date. But the emerging data-centric approach to AI is predicated on the idea that performance can be improved further by putting more attention on the systems used to collect and process training data.

Having large volumes of high-quality data is essential to the creation and maintenance of AI systems. Unfortunately, data often doesn’t come cheap. It’s common for companies to spend six months or more of legal business development time just to get access to a single dataset. Due to its overall value, it’s understandable that the AI industry would start to prioritize data as the means to an end. Data-centric AI methods look to extract even more value by investing more in the curating, labeling, augmenting, and managing of data.

Curating Data

Real-world data is usually private and proprietary to the organization that collected it. One key aspect of data-centric AI is increasing an organization’s access to data.

Labeling Data

Labeling data typically requires input from subject matter experts, many of whom have other important priorities. For example, medical doctors might be required to properly label X-rays for an AI system that processes medical imagery, while attorneys might be required to label legal documents for a law-focused system.

Augmenting Data 

The data an AI system handles commonly changes over time, and so can the system’s purpose. Because of this, training data must regularly be updated and possibly relabeled to reflect those changes.

Managing Data

When a data set has a massive amount of manually labeled records, it raises issues related to governance and auditing. Organizations overseeing large volumes of data must have systems in place for identifying bias, fixing quality issues, conducting audits, and tracing the lineage of model flaws.

The Three Main Challenges of Data-Centric AI

Companies looking to adopt data-centric AI often face three primary data challenges: data volume, consistency, and quality.

  • Volume. Supplying a large volume of high-quality data to an AI model means eliminating low-quality datasets. As a result, a data-centric approach often requires more data volume than a model-centric approach. However, addressing this issue by blindly collecting as much data as possible can be inefficient and costly. Before acquiring more data, organizations leveraging data-centric AI must establish the kind of data that is needed. 
  • Consistency. Without consistent data annotation, an AI model quickly becomes unreliable. Unfortunately, achieving a high level of consistency is quite difficult. A study from MIT revealed that approximately 3.4 percent of data records in popular datasets were mislabeled. Furthermore, the MIT study found that larger, more powerful models tend to be more greatly impacted by poor labeling consistency. Organizations looking to leverage data-centric AI must prioritize an effective system of data annotation, partly based on machine learning engineers having a deep understanding of their datasets.
  • Quality. The data used to train an AI system should be representative of the data that a model will process after deployment, including any rare variations. Furthermore, attributes of data records that are not causal features should be randomized during training as part of quality control measures. 

Taking these steps can mitigate two of the common flaws associated with poor dataset quality:

  • Incorrect correlations. Dubious associations occur when a machine learning model associates non-causal data with a label. For example, if an image-processing model was trained on images of cows that always appeared in grasslands, the deployed model might associate grass with the “cow” label and misclassify any grassy scene as a cow.
  • Insufficient variation. When a model isn’t trained on a data set with adequate variation, it can result in the model doing a poor job of making generalizations. For example, an image processing model that was trained only on images showing daylight might fail to perform well on nighttime images.
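The “randomize non-causal attributes” idea above can be sketched as a simple augmentation pass. This is an illustrative example, not a prescribed implementation: the array shapes and brightness range are hypothetical stand-ins for a real image pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter_brightness(image, low=0.5, high=1.5):
    """Randomly rescale brightness so lighting is not a learnable feature.

    `image` is an HxWxC float array in [0, 1]; the scale range is illustrative.
    """
    factor = rng.uniform(low, high)
    return np.clip(image * factor, 0.0, 1.0)

# Augment a "daylight-only" training set with varied lighting conditions.
daylight_image = np.full((4, 4, 3), 0.8)  # toy stand-in for a photo
augmented = [jitter_brightness(daylight_image) for _ in range(5)]
print(all(img.shape == daylight_image.shape for img in augmented))  # True
```

Randomizing attributes like brightness during training discourages the model from latching onto lighting conditions the way the daylight-only example above describes.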

Benefits of Data-Centric AI

While there are major data-centric AI challenges, organizations that can overcome them stand to reap several key benefits:

  • More reliable and less biased. Avoiding “garbage-in, garbage-out” as a top priority, data-centric AI is designed to yield more dependable results. Having systems in place that put data first also means prioritizing the elimination of bias.
  • Lower costs through greater flexibility. It is difficult for model-centric AI to realize performance gains without massive investments in new datasets and computing resources. With its bigger focus on maintaining high-quality data, data-centric AI is poised to unlock greater performance gains with much smaller investments in data and resources.
  • Less administration. Model-centric AI is based on the notion of specialization, with models being explicitly designed to perform specific tasks. This can lead to an organization accumulating massive amounts of models and associated datasets. In addition to being unwieldy, this approach can lead to elevated costs compared to a more flexible, standardized approach that revolves around the data itself.

Get the Massive Amounts of High-Quality Data You Need Through TripleBlind

TripleBlind helps organizations adopt data-centric AI by breaking down regulatory and administrative siloes that keep data locked away.

With the TripleBlind Solution, organizations leverage proven privacy-enhancing technology to operationalize sensitive data while remaining compliant with various regulations. Our innovative technology also offers true scalability and faster processing, while supporting all data types.

If you would like to learn more about how our technology can enable the adoption of data-centric AI, contact us today.


Data-Centric AI Data Quality Framework — Making Data Quality Systematic

For years, improving artificial intelligence algorithms was seen as the way forward when it came to optimizing AI performance, but this legacy approach is quickly becoming outdated.

According to prominent AI expert Andrew Ng, trying to improve an AI system by focusing on the model results in marginal gains, if any. This is partly due to increased commodification and availability of well-performing AI models. The way forward, Ng argues, is a data-centric approach to AI, but this approach isn’t without its challenges.

One of the biggest challenges facing a data-centric approach to AI is developing a solid framework for data quality. When talking about machine learning, we often refer to this idea of needing large datasets containing millions of records, but many available training datasets only include several thousand examples or fewer. To be clear, many real-world applications or potential applications of AI aren’t likely to involve training datasets with examples in the millions.

However, it is difficult to quantify the most representative size for an AI training dataset. Ng argues that small and mid-sized data sets are sufficient to train a good AI system, but we need to transition from “big data” to “good data.” 

When it isn’t possible to amass a sizable data set, Ng says, AI developers should focus on collecting data that is defined consistently, covers critical cases, includes timely feedback, and is representative. “Good Data” includes factors that are domain- or application-specific, such as healthcare data that adheres to privacy regulations.

Essentially, the shift from big data to good data will require systematic approaches to data quality.

The Key to a Data-Centric AI Data Quality Framework

If data-centrism in AI development requires high quality data, then developers must ensure that data points are labeled consistently. If different labelers use different labeling conventions, it hurts the algorithm’s ability to learn –– thereby affecting the quality of the model’s output.

A systematic approach to labeling recently championed by Ng starts with two subject matter experts independently labeling a dataset. After the dataset has been labeled, the consistency between labelers is quantitatively measured to discover where they agree or disagree. In instances where labelers disagree, the labeling instructions are revised until the labelers perform consistently with each other.
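The consistency between labelers can be quantified with a standard inter-annotator agreement statistic such as Cohen’s kappa. The sketch below is a minimal illustration; the two-expert labels are made up for demonstration.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Inter-annotator agreement between two labelers (Cohen's kappa)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of records where the labelers match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability both pick the same class at random.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two experts label the same 8 records; low kappa flags unclear instructions.
expert_1 = ["defect", "ok", "ok", "defect", "ok", "defect", "ok", "ok"]
expert_2 = ["defect", "ok", "defect", "defect", "ok", "ok", "ok", "ok"]
print(round(cohens_kappa(expert_1, expert_2), 3))  # 0.467
```

A kappa well below 1.0, as here, signals that the labeling instructions should be revised and the disputed records re-examined before training proceeds.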

A Data-Centric AI Data Quality Toolkit

Labeling training data for real-world use cases requires the enlistment of subject matter experts, but often, these experts are busy with their primary responsibilities and have little time they can dedicate to manually labeling data points. This is particularly true for subject matter experts working in time-intensive fields like healthcare. Additionally, data and objectives can change after the deployment of an AI system. For example, a new study on MRI imaging could cause a reassessment of an AI system for medical imaging. When something like this happens, new training data may have to be re-labeled.

For massive datasets, the need to manually label and relabel can be a non-starter for most organizations. One automated data quality toolkit being developed by Snorkel AI is called programmatic labeling. In a simple example involving a text-based AI system, a subject matter expert would write out a few key phrases that are then used to iteratively label data points via labeling functions. The company says it is currently developing its programmatic platform for “rapid, data-centric, iterative AI development. In other words, it revolves around modifying, labeling, and managing your data.”
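A labeling function of this kind can be sketched in a few lines. This is a toy illustration in the spirit of programmatic labeling, not Snorkel’s actual API; the key phrases, label names, and voting rule are all hypothetical.

```python
# Toy programmatic labeling: expert-written key-phrase rules ("labeling
# functions") vote on each record; functions may abstain.
ABSTAIN, SPAM, NOT_SPAM = None, 1, 0

def lf_contains_offer(text):
    return SPAM if "limited time offer" in text.lower() else ABSTAIN

def lf_contains_winner(text):
    return SPAM if "you have won" in text.lower() else ABSTAIN

def lf_greeting(text):
    return NOT_SPAM if text.lower().startswith("hi ") else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_offer, lf_contains_winner, lf_greeting]

def label(text):
    """Majority vote over the non-abstaining labeling functions."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS)
             if v is not ABSTAIN]
    return max(set(votes), key=votes.count) if votes else ABSTAIN

print(label("You have WON a limited time offer!"))   # 1 (SPAM)
print(label("Hi team, notes from today attached."))  # 0 (NOT_SPAM)
```

The appeal is leverage: a subject matter expert writes a handful of rules once, and those rules then label (and relabel) an arbitrarily large dataset automatically.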

The goal of the Snorkel platform and others like it is to give AI developers the ability to scale up with unlabeled data as rapidly as they could with labeled data. While skeptics may be concerned about issues like labeling errors and bias, Snorkel says its approach to data-centric AI data quality and reliability has been empirically proven to save person-months, at or above quality parity, in more than 50 peer-reviewed publications.

Supporting Data-Centric AI Data Quality and Safety with TripleBlind

One of the biggest obstacles to data-centric AI development is access to data. TripleBlind’s innovative privacy-enhancing technology helps developers break down obstructive data silos for increased access to valuable datasets. With our innovative TripleBlind Solution, developers can access sensitive data, including healthcare and financial data, while keeping individual privacy sacrosanct and remaining compliant with data privacy regulations like GDPR and HIPAA.

If your company is moving toward a data-centric approach to AI, you should find out how TripleBlind can facilitate the switch.


Data To Diagnosis: Top 10 Ways Artificial Intelligence Will Impact Healthcare

Science fiction often touts that artificial intelligence will be the end of humankind –– but what if AI could improve our quality of living or extend human lifespans instead? In the past decade, researchers, developers, and doctors have worked to turn fictional fantasies into the healthcare industry’s new reality. Now, artificial intelligence in healthcare enables the accelerated development of life-saving treatments, increased operational efficiency in clinical settings, and improved patient outcomes across the board. As more hospitals and research organizations adopt AI/ML into diagnostic and treatment processes, technologies are bound to improve and expand.

The future of artificial intelligence in healthcare is expected to unlock more medical insights we can’t even begin to anticipate, deliver better patient care, and enable a more proactive, results-based approach to medicine. Consider the following ways experts envision the role of AI in the future of healthcare.

1. Addressing Neurological Challenges with Brain-Computer Interfaces

Powered by artificial intelligence, brain-computer interfaces are a promising technology that could improve communication skills and mobility in patients with neurological conditions.

A brain-computer interface uses artificial intelligence to analyze neural signals associated with intended movement. The interface could then activate artificial limbs, mobility-related medical devices, or communication technology to dramatically improve the quality of life for individuals coping with traumatic brain injury, spinal cord injury, ALS, stroke, or other neurological conditions.

2. Less Invasive Biopsies

One particularly promising way AI will change healthcare is through so-called “virtual biopsies”. This promising technology involves the use of image-based artificial intelligence to categorize the phenotypes and genetic qualities of cancerous tumors.

Virtual biopsies will require significant refinement before they become a reality. If the technology does sufficiently develop, clinicians will be able to gain a more comprehensive understanding of how an entire tumor functions, as opposed to understanding only the properties of a small sampled segment. This will allow for better diagnosis of individual cancers and more targeted treatments for patients.

3. More Efficient Administration

Over the past two decades, electronic health records have been transformative, but significant challenges remain around their use, particularly when it comes to overwhelming documentation requirements. Another way that AI will change healthcare is the creation of more intuitive interfaces for documentation and the automation of routine record-keeping processes.

Artificial intelligence will also increasingly be used to process routine administrative activities, such as providing refills on medication and notifications for test results. AI technology for healthcare administration can also help prioritize tasks for doctors, nurses, clinicians, and other important personnel, optimizing their time management.

4. Another Weapon in the War Against Superbugs

So-called superbugs like antibiotic-resistant strains of Clostridioides difficile are appearing in healthcare settings and becoming a major concern. Artificial intelligence is already being used to track patterns of infection for “C. diff,” and this information is being used to protect at-risk patients. The future of artificial intelligence in healthcare will likely see an expansion of this approach.

5. Better Patient Monitoring

Connected devices are increasingly found in healthcare settings to monitor patients. Aggregating data from all of these devices inside and outside the healthcare system is a Herculean task.

Artificial intelligence systems are capable of handling this, allowing the industry to extract more insights from the multitude of smart devices currently in operation. The use of artificial intelligence in this area can also enable a more proactive approach to patient monitoring. For example, a system could notify doctors when a patient starts to develop sepsis or other negative complications.
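A proactive monitoring rule of this kind can be sketched very simply. The thresholds and field names below are illustrative assumptions only; a real clinical system would rely on validated early-warning scores and learned models, not hand-picked cutoffs.

```python
# Toy rule-based early-warning check (thresholds are illustrative only).
def sepsis_warning(vitals):
    """Return True when multiple vital signs cross warning thresholds."""
    flags = 0
    flags += vitals["heart_rate_bpm"] > 90
    flags += vitals["resp_rate_per_min"] > 22
    flags += vitals["temp_c"] > 38.0 or vitals["temp_c"] < 36.0
    return flags >= 2  # two or more abnormal signs triggers a notification

print(sepsis_warning({"heart_rate_bpm": 112, "resp_rate_per_min": 26,
                      "temp_c": 38.4}))  # True
print(sepsis_warning({"heart_rate_bpm": 74, "resp_rate_per_min": 16,
                      "temp_c": 36.8}))  # False
```

An AI-powered version would replace the fixed thresholds with a model trained on historical patient trajectories, but the workflow is the same: continuous device data in, timely notification out.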

6. Improving Immunotherapy to Treat Cancer

By activating the body’s immune system to attack malignant cancers, immunotherapy is emerging as a promising approach to treating cancer. However, current immunotherapy options are only effective on a small number of patients, and researchers currently do not have a reliable method for determining which patients will benefit the most from this treatment option.

Many expect the future of AI in healthcare will involve the analysis of complex datasets related to immunotherapy, allowing for better targeting of patients with this treatment option. Through the analysis of disease pathology, artificial intelligence will also be able to identify new immunotherapy treatment pathways.

7. More Insights from Electronic Health Records

Electronic health records hold a massive amount of readily-accessible patient data, but extracting insights from that data is a major challenge.

In addition to the administrative hurdles that come with aggregating such a massive, widespread dataset, there are also challenges related to record keeping. For example, an algorithm designed to predict stroke based on billing records could actually only be predicting the likelihood of a billing code for stroke, which is very different from predicting the actual medical condition.

Artificial intelligence can help researchers analyze electronic health records with more precision and specificity. Deep Learning technology will be able to locate novel connections within datasets, allowing for the development of new methods of care.

8. Better Use of Wearables

Smartwatches and other digital devices may be taking over the commercial marketplace, but the healthcare industry has yet to fully embrace the treatment possibilities these devices could offer. Artificial intelligence is expected to play a major role in making the most of healthcare data collected from these personal devices. One of the major stumbling blocks to realizing this potential use of artificial intelligence in healthcare is getting people comfortable with sharing their personal data. If the healthcare industry can show patients that it provides adequate privacy protections, this will likely be another way that AI will change healthcare.

9. Smile-Friendly Diagnostics

We know facial recognition software as the technology we use to unlock our iPhones, but the same type of technology, powered by artificial intelligence, could be used to diagnose diseases associated with facial abnormalities. According to a study published in Nature Genetics, a team of German researchers developed a technology capable of detecting rare diseases based on facial features. The study team said their technology and genetic data could accelerate therapies for patients with rare disorders that manifest in facial abnormalities.

10. Enabling a Fee-for-Results Model

In the United States, there is a massive disparity between the amount of money spent on healthcare and positive outcomes. This disparity has triggered conversations around a fee-for-results model that involves health care providers being paid based on outcomes, not by the number of tests or treatments they provide.

Through prediction and risk analysis, artificial intelligence is able to provide the foundation for a fee-for-results model. Providers can deliver superior results based on more informed, evidence-based decision-making.

Realize the Future of Healthcare with TripleBlind

Data is at the heart of every AI project. Without good, representative data, healthcare AI is at risk for poor performance and bias.

Through our innovative privacy-enhancing solution, TripleBlind has been enabling all manner of artificial intelligence, especially AI in healthcare — which depends on sensitive medical information. If your healthcare organization is looking to unlock more insights from this sensitive data, schedule a demo to see how TripleBlind can unlock the potential of AI.

The Importance of AI TRiSM and the Three Steps to Implement It

With about 42 percent of new patents in 2018 featuring artificial intelligence, it’s clear that this technology will be shaping our future. But if these new systems are revolutionizing our world, what additional challenges might they bring to the table?

One prominent challenge is bias. Typically, data bias is caused by a skewed, incomplete, or non-representative dataset. If AI systems make important decisions related to things like hiring or medical diagnoses based on parameters including a person’s gender or race, our society could suffer major consequences. Through such biases, these technologies can easily exacerbate existing disparities in ways that run counter to our moral and legal systems.

Governments are already taking an increased interest in regulating AI, looking to support the technology’s potential while keeping it from running amok. While organizations must be aware of any AI regulation, they must also keep their AI systems secure and trustworthy.

Many are doing exactly that. In a September 2021 report, Gartner cited an emerging market aimed at developing trustworthy and secure AI technology — a category the research organization referred to as AI Trust, Risk and Security Management (AI TRiSM).

The Importance of AI TRiSM

While regulatory structures for AI are still in development, organizations still have a strong incentive to embrace AI trust management.

AI is still a new technology and as such, consumers’ AI understanding and trust level is still relatively low. However, organizations using it can take steps to boost consumers’ AI trust level. A recent survey of AI technologists by Deloitte identified several key customer concerns related to AI technologies:

  • Pre-existing bias. Some customers are simply biased against the technology at this point in its evolution.
  • Insufficient oversight. Customers are worried about a lack of human oversight for AI systems.
  • Unexpected behavior. Perhaps influenced by popular science fiction movies, customers are concerned about AI systems “going rogue.”
  • Lack of understanding. Many customers don’t understand how AI works and fear the unknown.

Organizations can help to address these customer concerns by embracing AI TRiSM principles. This approach to AI trust management, when done correctly, can make systems less risky and more transparent, addressing the major concerns above. The ultimate goal of AI TRiSM is to keep customers secure while still allowing for growth and innovation.

Three Key Steps to Implementing AI TRiSM

Organizations looking to implement AI TRiSM should consider a comprehensive, multifaceted framework. Very generally speaking, this framework should be driven by three key steps related to documentation, a system of checks, and a high degree of transparency around the technology.

  1. Implement Strong Documentation and Standard Procedures. Having a strong documentation system not only supports trustworthiness by placing a focus on the data used to train an AI system, but it also enables auditing of the technology in the event that something goes wrong. Documentation systems should be based on both legal guidelines and internal risk assessments. These systems should include both standardized documentation processes and document templates. Additionally, a documentation system should be both consistent and intuitive so that it supports both AI TRiSM and use of the technology.
  2. Use a System of Checks and Balances. Organizations must have systems in place designed to monitor potential bias and prevent a corrupted system from causing serious damage. For example, automated features in a documentation system can raise red flags if records in a data set are incomplete, missing, or highly anomalous.
  3. Prioritize AI Transparency. The Deloitte survey revealed how a lack of consumer trust in AI technology stems from a lack of understanding. Many consumers see AI decision-making as taking place within an indecipherable black box. Organizations can address AI trust and transparency by making it easy for non-technical consumers to see how data is collected and how the system makes decisions based on that data.
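The automated red flags described in step 2 can be sketched as a small validation pass over incoming records. The required fields and the set of known labels below are hypothetical examples of an internal schema, not anything prescribed by AI TRiSM itself.

```python
# Minimal sketch of "checks and balances": automated red flags for
# incomplete or anomalous records before they reach a training set.
REQUIRED_FIELDS = {"id", "label", "source", "labeled_by"}  # hypothetical schema

def red_flags(record, known_labels=frozenset({"defect", "ok"})):
    """Return a list of human-readable warnings for a single record."""
    flags = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        flags.append(f"missing fields: {sorted(missing)}")
    if record.get("label") not in known_labels:
        flags.append(f"unknown label: {record.get('label')!r}")
    return flags

clean = {"id": 1, "label": "ok", "source": "line-3", "labeled_by": "qc-team"}
suspect = {"id": 2, "label": "okk", "source": "line-3"}
print(red_flags(clean))    # []
print(red_flags(suspect))  # flags a missing field and an unknown label
```

In a real documentation system these warnings would feed the audit trail from step 1, so a flagged record can be traced back to its source and labeler.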

Supporting Trustworthy AI with TripleBlind

Trustworthy AI systems require access to large datasets. When a system is trained on a dependable and comprehensive data set, it is far less likely to be flawed or biased. Unfortunately, significant amounts of valuable data are trapped behind regulatory barriers and data silos.

Our highly innovative TripleBlind Solution supports the development of trustworthy AI by breaking down data silos and barriers. Through unprecedented privacy-enhancing technology, our clients are able to build AI systems they can confidently stand behind.

If you would like to learn more about how our technology supports AI TRiSM, check out our Blind AI Tools or download our Whitepaper. We remove common barriers to using high-quality data for artificial intelligence, solving key challenges AI professionals face with data access, bias, and prep. Through a combination of privacy-enhancing techniques, the TripleBlind Solution allows for training of new models on remote data –– without compromising the privacy or fidelity of sensitive data. Let us show you how by booking a live demo today.