The Growing Value of Data Masking

As organizations collect, store, and process increasing amounts of sensitive and valuable information, they need methods to protect it from exposure. The value of organizations’ data troves makes them a prime target for hackers. Customers’ personal data can be sold on the black market for use in other crimes, and an organization’s intellectual property can be sold to competitors to make research and development quicker and more efficient.

The simplest way to protect sensitive data from exposure is to store it offline, where it is inaccessible to cyber attackers. However, this solution also makes the data unusable for the organization itself. As a result, organizations often seek a way to balance data security and usability.

What Is Data Masking?

Data masking is one of several different means by which an organization can obfuscate sensitive data. Each method of data protection has its pros and cons and falls at a different place along the security/usability continuum.

The two extremes of this continuum are using data in plaintext and performing data encryption. Plaintext data is readable to anyone, which makes it extremely usable; however, if an application with access to the data has a vulnerability that an attacker can exploit, then the data may be leaked. Encryption, on the other hand, provides strong protection, since encrypted data cannot be read without access to the secret key. However, because encryption transforms the data into ciphertext that looks like random noise, the encrypted data is useless for most purposes. Applications must decrypt the data back to plaintext in order to use it, and at that point it is again potentially vulnerable to disclosure.

Data masking falls between the two extremes of the security/usability continuum. The data is obfuscated by applying a masking algorithm that can be tailored to the sensitivity and context of the data. Less sensitive data in a secure environment may have a deterministic masking algorithm applied, which maps the same input to the same masked output and so retains many of the usable features of the data. More sensitive data, on the other hand, may have fully random masking applied to minimize the probability of exposure. This flexibility makes data masking well suited to obfuscating sensitive data.
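The difference between deterministic and fully random masking can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the example key, and the choice of HMAC-SHA256 as the deterministic source are not taken from any particular masking product, and a production algorithm would need a more careful design (this one reuses digest bytes for values longer than 32 characters and has a slight modulo bias).

```python
import hashlib
import hmac
import secrets
import string

# Hypothetical key for illustration only; a real deployment would
# load it from a secrets manager rather than hard-coding it.
MASKING_KEY = b"example-masking-key"


def _substitute(ch: str, pick) -> str:
    """Swap a character for another of the same class; keep punctuation."""
    for alphabet in (string.digits, string.ascii_lowercase, string.ascii_uppercase):
        if ch in alphabet:
            return pick(alphabet)
    return ch


def mask_deterministic(value: str, key: bytes = MASKING_KEY) -> str:
    """Same input and key always give the same masked output."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return "".join(
        _substitute(ch, lambda alphabet, i=i: alphabet[digest[i % len(digest)] % len(alphabet)])
        for i, ch in enumerate(value)
    )


def mask_random(value: str) -> str:
    """Each call gives an unrelated masked output in the same format."""
    return "".join(_substitute(ch, secrets.choice) for ch in value)


if __name__ == "__main__":
    ssn = "123-45-6789"
    print(mask_deterministic(ssn))  # identical on every run with the same key
    print(mask_random(ssn))         # different on (almost) every run
```

Because the deterministic variant is repeatable, masked values can still be joined across tables or used as stable test fixtures. The random variant gives up that property in exchange for leaving no consistent link back to the original value.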

The Growing Importance of Data Obfuscation

Organizations that collect sensitive data must balance the benefits and potential value of the data with the risks associated with storing it. The larger the data trove, the more useful it is to the organization and the more likely it is that the organization will be targeted by cybercriminals.

Many organizations build their business models on the collection and processing of large amounts of sensitive data. Social media platforms like Facebook collect data about their users, process it to glean insights about those users, and then sell the data or the insights that they have generated to third parties to use in their own data processing or to provide targeted advertising. Despite numerous scandals regarding how Facebook collects, uses, and secures customer data, users largely have not left the platform, and it remains extremely profitable by selling access to consumer data to third parties.

However, the large number of data breaches in recent years has generated new interest in data protection. The EU’s General Data Protection Regulation (GDPR) is the first of several new data protection regulations designed to protect individuals against misuse and exposure of their personal data by companies. Compared with past regulations, the GDPR expanded the definition of sensitive data, brought more organizations into scope, and dramatically increased the fines for non-compliance. The GDPR also requires comparable protections when the personal data of people in the EU is handled by other countries or organizations, which sparked a wave of similar regulations around the world.

The GDPR and similar regulations are designed to protect against exposure any data that can be used to identify an individual. The same protections do not apply to data that has been obfuscated thoroughly enough to make identification impossible. While many data anonymization solutions in active use are insufficient to meet the requirements of the GDPR, effective data masking algorithms can allow organizations to usefully process obfuscated data while remaining compliant with the GDPR and other data privacy laws.

The Increasing Demand for Data Masking

The landscape of data protection regulations is growing and is likely to continue doing so for some time. Organizations that collect, process, store, or transmit customer data that is protected under the regulations can expect to be required to comply with increasingly stringent limits on when and how this data can be used.

Since data protection regulations do not apply to properly anonymized and obfuscated data, the market for effective data masking solutions is expected to grow rapidly. Between 2019 and 2027, the market for data masking solutions is expected to achieve a compound annual growth rate (CAGR) of 14.8%.
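To put the growth rate in perspective, the short Python sketch below compounds it over the forecast window. It assumes eight annual compounding periods between 2019 and 2027, and the function name is purely illustrative; the article itself gives no absolute market sizes, so only the relative multiple is computed.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Return the total growth factor implied by a compound annual growth rate."""
    return (1.0 + cagr) ** years


if __name__ == "__main__":
    # Assumption: 2019 -> 2027 is treated as 8 compounding periods.
    multiple = growth_multiple(0.148, 8)
    print(f"Implied growth over the period: {multiple:.2f}x")
```

Under that assumption, a 14.8% CAGR implies a market roughly three times its 2019 size by 2027.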

Organizations often need to process data in untrusted environments such as development and quality assurance, and they need an obfuscation solution that leaves the data usable there. This calls for a data masking solution that is scalable and whose masking algorithms can be configured to suit the sensitive data in question. Organizations also process a wide variety of sensitive data. A solution that automatically detects protected data (based upon the requirements of applicable regulations) and obfuscates it with algorithms suited to the data and environment can dramatically simplify achieving and maintaining compliance with current and future data protection regulations.
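As a rough illustration of automatic detection paired with per-type masking, the Python sketch below scans free text for two kinds of sensitive values and applies a different masking rule to each. The regular expressions, the rule names, and the masking choices are simplified assumptions for illustration; real detection of regulated data requires far more robust patterns and validation.

```python
import re

# Simplified, hypothetical detection patterns (not exhaustive).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask_email(match: re.Match) -> str:
    """Keep the first character and the domain; star out the rest."""
    local, _, domain = match.group(0).partition("@")
    return local[0] + "*" * (len(local) - 1) + "@" + domain


def mask_ssn(match: re.Match) -> str:
    """Keep only the last four digits."""
    return "***-**-" + match.group(0)[-4:]


# Per-type masking rules; swapping a rule changes how that type is masked.
MASKERS = {"email": mask_email, "ssn": mask_ssn}


def mask_text(text: str) -> str:
    """Detect each configured data type and apply its masking rule."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(MASKERS[name], text)
    return text


if __name__ == "__main__":
    print(mask_text("Contact jane.doe@example.com, SSN 123-45-6789."))
```

Keeping detection patterns and masking rules in separate tables is one way to make the behavior configurable per data type and per environment without changing the scanning logic.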