Threat Modeling for Data Protection

When evaluating the security of an application and data model ask the questions:

  • What is the sensitivity of the data?
  • What are the regulatory, compliance, or privacy requirements for the data?
  • What is the attack vector that a data owner is hoping to mitigate?
  • What is the overall security posture of the environment, is it a hostile environment or a relatively trusted one?

Data When threat modeling, consider the following common scenarios:

Data at rest (“DAR”)

In information technology means inactive data that is stored physically in any digital form (e.g. database/data warehouses, spreadsheets, archives, tapes, off-site backups, mobile devices etc.).

  • Transparent Data Encryption (often abbreviated to TDE) is a technology employed by Microsoft SQL, IBM DB2 and Oracle to encrypt the “table-space” files in a database. TDE offers encryption at the file level. It solves the problem of protecting data at rest by encrypting databases both on the hard drive as well as on backup media. It does not protect data in motion DIM nor data in use DIU.
  • Mount-point encryption: This is another form of TDE is available for database systems which do not natively support table-space encryption. Several vendors offer mount-point encryption for Linux/Unix/Microsoft Windows file system mount-points. When a vendor does not support TDE, this type of encryption effectively encrypts the database table-space and stores the encryption keys separate from the file system. So, if the physical or logical storage medium is detached from the compute resource, the database table-space remains encrypted.

Data in Motion (“DIM”)

Data in motion considers the security of data that is being copied from one medium to another. Data in motion typically considers data being transmitted over a network transport. Web Applications represent common data in motion scenarios.

  • Transport Layer Security (TLS or SSL): is commonly used to encrypt internet protocol based network transports. TLS works by encrypting the internet layer 7 “application layer” packets of a given network stream using symmetric encryption.
  • Secure Shell/Secure File Transport (SSH, SCP, SFTP): SSH is a protocol used to securely login and access remote computers. SFTP runs over the SSH protocol (leveraging SSH security and authentication functionality) but is used for secure transfer of files. The SSH protocol utilizes public key cryptography to authenticate access to remote systems.
  • Virtual Private Networks (VPNs) A virtual private network (VPN) extends a private network across a public network, and enables users to send and receive data across shared or public networks as if their computing devices were directly connected to the private network.

Data in Use (“DIU”)

Data in use happens whenever a computer application reads data from a storage medium into volatile memory.

  • Full memory encryption: Encryption to prevent data visibility in the event of theft, loss, or unauthorized access or theft. This is commonly used to protect Data in Motion and Data at Rest. Encryption is increasingly recognized as an optimal method for protecting Data in Use. There have been multiple approaches to encrypt data in use within memory. Microsoft’s Xbox has a capability to provide memory encryption. A company Private Corepresently has a commercial software product cage to provide attestation along with full memory encryption for x86 servers.
  • RAM Enclaves: enable an enclave of protected data to be secured with encryption in RAM. Enclave data is encrypted while in RAM but available as clear text inside the CPU and CPU cache, when written to disk, when traversing networks etc. Intel Corporation has introduced the concept of “enclaves” as part of its Software Guard Extensions in technical papers published in 2013.
  • 2013 papers: from Workshop on Hardware and Architectural Support for Security and Privacy 2013
  • Innovative Instructions and Software Model for Isolated Execution
  • Innovative Technology for CPU Based Attestation and Sealing

Where do traditional data protection techniques fall short?

TDE: Database and mount point encryption both fall short of fully protecting data across the data’s entire lifecycle. For instance: TDE was designed to defend against theft of physical or virtual storage media only. An authorized system administrator, or and unauthorized user or process can gain access to sensitive data either by running a legitimate query and , or by scraping RAM. TDE does not provide granular access control to data at rest once the data has been mounted.

TLS/SCP/STFP/VPN, etc: TCP/IP Transport layer encryption also falls short of protecting data across the entire data lifecycle. For example, TLS does not protect data at rest or in use. Quite often TLS is only enabled on Internet facing application load balancers. Often TLS calls to web applications are plaintext on the datacenter or cloud side of the application load-balancer.

DIU: Memory encryption, Data in use full memory encryption falls short of protecting data across the entire data lifecycle. DIU techniques are cutting edge and not generally available. Commodity compute architecture has just begun to support memory encryption. With DIU memory encryption, data is only encrypted while in memory. Data is in plaintext while in the CPU, Cache, written to disk, and traversing network transports.

Complimentary or Alternative Approach: Tokenization

We need an alternative approach that address all the exposure gaps 100% of the time. In information security, we really want a defense in depth strategy. That is, we want layers of controls so that if a single layer is fails or is compromised another layer can compensate for the failure.

Tokenization and format preserving encryption are unique in the fact they protect sensitive data throughout the data lifecycle/across a data-flow. Tokenization and FPE are portable and remain in force across mixed technology stacks. Tokenization and Format preserving encryption do not share the same exposures as traditional data protection techniques.

How does this work? Fields of sensitive data are cryptographically transformed at the system of origin, that is during intake. A cryptographic transform of a sensitive field is applied, producing a non-sensitive token representation of the original data.

Tokenization, when applied to data security, is the process of substituting a sensitive data element with a non-sensitive equivalent, referred to as a token, that has no extrinsic or exploitable meaning or value. The token is a reference (i.e. identifier) that maps back to the sensitive data through a tokenization system.

Format preserving encryption takes this a step further and allows the data element to maintain its original format and data type. For instance, a 16-digit credit card number can be protected and the result is another 16-digit value. The value here is to reduce the overall impact of code changes to applications and databases while reducing the time to market of implementing end to end data protection.

In Closing

Use of tokenization or format preserving encryption to replace live data in systems results in minimized exposure of sensitive data to those applications, stores, people and processes. Replacing sensitive data results in reduced risk of compromise or accidental exposure and unauthorized access to sensitive data.

Applications can operate using tokens instead of live data, with the exception of a small number of trusted applications explicitly permitted to detokenize when strictly necessary for an approved business purpose. Moreover: in several cases removal of sensitive data from an organization’s applications, databases, business processes will result in reduced compliance and audit scope, resulting in significantly less complex and shorter audits.

This article was originally published in Very Good Security.