Introduction

To inaugurate SAIL's blog, we thought there was no better topic to discuss than secure computation, and so what follows is a broad history and general introduction to the topic.

Secure computation is an umbrella term for an area of research within cryptography and computer science that aims to make use of data in a secure manner. The methods vary, but each can be categorized as either a cryptography-based or hardware-based approach. Nevertheless, all share the same ultimate goal: to perform computations in a provably secure manner.

Let's begin!

Fully Homomorphic Encryption

Fully Homomorphic Encryption (FHE) is the White Rabbit of cryptography. To understand its significance, we must first understand what it is.

FHE is a kind of encryption whereby a computation on any ciphertext returns a result identical to that same operation, as if it were performed on the unencrypted plaintext twin. The ability to do this results from a mathematical property known as homomorphism. The computation performed on the ciphertext is mapped to the plaintext, and so results in an equivalent answer. However simple this may be in theory, it is unimaginably difficult in practice. In fact, since the discovery of this dilemma back in 1978, no practical application has been developed, despite the considerable years devoted by researchers to tackling it.

Matthew Green has a great blog post diving into more technical detail.

Multi-party Computation

In 1982, Time magazine named the computer as the "Machine of the Year", much to Steve Jobs' dismay.[1] The Commodore 64 was launched in 1982, and Tron dazzled movie-goers with its state-of-the-art CGI.

That same year, away from the public spectacle of the personal computer revolution, a professor at Stanford put to paper a slice of cryptographic insight, igniting what would become a revolution in its own right. Andrew Yao introduced the Millionaires' Problem, an intuitive predicament with a difficult solution, which laid the foundation for the subfield of multi-party computation.

The Millionaires' Problem is simple: Alice and Bob wish to know who is richer without divulging exactly how much money each of them has — an admittedly curious problem. Yao formulated his original solution using a two-party computation approach; it wasn't until 1986 that he published a generalization.[2] Despite this achievement, it was still limited to two hypothetical individuals, restricting its real-world effectiveness. In a twist of fate not common to scientific endeavor, the next year a group of three computer scientists generalized to the multi-party case.[3] This whirlwind of discovery continued through to the mid 90s, before coming to a rest.

It wasn't until the turn of the century when the field began to pick up again. Up until that point, its practical value was hindered by high computational costs. Technological developments, alongside further protocol refinement, enabled secure multi-party computation to become practical. The first demonstration of a secure computation platform was by a cross-disciplinary team of Danish researchers in 2008.[4]

More recently, there has been much development in federated learning, an area of research born out of the intersection between AI/ML and MPC. The proliferation of more advanced methods of machine learning also begets privacy concerns. At MIT's Media Lab, for example, the Open Algorithms project is developing an open and secure platform with the aim of protecting the privacy of individual-level data.

This leads to a parallel, hardware-based approach towards secure computation: enclaves.

Secure Enclaves

Secure enclaves, also referred to as trusted execution environments, are systems isolated from the rest of a computer's hardware. They are not necessarily physically separated, as they may share silicon with the CPU, but instead are an entirely independent architecture.

Most people already know a secure enclave: Apple's Touch ID, and more recent Face ID. A user's biometric data is stored securely — in this case, on a dedicated core within the later A-series chips — and accessed only by interfacing directly with the enclave. For those curious, this BlackHat slide deck is seriously thorough.

Secure enclaves are certainly not limited to Apple. All the big players have offerings of their own: Intel, ARM and AMD. Samsung has developed their own solution to compete with Apple, and Nvidia released a FOSS stack, called TLK.

Intel Secure Guard Extensions (SGX) is the most refined and production-ready product of the three chip makers. Every Intel processor since the sixth generation Core series, released in 2015, has SGX integrated. This means that most data centers, and many consumers, have access to secure enclave technology without needing additional hardware. SAIL has developed our privacy-preserving machine learning platform using SGX, as the widespread nature of Intel CPUs means there is minimal friction for customers to adopt our technology as part of their healthcare IT infrastructure.

The secure enclave space is rife for disruption and we fully expect healthy competition to continue to enrich the public's access to privacy options.

Rest, Motion, and Use

Encryption can be compared along an axis of applicability, that is, under what circumstances is it useful? An intuitive way to grasp this concept is to consider states of data: at rest, in motion, and of use.

Rest

Data at rest is data in storage. The information is encrypted and left alone, much like how one would use a safety deposit box. This is the most well-known use of encryption, and often what the layperson thinks of when they are first introduced to the topic.

Motion

Data in motion is data being securely transported. The Secure Shell, "ssh", is every programmer's familiar interface to the cloud and the prime example of encryption used in this way. In the case of ssh, this is facilitated by public-key cryptography, where each user authenticates the other's key to verify identity. The sender encrypts data with the recipient's public key and sends it over the network. On arrival, the recipient decrypts the data using their private key — thus, it is a protocol designed for data in motion.

Use

Encrypted data that can be used simultaneously is like having your cake and eating it too; it's precisely what each of the aforementioned secure computation approaches looks to solve. Data in use is more than mere storage (rest) and transportation (motion), it is information that is being acted upon in some arbitrary way, while still under the protective effects of whatever security method is employed.

In Summation

Secure computation is a burgeoning field, and we invite you to follow down the rabbit hole yourself by clicking through any of the interwoven hyperlinks to learn more about those specific topics of interest to you.


  1. Walter Isaacson, Steve Jobs, (New York: Simon & Schuster, 2011), 139. ↩︎

  2. Andrew Yao, “How to generate and exchange secrets,” In Proceedings of the 27th Annual Symposium on Foundations of Computer Science, Washington, 1986, 162-167. https://doi.org/10.1109/SFCS.1986.25 ↩︎

  3. Goldreich and Micali and Widgerson, "How to play ANY mental game", In *Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing *, New York, 1987, 218-229. https://doi.org/10.1145/28395.28420 ↩︎

  4. Peter Bogetoft et al., "Multiparty Computation Goes Live", in Cryptology ePrint Archive, 2008. https://eprint.iacr.org/2008/068 ↩︎