Tuesday, June 7, 2016

From IND-CPA to Robust AEAD: Security Notions for Symmetric Encryption

Edit 09/06/16: When I originally wrote this post, I was under the impression that the Robust AE notion had been widely adopted in the symmetric community. Since this is not yet the case, I have edited the post to emphasise that this is a new notion of Rogaway et al. and not a standard one.

This week I'm at the Summer School on Real-World Crypto and Privacy in a beautiful beach resort near Šibenik, Croatia. Obviously spending time by the hotel pool / Mediterranean sea is terribly dull and exhausting, so I was delighted that today I could instead attend a two-hour lecture by Phil Rogaway on security notions for symmetric encryption.

In all seriousness, Phil gave an excellent tour of the (recent) history of symmetric encryption research and I'll try to sketch some of the key points here.

OK, let's first recall the classical notions for symmetric encryption. We assume that, for any key $k$, encryption is a probabilistic function $E_k$ from the message space $\mathcal{M}$ to the ciphertext space $\mathcal{C}$ and that decryption is a deterministic function $D_k: \mathcal{C} \to \mathcal{M}$. Then the advantage of an adversary is the adversary's ability to distinguish between the oracles $E_k(\cdot)$ and $E_k(\$(\cdot))$, where $\$(\cdot)$ returns a random string of the same length as the input. This is called IND-CPA security.
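To make the game concrete, here is a minimal sketch of the IND-CPA experiment in Python. The scheme below (a SHA-256 "counter-mode" keystream with fresh random coins per call) and all of its names are my own illustrative assumptions, not anything from the lecture:

```python
import os, hashlib

def keystream(key: bytes, r: bytes, n: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + r + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def E(key: bytes, m: bytes) -> bytes:
    r = os.urandom(16)                                  # fresh coins for every call
    return r + bytes(a ^ b for a, b in zip(keystream(key, r, len(m)), m))

def ind_cpa_game(adversary) -> bool:
    key = os.urandom(16)
    b = os.urandom(1)[0] & 1                            # hidden challenge bit
    def oracle(m: bytes) -> bytes:
        # b = 0: encrypt the adversary's message; b = 1: encrypt a random string
        return E(key, m) if b == 0 else E(key, os.urandom(len(m)))
    return adversary(oracle) == b                       # True iff the guess is right

# A guessing adversary wins with probability exactly 1/2, i.e. has advantage 0.
print(ind_cpa_game(lambda oracle: 0))
```

The advantage is then the distance of the adversary's winning probability from $1/2$.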

The above notion captures privacy, but not authenticity. In practice, however, people want to use a single encryption scheme to achieve both of these goals. So the security definition evolved to probabilistic Authenticated Encryption (pAE): first, we allow the decryption function to return $\bot$, meaning the ciphertext is inauthentic. The privacy advantage is then defined just as in the IND-CPA game, but we additionally consider the authenticity advantage: the probability that an adversary with access to $E_k(\cdot)$ outputs a forgery, i.e. a ciphertext that did not come from the encryption oracle and that $D_k$ decrypts to something other than $\bot$.
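Sticking with the same toy scheme as above (now extended with a hash-based tag so that decryption can reject), a sketch of the authenticity game might look as follows; the tag construction and all names are again illustrative assumptions:

```python
import os, hashlib

def keystream(key, r, n):
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + r + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def E(key, m):
    r = os.urandom(16)
    c = bytes(a ^ b for a, b in zip(keystream(key, r, len(m)), m))
    tag = hashlib.sha256(key + r + c).digest()[:16]       # toy authenticity tag
    return r + c + tag

def D(key, ct):
    r, c, tag = ct[:16], ct[16:-16], ct[-16:]
    if hashlib.sha256(key + r + c).digest()[:16] != tag:
        return None                                        # None plays the role of ⊥
    return bytes(a ^ b for a, b in zip(keystream(key, r, len(c)), c))

def auth_game(adversary) -> bool:
    key = os.urandom(16)
    seen = set()
    def oracle(m):
        ct = E(key, m)
        seen.add(ct)
        return ct
    forgery = adversary(oracle)
    # the adversary wins iff its output is fresh and does not decrypt to ⊥
    return forgery not in seen and D(key, forgery) is not None

# A trivial adversary that just replays an oracle answer never wins.
print(auth_game(lambda oracle: oracle(b"hello")))
```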

This definition still has practical problems. Firstly, it still requires probabilistic encryption and - as we all know - truly random coins are hard to come by. So we want to swap true randomness with nonces: numbers that should only be used once. Then encryption can be implemented using a deterministic function that takes a nonce as an extra input. Secondly, practitioners often need to transmit some header data along with the ciphertext, and this associated data needs to be authenticated but not encrypted. So here's yet another definition(!): a nonce-based AEAD scheme is a function that takes a key, a nonce, some associated data and a message and returns a ciphertext. The authenticity notion is defined essentially as in pAE, but the privacy notion is strengthened: the adversary is asked to distinguish between $E_k(\cdot,\cdot,\cdot)$ and $\$(E_k(\cdot,\cdot,\cdot)),$ i.e. the ciphertexts need to look like random strings, not just encryptions of random strings. These definitions assume that the adversary is nonce-respecting: the adversary never repeats the nonce in their encryption queries.
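Here is a sketch of this strengthened privacy game for nonce-based AEAD, with the nonce-respecting restriction enforced inside the oracle. The toy scheme and the 16-byte tag are once more my own illustrative choices:

```python
import os, hashlib

def keystream(key, nonce, n):
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def aead_encrypt(key, nonce, ad, m):
    c = bytes(a ^ b for a, b in zip(keystream(key, nonce, len(m)), m))
    tag = hashlib.sha256(key + nonce + ad + c).digest()[:16]   # tag also covers the AD
    return c + tag

def aead_priv_game(adversary) -> bool:
    key = os.urandom(16)
    b = os.urandom(1)[0] & 1
    used = set()
    def oracle(nonce, ad, m):
        assert nonce not in used, "nonce-respecting adversaries only"
        used.add(nonce)
        real = aead_encrypt(key, nonce, ad, m)
        # b = 0: the real ciphertext; b = 1: a uniform random string of equal length
        return real if b == 0 else os.urandom(len(real))
    return adversary(oracle) == b

print(aead_priv_game(lambda oracle: 0))
```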

Of course, nonces are not always used just once, as we would like. To move even closer to the messy world of practical security, we introduce Misuse-Resistant Authenticated Encryption (MRAE) where the adversary can repeat nonces, as long as they don't repeat the entire nonce-data-message triple (otherwise they could trivially distinguish the corresponding identical ciphertexts from independent random strings).

"Surely that's the last one!", I hear you cry. Not quite. It's easy to show that MRAE requires significant ciphertext expansion: ciphertexts must be significantly longer than the plaintexts. This is a problem in certain lightweight applications like in the Internet of Things. So, accepting that there's a tradeoff between the amount of ciphertext expansion and the level of security, Phil and others have recently introduced robust AE (RAE), where the encryption function now has an additional argument specifying the amount of ciphertext expansion. One then tries to obtain the best possible security with that amount of expansion. I'll omit the details in the interest of space.

What was most interesting about Phil's talk was to learn how the evolution of the theoretical notions of security was driven by practical considerations. On the other hand, since the security goalposts seem to be moving all the time, perhaps practitioners will just stop trying to reach them.

Workshop on the Theory and Practice of Secure Multi-Party Computation 2016: Efficient Constant-Round Multiparty Computation

Inspired by Yehuda's talk, I decided to trace the historical 'flow' that has led to today's efficient MPC protocols secure against malicious parties with an $O(1)$ number of rounds, ending with descriptions of some current state-of-the-art techniques.

First, let's begin with Yao's protocol [1]: $2$ parties - Alice and Bob - hold a function $f$ and want to compute its output on their secret inputs, without revealing those inputs to each other. How is this possible? For this we introduce the notion of garbling the circuit $\mathcal{C}_f$ of $f$; we write $G(\mathcal{C}_f)$ for the garbled circuit. A nice picture of how this process goes is found in these slides! After you have finished with the nice pictures, you can come back to see an example:

Suppose $f$ is a simple function which computes the addition of $2$ bits, with Alice's input being 0 and Bob's input being 1. Alice computes the garbled circuit $G(\mathcal{C}_f)$ and sends it to Bob along with the key corresponding to her garbled input, $k_{A_0}$. In order for Bob to compute the output of $f$ he needs the key corresponding to his own input, $k_{B_1}$, but he doesn't want Alice to learn that input. Luckily, there is a tool called oblivious transfer (OT) which solves exactly this problem: Bob obtains $k_{B_1}$ while Alice learns nothing about which key he asked for.
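To make the mechanics concrete, here is a toy Python sketch of garbling this single-gate circuit, interpreting the 1-bit addition as XOR. The hash-based "encryption" of the table rows and the zero-padding redundancy check (used instead of point-and-permute so that Bob can find the right row) are illustrative simplifications of my own, not Yao's exact construction:

```python
import os, hashlib, random

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def garble_xor_gate():
    # one random label per (wire, bit) pair
    kA = {0: os.urandom(16), 1: os.urandom(16)}        # Alice's input wire
    kB = {0: os.urandom(16), 1: os.urandom(16)}        # Bob's input wire
    table = []
    for a in (0, 1):
        for b in (0, 1):
            plaintext = bytes([a ^ b]) + b"\x00" * 31  # output bit + redundancy
            row = bytes(x ^ y for x, y in zip(H(kA[a], kB[b]), plaintext))
            table.append(row)
    random.shuffle(table)                              # hide which row is which
    return kA, kB, table

def evaluate(table, ka: bytes, kb: bytes) -> int:
    for row in table:
        pt = bytes(x ^ y for x, y in zip(H(ka, kb), row))
        if pt[1:] == b"\x00" * 31:                     # right row: redundancy checks out
            return pt[0]
    raise ValueError("no row decrypted correctly")

# Alice garbles; Bob, who obtained kB[1] via OT, evaluates on kA[0] and kB[1].
kA, kB, table = garble_xor_gate()
print(evaluate(table, kA[0], kB[1]))                   # prints 1, i.e. 0 XOR 1
```

In the full protocol every intermediate wire also carries two labels and the table rows encrypt output-wire labels rather than plaintext bits; since the output wire here is the final output of $f$, encrypting the bit directly is enough for the example.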

Now, for the general case we can do all of this in $5$ rounds of communication: Alice sends $G(\mathcal{C}_f)$ to Bob (1 round), all the OTs are run in parallel (3 rounds) and at the end Bob sends the output back to Alice (1 round). Great, let's go home now... well, not so fast: what if Charlie wants to join the party? Or what if Alice garbles a circuit different from the one they agreed on? Apparently we need something else for $3, 4, \dots$-party computation.

Another approach appeared in 1990 and is called the BMR protocol [2]. Assume we have access to a protocol which generates random shared coins within $O(1)$ rounds of communication. BMR uses this coin generation to construct 'super-seeds' for the wires and bits to mask the parties' inputs. Each output wire label is then one-time-padded using outputs of a PRG seeded by the input wire keys. Treating each of the $4$ possible input combinations of a gate $g$ separately gives rise to one equation per garbled entry $A_g, B_g, C_g, D_g$; for more details we recommend the reader go through the original article [2].

In 2015 the SPDZ-BMR variant [3] was published at Crypto. In a nutshell, it replaces the random coins with calls to the SPDZ randomness functionality, transforming BMR for the first time into a constant-round protocol secure against a dishonest majority. Of course, there are many other security subtleties and optimizations presented in [3], such as using PRFs over prime fields instead of PRGs over binary fields. The SPDZ-BMR offline phase is organized in $2$ steps:
  1. Wire key distribution and input-bit mask generation.
  2. Compute the wire labels $A_g, B_g, C_g, D_g$ of circuit $\mathcal{C}_f$.
The first step of the offline phase is very similar to the SPDZ offline phase for generating triples, bits and squares, which will later be used in the online phase as well as in step 2 of the offline phase! The computation of the wire labels can be seen as evaluating a constant-depth circuit inside MPC.

In the online phase the labels are revealed to all parties and each party then locally computes the outcome of $G(\mathcal{C}_f)$. This all looks nice, but the scheme's major drawback is that its cost is cubic in the number of players, which means $30$ parties is already too many. In such cases, protocols with a GMW-based online phase - like SPDZ - outperform BMR in the low-circuit-depth setting. The latest experiments using [4] show that a Vickrey auction with $100$ parties is practical.

In [5] the depth of the circuit is shrunk to only $3$, at the cost of slightly more computation, using a variant of the Gentry FHE-MPC protocol. Also, the overall time complexity is now quadratic in the number of players.

As Yehuda said in his talk, these approaches to constant-round protocols are like 'low-hanging fruit'. As a researcher, are you ready to pick one by combining different things in a novel way? Can you find a scheme whose complexity is linear in the number of players?

[1] Yao, Andrew C. "Protocols for secure computations." 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982). IEEE, 1982.

[2] Beaver, Donald, Silvio Micali, and Phillip Rogaway. "The round complexity of secure protocols." Proceedings of the twenty-second annual ACM symposium on Theory of computing. ACM, 1990.

[3] Lindell, Yehuda, et al. "Efficient Constant Round Multi-Party Computation Combining BMR and SPDZ." Advances in Cryptology--CRYPTO 2015. Springer Berlin Heidelberg, 2015. 319-338.

[4] Keller, Marcel, Emmanuela Orsini, and Peter Scholl. MASCOT: Faster Malicious Arithmetic Secure Computation with Oblivious Transfer. Cryptology ePrint Archive, 2016. http://eprint.iacr.org/2016/505 .

[5] Lindell, Yehuda, Nigel P. Smart, and Eduardo Soria-Vazquez. "More Efficient Constant-Round Multi-Party Computation from BMR and SHE."