Thursday, August 27, 2015

52 Things: Number 47: What is the Fiat-Shamir transform?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. Having introduced Sigma Protocols, we now look at an important method for ensuring they are zero knowledge. Thanks again to David Bernhard for his assistance.

What is the Fiat-Shamir transform?

Sigma protocols, which we saw last week, are fast and useful protocols for Alice to prove something to Bob - as long as they're both online at the same time. Alice sends Bob a commitment, Bob replies with a challenge and Alice sends a response. Unfortunately bar further modifications, Sigma protocols are not actually known to be zero-knowledge: they are only honest-verifier zero- knowledge.

The Fiat-Shamir transform is a technique to turn a Sigma protocol into a non- interactive proof. This not only lets Alice send a proof to Bob by e-mail, which he can read later without having to send her a challenge back, it also lets her turn any Sigma protocol into a digital signature scheme with which she can assert "someone who knows the secret for this Sigma protocol has signed that message". Alice can create a signature once and post it to a usenet bulletin board and everyone who sees the signed message can check the signature without having to contact Alice. And zero-knowledge comes for free, since Bob or any other reader no longer has to do anything.

Although this technique is explained in Fiat and Shamir's 1986 paper, several eminent cryptographers have pointed out in the past that it was actually given by Blum in an even earlier work, though we have not been able to trace this.

A Sigma protocol can be implemented as four algorithms *Commit*, *Challenge*, *Respond* and *Check*, to be executed as follows:

Alice                                Bob
-----                                -----
    
co,st = Commit(secret,public)
         ---------- co --------->
                                     c = Challenge()
         <--------- c  ----------
r = Respond(st,c)
         ---------- r  --------->
                                         Check(co,c,r)

For the Fiat-Shamir transformation, Alice picks a hash function $H$ and uses it to create the challenge herself:

Alice                                World
-----                                -----

co, st = Commit(secret,public)
c = H(public,co)
r = Respond(st,c)
         ------ co,r ----------->
                                     c = H(public,co)
                                     Check(co,c,r)

If Alice wants to sign a message m, she includes that in the hash: c = H(public, co,m) and posts (m,co,r) as her signed message.

Why does this work? If $H$ were a random function, then the challenge is clearly uniformly random and independent of Alice's public information and commitment. The security analysis considers an Alice who does not have access to the code of $H$ directly, only to an oracle for $H$. In this case, the probability of Alice making a correct response without following the protocol (especially if she does not know the necessary secret) is proportional to the inverse of the size of the range of $H$, that is if $H$ has domain X and range Y then someone without the the secret who makes up to q $H$ -calls has at most a q/|Y| probability of making a r-value that *Check* accepts. Typically |Y| = 2^n for a decently large value of n, so this probability is negligible.

Some people will tell you that this style of analysis, known as the Random Oracle Model (ROM), is deeply flawed because there's an artificial counter- example of a scheme that is secure in the ROM but is insecure for any actual hash function $H$. What this counter-example shows is that if you go to enough effort to make a stupid scheme, you can end up with a stupid scheme. In practice, Fiat-Shamir has been known since at least 1986, used in several practical applications and stands unbroken to this day (if done properly). No- one has to date proposed a workable attack on a Fiat-Shamir transformed Sigma protocol that was not deliberately designed to be stupid, which is more than can be said for quite a few other cryptographic schemes.

Thursday, August 20, 2015

52 Things: Number 46: What do correctness, soundness and zero-knowledge mean in the context of a Sigma protocol?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. This week we enter the final phase, introducing Sigma Protocols as the first 'advanced protocol'. Thanks to David Bernhard for his assistance in writing this blog.

Sigma protocols

Sigma protocols are protocols for Alice to prove something to Bob (that she knows some secret). They have the following general form: Alice knows a secret; Alice and Bob both share some common information. Then:

  1. Alice sends a value to Bob, known as a commitment.
  2. Bob picks a challenge uniformly at random and sends it to Alice.
  3. Alice computes a response and sends it to Bob.
  4. Bob checks the response and accepts or rejects Alice's claim.
Legend has it that if you draw a diagram of this in some way, it looks like the Greek capital letter Sigma, hence the name Sigma protocol.

In cryptography, the properties we expect a Sigma protocol to support are:

  • Correctness If everyone does what they are supposed to, then Bob accepts.

  • Soundness If Alice lies, then Bob can tell (she can't trick him into accepting something false)

  • Zero-knoweldge If Alice tells the truth, Bob can't learn what her secret input was.

A more formal treatment

Having provided a rough overview, we now provide a more formal treatment, based on David's PhD thesis "Zero-Knowledge Proofs in Theory and Practice".

Defining a Sigma Protocol

Let $k$ be a field. We're interested in linear functions $f: W \to X$ from one $k$- vector space to another, where Alice and Bob both know some public $x \in X$ and Alice also knows a secret preimage $w \in W$ such that $f(w) = x$. Alice wants to prove to Bob that she knows a preimage of $x$.

Quite a lot of cryptography is done over elliptic curves nowadays. An elliptic curve is a group of points of the form $P = (x,y)$ and a special point "at infinity" that is an honorary member of the curve. These points satisfy some equations and most importantly, one can add two points on the curve to get another and this addition makes the curve into a group. One often starts with a large prime $p$, works over the base field $k = \mathbb F_p$ and considers points $P$ in $E_p = E \cap \mathbb F_p \times \mathbb F_p$.

Many elliptic curve protocols start out by generating keypairs using point multiplication: everyone agrees on a common, public base point $P$. Then one can pick a secret key $x \in \mathbb F_p$ and compute the related public key $Y = x \cdot P$. If Alice wants to register her public key somewhere, it is good for security if the registrar asks her to prove that she knows the associated secret key, or else she could register someone else's key as her own. But of course Alice should not reveal her secret key to the registrar. She could, for example, use a Sigma protocol to prove that she knows the secret key that she claims to know, without revealing it.

If one takes $W = \mathbb F_p$ as a $k$-vector space and $X = E_p$ then point multiplication for a fixed base point, specifically $f: W \to X, w \mapsto w \cdot P$ is a linear function. The same can be said for "matrix product" functions. Suppose that Alice has a secret key $x$ and a public key $Y = x \cdot P$, and someone sends her an ElGamal ciphertext $(C,D)$ for her key. She wants to decrypt this and prove that she has decrypted it correctly, one way to do this is to compute a decryption share $S = x \cdot C$. The decryption is then $D-S$ which anyone can compute from $(C,D)$ and $S$. So Alice want to show that she knows an $x$ meeting the two equations $Y = x \cdot P$ and $S = x \cdot C$, so we set $W = k, X = E_p \times E_p$ and $f(x) = (x \cdot P, x \cdot C)$ which is a linear function $f: W \to X$.

The Sigma protocol for $f: W \to X$ is the following construction. Alice knows $(x, w) \in X \times W$ with $f(w) = x$ and Bob knows $x$. 1. Alice picks $r \in W$ at random, sets $A = f(r)$ and sends $A$ to Bob. This is the commitment (Alice is committing to $r$). 2. Bob picks $c \in k$ at random and sends it to Alice. This is the challenge. 3. Alice computes the response $s = r + c \cdot w$ in $W$ (the product is scalar multiplication in $W$) and sends $s$ to Bob. 4. Bob accepts if $f(s) = A + c \cdot x$ in $X$.

Let's look at the three security properties for Sigma protocols in more detail:

Correctness

In just about any protocol (cryptographic or otherwise), correctness means that if everyone follows the protocol then it does what it should. In the context of Sigma protocols, this means that if Alice and Bob do as told then Bob accepts; this is true because $f$ is linear.

Soundness

Soundness means that Alice cannot prove a false statement. This trips up a lot of people because the first protocol they see is Schnorr's protocol for proving that $y = x \cdot P$, so Alice is proving that such an $x$ exists. But that is obvious! (Alice is also proving that she knows $x$, which is more interesting but that's another property.) But let's look at the other example, Alice proving that $S$ is a correct decryption share for $C$ under public key $Y$. Here, Alice is proving that an $x$ exists such that $Y = x \cdot P$ and $S = x \cdot C$ which is not true for all tuples $(P, C, Y, S)$. What's actually going on is that the image of $f$ is a one-dimensional subspace of the two-dimensional $k$- vector space $X$. In our formalism, soundness means that Bob does not accept (except with perhaps negligible probability) unless $x$ is in the image of $f$ (that is, a preimage $w$ exists such that $x = f(w)$).

Sigma protocols are sound. Actually, they are even more than that, they have a property called special soundness. Informally, consider the point in time when Alice has just sent her commitment $A$ to Bob. For which values of $c$ can Alice possibly find a $r$ that Bob will accept? If there is at most one such value then Alice gets Bob to accept overall with probability at most $1/|k|$. Usually, $|k|$ is exponentially large (it's the prime $p$ that we started with) so $1/|k|$ is negligibly small. Special soundness says that if Alice can conveince Bob even for only two out of $|k|$ possible challenges, then a preimage $w$ must exist. Suppose that on challenge $c$ Alice would reply $s$ and on challenge $c' \neq c$ Alice would reply $s'$, and Bob would accept both of these. Then set $d = (c-c')^{-1}$ which we can do as $k$ is a field and $c \neq c'$, and a bit of linear algebra shows that $w = d \cdot (s-s')$ where the product is again $W$-scalar multiplication satisfies the equation $x = f(w)$.

Zero-Knowledge

Bob is happy because the protocol is sound. Alice still needs to know that Bob can't learn $w$ from the Sigma protocol. Actually, Alice probably wants even more than that: after running the protocol with Bob, Bob should not be able to prove to Charlie that he knows Alice's secret $w$ either. Zero-knowledge says even more than that, Bob does not learn anything from the protocol except that Alice knows $w$ (and therefore, that $w$ exists and $x$ is in the image of $f$).

The proof that Sigma protocols are zero-knowledge ... does not exist! Contrary to what one might guess after learning about Sigma protocols in the textbook chapter on zero-knowledge, Sigma protocols in general are not zero-knowledge, which beginning students of cryptography would do well to remember for their exams. (They meet a weaker requirement called honest-verifier zero-knowledge.)

Discussing Sigma protocols in the context of zero-knowledge isn't completely arbitrary though: one can make them zero-knowlege in several ways, the most practical of which is making them non-interactive. But that's next week's topic...

Friday, August 14, 2015

52 Things: Number 45: Describe some basic (maybe ineffective) defences against side channel attacks proposed in the literature for RSA.


This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. This week we consider what can be done to mitigate the threat of side-channels against RSA implementations...


To keep things simple in this post, we'll talk about so-called "vanilla" RSA (where no randomness is used in encryption) and highlight a small number of potential side-channel attacks along with countermeasures.

Let's start by recalling the vanilla RSA encryption scheme.
Key generation picks a secret pair of odd prime integers, $p$ and $q$, and computes the modulus $N = pq$. An integer $3 \leq e \leq \phi(N)$, coprime to $\phi(N)$, is chosen. Then the public key is the pair $\left(N, e\right)$ and the secret key is the unique integer $3 \leq d \leq \phi(N)$ such that $ed \equiv 1 \: \left( \mathrm{mod} \: \phi(N) \right)$.
To encrypt a message $m \in \mathbb{Z}_N$, one computes $m^e \: \left( \mathrm{mod} \: N \right)$.
To decrypt a ciphertext $c \in \mathbb{Z}_N$, one computes $c^d \: \left( \mathrm{mod} \: N \right)$.

An SPA-Style Attack to Determine the Secret Key

We first give an example of how information about the secret key $d$ could be leaked during the decryption operation. A typical implementation of exponentiation will be a branching program that behaves differently depending on the inputs. Consider, for example, the square and multiply algorithm which efficiently computes an exponentiation where the exponent is expressed in binary:

Say $d = \sum_{0\leq i \leq n} b_i 2^i$ where each $b_i \in \lbrace 0,1 \rbrace$. Then,
\[ c^d = \prod_{0 \leq i \leq n} c^{b_i 2^i }.\] Noting that, if we ignore the bits $b_i$, each factor in the product is the square of the previous factor, we can easily compute the product as follows:
  • $ANS \leftarrow 1$
  • $fac \leftarrow c$
  • For $0 \leq i \leq n$, do
    • If $b_i = 1$,
      • $ANS \leftarrow ANS \times fac$
      • $fac \leftarrow fac^2$
    • Else,
      • $fac \leftarrow fac^2$
  • Return $ANS$.

The algorithm behaves differently depending on whether each $b_i$ is $0$ or $1$. Therefore, if this algorithm is used to decrypt an RSA ciphertext, the time it takes or its power consumption could reveal the value of each bit $b_i$, thus revealing the secret key $d$. This would be an SPA-style attack since only one trace would be needed.

In order to prevent this kind of attack, one must make the two branches of the algorithm look the same to an attacker, i.e. have both branches of the square and multiply algorithm take the same amount of time to run or consume the same amount of power.

An SPA-Style Attack to Determine the Plaintext

The above shows how the exponentiation operation used in decryption could compromise the secret key. But an attacker may be interested in a particular plaintext, $m$ (after all, encryption is designed to keep such messages secret). Again, if the encryption operation is a branching program which depends on the value of $m$, the runtime or power consumption of a single encryption could leak information about $m$ in an SPA-style attack. In particular, note that one has to perform a modular reduction in encryption. In most implementations, instead of a single reduction of a very large integer at the end of the exponentiation, there will be many modular reductions throughout the exponentiation algorithm in order to keep the numbers involved (relatively) small. As a slightly contrived example, suppose we perform modular reduction via the following loop:

  • While $ANS > N$
    • $ANS \leftarrow ANS - N$
which, since the exponent is known in the case of encryption, leaks information about the base $m$ depending on how long it takes to run and how much power it consumes (cf. David's post about attacks on Montgomery Arithmetic).

Again, to prevent this kind of attack we must ensure that our programme takes the same amount of time and consumes the same amount of power to reduce intermediate values modulo $N$, no matter what size they are (up to some upper bound which can easily be found since we know the exact exponent and range of values for the base).

Preventing DPA-Style Attacks on the Secret Key

Even if we obscure any branching in decryption that depends on $d$, the precise details of the operations performed in carrying out the exponentiations for decryption will still depend (in a less obvious way) on the exponent. So, over multiple decryptions, a statistical relationship between the decryption key and the duration or power consumption of the operations may emerge. Therefore we also need to prevent more subtle DPA-style attacks where the attacker uses statistical techniques on a large number of traces to test hypotheses on the secrect key.

To do this, we have to remove the direct dependency between the secrect key and the calculation performed each time. This involves blinding, where some random noise is injected into the exponentiation operation without affecting the result. In the case of decryption, we introduce randomness in the exponent: while $d$ is the unique multiplicative inverse of $e$ in $\mathbb{Z}_{\phi(N)}$ such that $3 \leq d \leq \phi(N)$, we can add or subtract integer multiples of $\phi(N)$ to $d$ and obtain a new inverse. So to decrypt $c \in \mathbb{Z}_N$, select a random $r \in \mathbb{Z}$ and compute $d' = d + r\phi(N)$. Then compute the message $c^{d'} \: \left( \mathrm{mod} \: N \right)$ which will be the same as directly computing $c^d \: \left( \mathrm{mod} \: N \right)$. The point is that addition is not usually a branching operation, so adding $r\phi(N)$ to $d$ will not leak information about $d$ on a single trace, and using a new random $r$ for each decryption prevents DPA-style attacks.

Coppersmith's SPA-Style Attack to Determine the Plaintext

We conclude this post with a special and interesting attack to determine $m$ which is only possible when $e$ is small ($e=3$ is a popular choice of exponent for efficiency of encryption). There is a theorem due to Coppersmith (see this article) that an attacker can efficiently find all 'small' integer roots modulo $N$ of an integer polynomial $f$ of degree $e$, where small essentially means having absolute value less than $N^{1/e}$. Obviously if $m$ happens to be small then one can use this to directly solve the equation $m^e \equiv c \: \left (\mathrm{mod} \: N \right)$ as required to recover the plaintext. If not, but some of the most significant bits of $m$ are leaked, then we may write $m = m_k + m_u$ where $m_k$ is known and $m_u$ is small, obtaining an integer polynomial $f(X) = \left( m_k + X \right)^e - c$ whose small roots modulo $N$ can be found via the Coppersmith method and correspond to the remaining bits of $m$. So we need to make sure that bits of $m$ are not leaked by the encryption operation.

To counter this kind of attack, one again uses blinding: we introduce some randomness to $m$ before exponentiating and remove it again afterwards. More precisely, to encrypt $m \in \mathbb{Z}_N$, we select a random $r \in \mathbb{Z}_N$, compute $rm \: \left( \mathrm{mod} \: N \right) $, then $\left(rm\right)^e \: \left( \mathrm{mod} \: N \right)$ and finally multiply the result by $r^{\phi(N)-e}$ (and reduce modulo $N$ again). Obviously the ciphertext is the same as it would have been without blinding, but the leakage of the exponentiation operation is now independent of $m$.

This should give you a flavour of the kind of side-channel attacks that can be mounted on an encryption scheme and the way implementations can avoid them.

Friday, August 7, 2015

52 Things: Number 44: Describe some basic (maybe ineffective) defences against side channel attacks proposed in the literature for ECC.

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. This week we consider what can be done to mitigate the threat of side-channels against  ECC implementations...

In this blog post we will discuss "some basic (maybe ineffective) defences against side channel attacks proposed in the literature for ECC". This can be seen as a complement to last weeks blog which asked the same question for AES.

Before we start the discussion, I want to clarify what kind of countermeasures we will be considering. From this point forward we will only be considering implementation level countermeasures, I will not consider hardware countermeasures such as Dual Rail Logic, or location security such as putting it in a concrete box. While the title says "maybe ineffective" I will stick to designs that at least have some hope of working, for example wearing a tinfoil hat will not secure my credit card and will clearly not work and so will not be discussed.

Elliptic curve cryptography as a rule is reasonably good when it comes to resisting side channel attacks but there are still some points that are worth considering.

Scalar Multiplication
As with most cryptography scalar multiplication (normally exponentiation in other schemes) is a very leaky operation, this is well studied in RSA. This is no different in elliptic curve cryptography because the addition operator and the double operator behave differently. Various techniques that can be applied to RSA can also be applied here, such as exponent blinding where for each scalar multiplication you choose a value $r$ such that $[a]P=[a+r]P$ where $a$ is the value you require to keep secret and $P$ is a generator of the curve. Since scalar multiplication only leaks information about the scalar this technique only needs to be applied when you want to keep the scalar secret. Recently there has been work to create elliptic curves which have the same operation for double and add which would resolve this issue.

Is a point on the curve?
Sometimes an $x$ value is chosen and to learn if it is on the curve you use the Jacobi symbol to learn if $x^3+a\cdot x+b$ is square. If it is $(x,y)$ is an elliptic curve point. As can be seen by the algorithm in the link, the process of calculating the Jacobi symbol is variable length and thus may leak information about the secret value $x^3+a\cdot x+b$. Since we are only interesed if $x^3+a\cdot x+b$ is square, we note that $x^3+a\cdot x+b$ is square if and only if $r^2\cdot(x^3+a\cdot x+b)$ is, for random $r$. Using this technique we can check if $x$ is a valid point on the curve but since it has been blinded by a random $r$ this will not leak anything about the underlying point.

Theoretically secure
While against known side channel attacks elliptic curves are reasonably secure without much help, it is possible to secret share certain schemes to enhance the security. Providing that each share leaks independently it is possible to create schemes which are provably secure against arbitrary leakage functions (including ones which can only happen in theory and not in practice). This area of cryptography has become known as Leakage Resilient Cryptography.

Thursday, July 30, 2015

52 Things: Number 43: Describe some basic (maybe ineffective) defences against side channel attacks proposed in the literature for AES

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. This week we consider what can be done to mitigate the thread of side-channels against AES implementations...
Sidechannel defences: Why?
For a modern cryptographic scheme to be taken seriously, we generally require some form of security justification. In the case of AES, we believe it acts like a random permutation if the adversary doesn't know the key. However, if the adversary has side-channel information, it is possible that this is no longer the case. So, what can we do about it?  Ideally we would create an implementation that is completely impervious to side-channel attacks. However, this effectively means the implementation must sit in total isolation, with absolutely no output streams - which makes it rather pointless!
Perhaps we can make sure that whatever we do, it doesn't matter if the AES implementation leaks information through side channels? This leads to the field of leakage resilient cryptography, which is a very strong security requirement indeed. As such, schemes secure in these conditions (of which there are very few) tend to be drastically less efficient than those which avoid (/ignore) the problem. Since trade-offs must always be made in design, in practice we tend to use schemes that assume AES doesn't leak any information, and combine them with implementations that contain defences against some of the simpler side-channel attacks. The hope is that this pushes the attack cost beyond the value of the information secured and so (even though they could) no adversary will bother attacking the system because it is no longer viable.

Some Basic Defences
So, with that in mind, let us consider a couple of basic defences against some of the less sophisticated side-channel attacks. As pointed out in the question, these techniques may be easily bypassed, so think of this post as explaining the general concept, rather than providing any sensible suggestions!

Attack: Timing Attack
Weakness: Some implementations run in different times depending on their inputs. As such, by observing how long the system takes to respond one learns something about the key/inputs.
Defence: Constant time implementations. As the title says, the best defence against a timing attack is to make sure the implementation takes constant time, and most implementations will be constant-time nowadays. This might not be too hard in hardware, but in software it can be very difficult indeed, because the microcode (the internal programming of the processor) is generally a trade secret and open to change.

Attack: Power Analysis (DPA,SPA)
Weakness: The power usage of some implementations is correlated with the key material, often due to the hamming distance when storing values. For more information, read the blog from two weeks ago.
Defence (1): Masking Rather than using the AES Sbox directly, one applies a mask to the input value, and looks this up in a masked Sbox (which is effectively the Sbox with its values reordered to accommodate the mask).  Then, even though the attacker may be able to detect correlations between certain internal variables, these are all masked, and do not [directly] correspond to the key material as they had previously. More complex masking schemes will be more complex to instantiate, but lead to better resistance to attack.
Defence (2): Shuffling To conduct a power analysis attack, the attacker uses the fact they know the internal structure of the AES scheme. If we shuffle the order of the S-boxes in our implementation (by some secret permutation), the adversary will not know how their readings correspond to the internal key materials. A variation on this is the deliberate use of non-determinism, allowing the processor to reorder certain collections of instructions on it's own.


Attack: Cache-flooding
Weakness: Implementations using lookup-tables (eg for the SBox) will be more or less efficient if the appropriate cells are already in the processors cache. By pushing most of the lookup table out of the cache, an attacker can observe whether the appropriate cells are called, leaking information. Can also be observed as a timing attack or by power analysis if the cost of loading the cache can be observed.
Defence: Dont use lookup tables on secret data! The simplest defence in this list - if you don't want to leak information about which lookup entries have been used, don't use lookup tables. In the case of AES this is reasonable, because the AES Sbox can actually be computed as a simple function on the input byte. This would be much less practical with (for example) the DES Sbox, which does not have this structure.






Friday, July 24, 2015

52 Things: Number 42: Look at your C code for Montgomery multiplication above; can you determine where it could leak side channel information?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know' to do Cryptography: a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. In this post (spoiler alert!) we enumerate various flavours of side-channel attacks.

A few months ago (back in March) you may remember I posted for the 23rd of the 52 things in this series (can be found here). The article was entitled “Write a C program to implement Montgomery arithmetic” and contained parts of an implementation to do this. In this article we are going to look at this implementation and see how it could leak information to give us a practical idea of what leakage may look like.

Before progressing however I would like to remind you of my last post which examined the difference between SPA and DPA attacks. You will remember from there that SPA attacks use a single or very few traces and work by spotting trends in the pattern (such as timing or instruction sequences) while DPA attacks use many traces and aim to derive intermediate vales of an algorithm and thus the secret information by using hypotheses of the secret data and a correct leakage model. Before looking at the Montgomery multiplication algorithm then it is worth stating from the outset that if hypotheses of secret data and corresponding leakage models can be derived for the algorithm, DPA style attacks can be used to derive intermediate values which will mean that the algorithm will leak data being processed. If this data is therefore secret, we can already say that this algorithm is going to leak through using DPA style attacks.

As mentioned in my last post however, we know that DPA style attacks are significantly harder than SPA as they require the ability to form hypotheses of secret data, more traces and their success will depend significantly on the noise produced by the device. The question then is how our implementation will leak using SPA attacks where things such as timing or sequences of instructions being executed can be exploited. To understand this lets look at each of the four stages I presented for the algorithm last time separately. Things we are interested in are things such as conditional statements or loops as this may be exploited for timing attacks.

1. The GCD Operation

This section uses the binary extended gcd operation to find the integers r-1 and m’ such that rr-1 = 1 + mm’. If we assume therefore that our algorithm for the extended gcd operation runs in constant time then we can assume that this operation is safe.

 

2. Transform the Multipliers

This stage computes abar = ar mod m and bbar = br mod m and therefore as this is a simple operation, it is unlikely to leak provided the operations and instructions required (such as the multiplier) run in constant time.

 

3. Montgomery Multiplication

This is the main section of the algorithm. The first sub-stage to of the function is to calculate t = abar*bbar and in the same way as the previous two stages we can assume no leakage.

The second stage however is the computation of u = (t + (tm’ mod r)m)/r and it is here that we encounter two conditional statements the first i:

if (ulo < tlo) uhi = uhi + 1;

which allows for a carry as we have a 128 bit implementation on a 64 bit architecture (see article for more information). Observing the time of execution or a power trace will therefore give insight into whether or not this conditional was executed and therefore whether or not these ulo was higher than tlo.

Again we have a second conditional statement which tests if u >= m.

if (ov > 0 || ulo >= m)
ulo = ulo - m;

This will also reveal whether either of these conditions is true and thus leak information about the values being worked on.

4. The Inverse Transformation

Here we compute ur-1 mod m which is the product of a and b modulo m. In the same way as stages 1 and 2 did not leak we can also say that this stage will not leak.

Friday, July 17, 2015

52 Things: Number 41: Are all side-channels related to power analysis?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know' to do Cryptography: a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. In this post (spoiler alert!) we enumerate various flavours of side-channel attacks.

Right, shall we keep this one simpleSide-Channel Attacks (SCA) utilise information acquired in the context of physical implementations of cryptographic algorithms, as opposed to classical cryptanalytic attacks which target theoretical weaknesses. Power analysis is perhaps the most popular flavour of SCA, but it is certainly not the only one.

It would be beyond the scope of this blog to provide a comprehensive list of all side-channels and attack methodologies. Instead, here are some of the most common SCA targets, together with references to simple but clever attacks:

  • Power consumption. The instantaneous power consumption of a device can leak information about the  processed value, e.g. its Hamming weight. I recommend Mangard's attack on the AES key schedule as a starting point for those interested in Simple Power Analysis (SPA). 
  • Electromagnetic radiation. Apparently, it's rather tricky to get the measurements right for this one, but once that's done -- the attack methodology is similar to Power Attacks. Here's the most cited paper dealing with EMR.
  • Other. There is no limit to what can constitute a target for SCA, and that's in part why they are so interesting. Here's some more ideas: 
    • an attack that uses visible light emitted from computer LED (light-emitting diodes),
    • an attack exploiting error messages, also known as padding oracle attack

Writing the above, I realise how it could be unsettling to know or to find out that there are so many loopholes. SCA is very much a cat-and-mouse game, and researchers usually recommend ways to avoid the signaled vulnerabilities.