Thursday, July 30, 2015

52 Things: Number 43: Describe some basic (maybe ineffective) defences against side channel attacks proposed in the literature for AES

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. This week we consider what can be done to mitigate the thread of side-channels against AES implementations...
Sidechannel defences: Why?
For a modern cryptographic scheme to be taken seriously, we generally require some form of security justification. In the case of AES, we believe it acts like a random permutation if the adversary doesn't know the key. However, if the adversary has side-channel information, it is possible that this is no longer the case. So, what can we do about it?  Ideally we would create an implementation that is completely impervious to side-channel attacks. However, this effectively means the implementation must sit in total isolation, with absolutely no output streams - which makes it rather pointless!
Perhaps we can make sure that whatever we do, it doesn't matter if the AES implementation leaks information through side channels? This leads to the field of leakage resilient cryptography, which is a very strong security requirement indeed. As such, schemes secure in these conditions (of which there are very few) tend to be drastically less efficient than those which avoid (/ignore) the problem. Since trade-offs must always be made in design, in practice we tend to use schemes that assume AES doesn't leak any information, and combine them with implementations that contain defences against some of the simpler side-channel attacks. The hope is that this pushes the attack cost beyond the value of the information secured and so (even though they could) no adversary will bother attacking the system because it is no longer viable.

Some Basic Defences
So, with that in mind, let us consider a couple of basic defences against some of the less sophisticated side-channel attacks. As pointed out in the question, these techniques may be easily bypassed, so think of this post as explaining the general concept, rather than providing any sensible suggestions!

Attack: Timing Attack
Weakness: Some implementations run in different times depending on their inputs. As such, by observing how long the system takes to respond one learns something about the key/inputs.
Defence: Constant time implementations. As the title says, the best defence against a timing attack is to make sure the implementation takes constant time, and most implementations will be constant-time nowadays. This might not be too hard in hardware, but in software it can be very difficult indeed, because the microcode (the internal programming of the processor) is generally a trade secret and open to change.

Attack: Power Analysis (DPA,SPA)
Weakness: The power usage of some implementations is correlated with the key material, often due to the hamming distance when storing values. For more information, read the blog from two weeks ago.
Defence (1): Masking Rather than using the AES Sbox directly, one applies a mask to the input value, and looks this up in a masked Sbox (which is effectively the Sbox with its values reordered to accommodate the mask).  Then, even though the attacker may be able to detect correlations between certain internal variables, these are all masked, and do not [directly] correspond to the key material as they had previously. More complex masking schemes will be more complex to instantiate, but lead to better resistance to attack.
Defence (2): Shuffling To conduct a power analysis attack, the attacker uses the fact they know the internal structure of the AES scheme. If we shuffle the order of the S-boxes in our implementation (by some secret permutation), the adversary will not know how their readings correspond to the internal key materials. A variation on this is the deliberate use of non-determinism, allowing the processor to reorder certain collections of instructions on it's own.


Attack: Cache-flooding
Weakness: Implementations using lookup-tables (eg for the SBox) will be more or less efficient if the appropriate cells are already in the processors cache. By pushing most of the lookup table out of the cache, an attacker can observe whether the appropriate cells are called, leaking information. Can also be observed as a timing attack or by power analysis if the cost of loading the cache can be observed.
Defence: Dont use lookup tables on secret data! The simplest defence in this list - if you don't want to leak information about which lookup entries have been used, don't use lookup tables. In the case of AES this is reasonable, because the AES Sbox can actually be computed as a simple function on the input byte. This would be much less practical with (for example) the DES Sbox, which does not have this structure.






Friday, July 24, 2015

52 Things: Number 42: Look at your C code for Montgomery multiplication above; can you determine where it could leak side channel information?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know' to do Cryptography: a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. In this post (spoiler alert!) we enumerate various flavours of side-channel attacks.

A few months ago (back in March) you may remember I posted for the 23rd of the 52 things in this series (can be found here). The article was entitled “Write a C program to implement Montgomery arithmetic” and contained parts of an implementation to do this. In this article we are going to look at this implementation and see how it could leak information to give us a practical idea of what leakage may look like.

Before progressing however I would like to remind you of my last post which examined the difference between SPA and DPA attacks. You will remember from there that SPA attacks use a single or very few traces and work by spotting trends in the pattern (such as timing or instruction sequences) while DPA attacks use many traces and aim to derive intermediate vales of an algorithm and thus the secret information by using hypotheses of the secret data and a correct leakage model. Before looking at the Montgomery multiplication algorithm then it is worth stating from the outset that if hypotheses of secret data and corresponding leakage models can be derived for the algorithm, DPA style attacks can be used to derive intermediate values which will mean that the algorithm will leak data being processed. If this data is therefore secret, we can already say that this algorithm is going to leak through using DPA style attacks.

As mentioned in my last post however, we know that DPA style attacks are significantly harder than SPA as they require the ability to form hypotheses of secret data, more traces and their success will depend significantly on the noise produced by the device. The question then is how our implementation will leak using SPA attacks where things such as timing or sequences of instructions being executed can be exploited. To understand this lets look at each of the four stages I presented for the algorithm last time separately. Things we are interested in are things such as conditional statements or loops as this may be exploited for timing attacks.

1. The GCD Operation

This section uses the binary extended gcd operation to find the integers r-1 and m’ such that rr-1 = 1 + mm’. If we assume therefore that our algorithm for the extended gcd operation runs in constant time then we can assume that this operation is safe.

 

2. Transform the Multipliers

This stage computes abar = ar mod m and bbar = br mod m and therefore as this is a simple operation, it is unlikely to leak provided the operations and instructions required (such as the multiplier) run in constant time.

 

3. Montgomery Multiplication

This is the main section of the algorithm. The first sub-stage to of the function is to calculate t = abar*bbar and in the same way as the previous two stages we can assume no leakage.

The second stage however is the computation of u = (t + (tm’ mod r)m)/r and it is here that we encounter two conditional statements the first i:

if (ulo < tlo) uhi = uhi + 1;

which allows for a carry as we have a 128 bit implementation on a 64 bit architecture (see article for more information). Observing the time of execution or a power trace will therefore give insight into whether or not this conditional was executed and therefore whether or not these ulo was higher than tlo.

Again we have a second conditional statement which tests if u >= m.

if (ov > 0 || ulo >= m)
ulo = ulo - m;

This will also reveal whether either of these conditions is true and thus leak information about the values being worked on.

4. The Inverse Transformation

Here we compute ur-1 mod m which is the product of a and b modulo m. In the same way as stages 1 and 2 did not leak we can also say that this stage will not leak.

Friday, July 17, 2015

52 Things: Number 41: Are all side-channels related to power analysis?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know' to do Cryptography: a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. In this post (spoiler alert!) we enumerate various flavours of side-channel attacks.

Right, shall we keep this one simpleSide-Channel Attacks (SCA) utilise information acquired in the context of physical implementations of cryptographic algorithms, as opposed to classical cryptanalytic attacks which target theoretical weaknesses. Power analysis is perhaps the most popular flavour of SCA, but it is certainly not the only one.

It would be beyond the scope of this blog to provide a comprehensive list of all side-channels and attack methodologies. Instead, here are some of the most common SCA targets, together with references to simple but clever attacks:

  • Power consumption. The instantaneous power consumption of a device can leak information about the  processed value, e.g. its Hamming weight. I recommend Mangard's attack on the AES key schedule as a starting point for those interested in Simple Power Analysis (SPA). 
  • Electromagnetic radiation. Apparently, it's rather tricky to get the measurements right for this one, but once that's done -- the attack methodology is similar to Power Attacks. Here's the most cited paper dealing with EMR.
  • Other. There is no limit to what can constitute a target for SCA, and that's in part why they are so interesting. Here's some more ideas: 
    • an attack that uses visible light emitted from computer LED (light-emitting diodes),
    • an attack exploiting error messages, also known as padding oracle attack

Writing the above, I realise how it could be unsettling to know or to find out that there are so many loopholes. SCA is very much a cat-and-mouse game, and researchers usually recommend ways to avoid the signaled vulnerabilities.






Monday, July 13, 2015

Parallel Key Enumeration

Now available on eprint is: "How to Enumerate Your Keys Accurately and Efficiently After a Side Channel Attack".

This paper by Bristol Crypto shows how the key ranking and key enumeration problem can be framed as an optimisation problem, the knapsack problem, to allow extremely quick and efficient solving. The paper details a novel graph algorithm which enables the ranking of keys accurately and nearly instantly, and provides best in class performance results for key enumeration. The algorithm can easily be fully parallelised, with excellent scaling results as more processors are used.

For full details, see the associated blog article on the Silent website, here. The parallel key ranking algorithm has been developed into a public API, which is available on the tools page.

Friday, July 10, 2015

52 Things: Number 40: What is normally considered the difference between SPA and DPA?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. We continue with our side-channel track and discuss the differences between a side-channel attack and a fault attack.

Power or Electromagnetic (EM) analysis attacks divide into two types of attacks, Simple Power Analysis (SPA) or Differential Power Analysis (DPA). Both of these types of attack work using either electromagnetic or power traces of a device but vary fundamentally in the number of power of traces they require and how these traces are analysed. Before examining the differences between these attacks, it is worth looking at what a power/EM trace is.

Power traces

The power of CMOS circuits can either be static or dynamic. Static power consumption is the power consumed when the circuit is static (ie. no switching is taking place) and is typically small. Dynamic power consumption is the power consumed by the switching that occurs in the circuit between 0 and 1 or 1 and 0. Dynamic power consumption is typically the greatest contributor to power consumption in a circuit and as it depends on the data being processed by the circuit. Dynamic power consumption comes from two factors. The first is the capacitance charging current and the second is the short-circuit current. Each CMOS cell has a load capacitance connected to the output of the cell. This load capacitance includes the wires that connect the cells to subsequent cells and also the input capacitances of the cells. 

A CMOS cell draws current from the power rail $V_{dd}$ to charge these capacitances which in turn leads to power consumption according to $P = \alpha f C_l V_{dd}^2$, where $\alpha$ is the average number of $0 \rightarrow 1$ transitions which occur every clock cycle. \cite{dpabook}. This charging happens when there is a switch from $0 \rightarrow 1$ at the output. When there is a switch from $1 \rightarrow 0$, the current is drawn from $C_L$ to $gnd$ via the NMOS and not $V_{dd}$. The second part of the contribution to power consumption is the short circuit current. This occurs during a switch when it is moving from $0 \rightarrow 1$ and $1 \rightarrow 0$ and occurs when both the pmos and nmos transistors are conduction at the same time - though this happens very briefly. This power consumption is in accordance with $P_{sc} = \alpha f V_{dd} I_{peak} t_{sc}$, where $I_{peak}$ is the current peak during switching and $t_{sc}$ is the time for which the short circuit exists. [1].

Understanding these two points of dynamic power consumption we can see that all switching $1 \rightarrow 0$ and $0 \rightarrow 1$ will consume power through the short circuit current, however switching from $0 \rightarrow 1$ will consume more power due to the charging of the load capacitance. If we are able to measure the power consumption (or EM field as current of varying strength will produce an EM field of equal variance allowing the measurement of the EM field to give a measurement of power consumption) of a device accurately we can therefore determine the number of switches which will let us look inside the device in two ways. First by allowing us to determine a particular operation (a multiplier for instance may require more switches than an x-or gate for instance) and secondly, and more crucially, the data being operated on by the operation, as this may affect the switching.

SPA and DPA Attacks

The main difference between SPA attacks and DPA attacks is the number of traces required. SPA attacks typically use one or very few traces whereas DPA attacks use many. They also vary in the way they exploit the dynamic power consumption of the device with SPA attacks identifying sequences of operations, however they can also exploit data dependency as in the case of templating attacks for instance. This is illustrated by the well known SPA attack on the square and multiply algorithm for binary expansion in RSA. Here, if the binary value in the exponent is 0, the value is squared and if it is a 1 then the value is squared and multiplied. Viewing this on a single trace it is possible to see the shape of a square operation and the shape of a square and multiply operation and thus read of each bit of the key as a 0 or a 1. The beauty of this attack is that only a single trace is required to make this observation, making it an SPA attack.
DPA attacks on the other hand exploit only the data dependency element of the power consumption by using multiple traces and statistical techniques. They focus on the data dependency of the power consumption and work by creating hypotheses of how much switching (and therefore change in power consumption) there will be for given data. These hypotheses are known as leakage models and are usually hamming weight or hamming distance. If this leakage model is correct, the power traces should reveal information being processed according to it, although in reality this is always combined with noise which distorts the data/power relationship. In DPA attacks, secret data values being operated on can be determined by estimating them and seeing if the representation of them according to a leakage model correlates to a number of different power traces. A DPA attack therefore requires a number of traces – the number can vary between as few as 50 to thousands depending on the level of noise and accuracy of the measurements.
[1] Mangard, Stefan, Elisabeth Oswald, and Thomas Popp. Power analysis attacks: Revealing the secrets of smart cards. Vol. 31. Springer Science & Business Media, 2008.

Thursday, July 9, 2015

52 Things: Number 39: What is the difference between a side-channel attack and a fault attack?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. We continue with our side-channel track and discuss the differences between a side-channel attack and a fault attack.

Side-channel attacks (SCA) are a class of attacks in which an adversary attempts to deduce information about a targets computation by observing side-channel leakage (e.g. timing, power consumption, electromagnetic emanations, acoustic noise etc.).

Fault attacks (FA) are a class of attacks in which an adversary attempts to exploit the result of a faulty computation. The fault may be the result of a program/design bug (e.g. Intel's famous FDIV bug), or a fault induced directly by the adversary (e.g. power glitch, clock glitch, temperature variation, ion-beam injection etc.). 'The Sorcerer’s Apprentice Guide to Fault Attacks' gives a good overview for a few of the more hands-on techniques for fault injection.

To sum up, the primary difference between the two is that SCAs exploit computation leakage and FAs exploit the result of erroneous computation. It's not a very insightful (or interesting) answer so perhaps a better question is to 'Discuss some similar approaches to side-channel and fault attacks'.

I'll discuss examples for three sub-classes for both SCAs and FAs: non-invasive, semi-invasive and invasive. Just as a side note, there may be some disagreement on the exact classification of non-invasive, semi-invasive and invasive attacks. These are my opinions and if you disagree, I invite you to take it up with my publisher.

Non-Invasive:

An attack is classed as non-invasive if the adversary has no physical contact with the target. I give solitary examples below but there are more out there.

SCA: Timing attacks are arguable both the most applicable and the least invasive SCA vector. This can be partly attributed to the fact that timing attacks can be carried out remotely but also because they are easily introduced. Consider a large library such as OpenSSL that supports numerous platforms and a whole suite of crypto applications. Ensuring that all sensitive computation is programmed to be constant time for each individual platform quickly becomes cumbersome. Add to this that you will be fighting the compilers efforts to optimise your code makes it an uphill battle.

FA: Non-invasive fault attacks are somewhat less common as it requires the faulty behaviour to be triggered based on the input. One could possibly build an attack using the aforementioned Intel FDIV bug which will return an incorrect division result for one in every 9 billion random inputs. I can't seem to find any reference to a concrete attack so we'll take it on faith that it may be possible.

Semi-Invasive:

An attack is classed as semi-invasive if the adversary has limited physical contact with the target. I give solitary examples below but there are more out there.

SCA: In a power analysis attack the adversary monitors the power consumption of a the target device during the computation of a secret datum. The theory behind the attack is that the dynamic power consumption will be (in some way) related to the data being processed. If the adversary can approximate this relationship then they may also be able to make deductions about the data being processed. This is arguably semi-invasive as the adversary would require a power tap to the target device which is not always available and so the adversary would need to modify the target.

FA: Clock and power glitch attacks aim to induce faulty behaviour in a target device by altering the target environment. Consider the clock input to a chip, this clock governs the speed at which the target device operates. The maximum clock rate of a device is bounded by the 'critical path' of the target which is the longest amount of time required for a combinatorial circuit to reach a stable state. Violation of this bound will trigger race conditions in the device and hence result in undefined behaviour. This makes glitch attacks somewhat unpredictable and sometimes hard to reproduce but incredibly effective!

Invasive:

An attack is classed as invasive if the adversary has unlimited resources and access the target.

SCA: Probing SCAs use a direct tap on the data bus of a target device allowing an adversary to (almost directly) read off any information that goes across the bus. The target must be fully decapsulated and carefully examined to precisely target the secret data.

FA: Probing FAs allow an adversary to completely change the behaviour of the target device. Once again, the target must be fully decapsulated and mapped out to accurately influence the behaviour but once this is done, adversary has the ability to re-wire the target and consequently alter its operation.

Both SCA and FA have been demonstrated on real-world devices to devastating effect. There are, of course, several countermeasures devised to mitigate for these attacks but these will come up later in our '52 things' series.

Monday, July 6, 2015

Call for participation: Summer School on Secure and Trustworthy Computing

As part of the PRACTICE project our group organizes together with TU Darmstadt a Summer School on Secure and Trustworthy Computing in Bucharest, Romania (23-27 September).  The school will cover a broad range of topics from hardware and cryptographic underpinnings to practical deployment in cloud scenarios.

The school will benefit from a large number of international experts and Bucharest is a fun (and cheap) city to visit.

More information on how to apply, deadlines, etc. can be found on the school webpage:

                                    http://summerschool.trust.cased.de