Monday, October 5, 2015

Study Group: Attacking PKCS#11 Tokens

This year, the Bristol Cryptography Group is doing something slightly different with our weekly study groups. Each member of the group has been tasked with talking about "the best paper [they've] read this year".

This means that the papers we're presenting won't all be particularly recent, but they will (hopefully) be interesting and exciting bits of work.

I kicked things off last week by talking about Attacking and Fixing PKCS#11 Tokens [1], which is from CCS '10. I started my PhD just over a year ago and this was one of the first papers I read. It has stuck with me as a model of good scientific writing: the exposition is very clear, the content is cleanly presented and one doesn't need to have a huge amount of prerequisite knowledge in order to follow the methodology. More importantly, even a non-specialist reader can understand the main result of the paper and appreciate its significance.

In brief: the authors used a formal, symbolic model to analyse a number of commercial security devices implementing a standard key management API called PKCS#11. Of the 17 tokens tested, "9 were vulnerable to attack, while the other 8 had severely restricted functionality." As I said, one doesn't need to have a PhD in Cryptography in order to get the point: real security devices were seriously flawed. In fact, they allowed you to obtain secret keys in plaintext.

A security token is designed to provide access to sensitive material in insecure environments. Your company might want you to access their (sensitive) system from your (potentially insecure) house, so they give you a USB stick (or similar) to bridge the gap. The token does the cryptography for you, and your personal laptop, which could be infested with malware, is never able to directly access the secret keys stored on the token. Instead, there's an API - an interface - between your machine and the token. Moreover, the token is (supposedly) tamper-resistant, so one cannot learn the key values by breaking open the hardware.

The most popular API for security tokens is PKCS#11, which is described in a vast document that was first published in the mid '90s and has been updated very little since then. The trouble is, while the document itself is public [2], the implementation of its contents within each security token is typically unknown to all but the token's manufacturer. So, while attacks on the standard had been known for quite some time prior to the publication of this paper [3, 4], it was unclear whether these attacks could actually be mounted on real devices. Perhaps the manufacturers had been smarter than those who had written the standard and had avoided the many pitfalls by adding extra safeguards?

Of course not. Compliance trumps security every time.

The authors of the paper built a tool that reverse-engineers a token to deduce its functionality, builds a formal model to represent this functionality, finds an attack in the model and then attempts to implement this attack on the real device. While the formal model is quite sophisticated, the attacks are not. Of the five presented in the paper, three are cases of obvious non-compliance with the standard (such as the token just handing you a secret key in plaintext if you ask for it!) and the other two are versions of the famous Wrap/Decrypt attack first found in 2003 [3]. The only tokens that were not vulnerable to these basic attacks were those that disabled key wrapping (encrypting a key under another key) altogether, but wrapping is an important feature of key management APIs.
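To give a flavour of the Wrap/Decrypt attack, here is a minimal sketch against a toy token whose interface loosely mimics PKCS#11; the method names and the XOR "cipher" are stand-ins of mine, not anything from the standard. The point is that each call the attacker makes is individually legitimate: it is the combination of wrap and decrypt permissions on the same key that hands over the sensitive key in plaintext.

```python
# Toy model of a PKCS#11-style token and the Wrap/Decrypt attack.
# The XOR "cipher" stands in for the token's real block cipher; the logic
# of the attack is identical.
import os

class ToyToken:
    def __init__(self):
        self._keys = {}            # handle -> (key bytes, attribute set)
        self._next = 0

    def generate_key(self, attrs):
        h = self._next; self._next += 1
        self._keys[h] = (os.urandom(16), set(attrs))
        return h                   # the application only ever sees handles

    def wrap(self, wrapping_handle, target_handle):
        # C_WrapKey-style call: export the target key encrypted under the wrapping key
        wk, attrs = self._keys[wrapping_handle]
        assert "wrap" in attrs
        tk, _ = self._keys[target_handle]
        return bytes(a ^ b for a, b in zip(tk, wk))

    def decrypt(self, handle, ciphertext):
        # C_Decrypt-style call: decrypt arbitrary data under the given key
        k, attrs = self._keys[handle]
        assert "decrypt" in attrs
        return bytes(a ^ b for a, b in zip(ciphertext, k))

token = ToyToken()
sensitive = token.generate_key({"sensitive"})       # the key we must never learn
evil = token.generate_key({"wrap", "decrypt"})      # attacker-chosen attributes

blob = token.wrap(evil, sensitive)       # legal: wrap the sensitive key...
stolen = token.decrypt(evil, blob)       # ...then decrypt the blob under the same key
assert stolen == token._keys[sensitive][0]   # sensitive key recovered in plaintext
```

Every operation above is permitted by the API; the flaw lies entirely in allowing one key to carry both the wrap and decrypt attributes.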

It is, in fact, possible to do key wrapping in a secure way, but it requires great care. There has been some research in this direction since 2010, including my own ongoing work, but I don't have space to elaborate on this here.

For me, the take-home message from this paper is that, if you don't prove the security of your security device, it's probably insecure. This insecurity makes for fun academic papers, but is also rather worrying when your device is widely used in the real world.

[1] - M. Bortolozzo, M. Centenaro, R. Focardi and G. Steel, "Attacking and Fixing PKCS#11 Security Tokens", ACM CCS 2010.
[2] - RSA Laboratories, "PKCS #11: Cryptographic Token Interface Standard".
[3] - J. Clulow, "On the Security of PKCS #11", CHES 2003.
[4] -

Wednesday, September 30, 2015

Challenges in Cloud Storage

This post will discuss parts of the talk by Angelo De Caro (IBM Zurich) on 'data storage-oriented cryptographic protocols,' given at the workshop on secure and trustworthy computing in Bucharest.

The cloud provides an attractive opportunity for users and enterprises (collectively 'clients') to outsource their storage. Many providers offer low-cost storage, with straightforward management and ubiquitous access (from multiple devices). However, users inherently lose direct control, meaning they have no guarantees about privacy or about the future availability of their files.

Deduplication is the process by which a provider saves itself storage space (and consequently money) by only storing one copy of each file, and using some method of tracking which users own that file. Deduplication is so attractive because there is a large amount of redundancy in many contexts, such as media (movies, music etc.), system files/software and email attachments. When storage providers want to deduplicate they have a number of choices:
- file-level or block-level (latter allows better dedup in systems where updates to large files are common)
- single-user/cross-user (latter is more desirable for providers)
- client-side/server-side

The final point introduces some interesting concerns. Server-side deduplication means that the client has to send the whole file to the server, incurring a high bandwidth cost. For client-side deduplication, Alice takes a hash of her file $F$ and sends the value $H(F)$ to the server: if the server hasn't seen this hash before it instructs Alice to upload the file; if it has, deduplication occurs and no upload is needed. This means that knowing $H(F)$ is enough to download the file (DropBox previously used a system where this was the case), so Alice can send all her friends $H(F)$ where $F$ is a movie and the provider ends up acting as a content distribution network. This is not so much a privacy problem as a problem for the provider, which doesn't want to take on that role and pay for the bandwidth. The method also creates a covert channel that reveals which files are stored on the server: in the so-called 'salary attack' discussed by Pinkas et al., an adversary Eve who knows what a company's payslips look like can learn an employee's salary by creating a large number of candidate payslips and starting the upload process for each one until the server reports that it already has the file.
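Here is a minimal sketch (the API is hypothetical, not any real provider's) of naive hash-based client-side deduplication, showing both the bandwidth saving and the two problems: the short identifier $H(F)$ doubles as a download credential, and the "have you seen this hash?" check leaks whether a guessed file already exists on the server.

```python
# Naive hash-based client-side deduplication (toy sketch, hypothetical API).
import hashlib

class Server:
    def __init__(self):
        self.files = {}                      # H(F) -> file contents

    def has(self, digest):
        return digest in self.files          # covert channel: leaks file existence

    def upload(self, digest, data):
        assert hashlib.sha256(data).hexdigest() == digest
        self.files[digest] = data

    def download(self, digest):
        return self.files[digest]            # possessing H(F) is enough to fetch F

def client_store(server, data):
    digest = hashlib.sha256(data).hexdigest()
    if not server.has(digest):               # only upload if the server hasn't seen it
        server.upload(digest, data)
    return digest                            # the short "proof of ownership"

server = Server()
token = client_store(server, b"...payslip: salary = 42000...")
# Anyone who is given (or can guess) the 32-byte digest gets the file back:
print(server.download(token))
```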

Proofs of Ownership
These issues mean we'd rather use proofs of ownership (PoWs) -- a way for Alice to convince the server that she actually holds the file -- which means we need to avoid treating short file identifiers as proof of possession. In 2011 Halevi et al. suggested the following framework for such a paradigm: in a preprocessing phase the server stores some short information per file (the file itself is located in some secondary storage), and in the proof phase (run only during file upload) there is some challenge/response mechanism. This procedure needs to be bandwidth-efficient, and computation (particularly for the client) should be cheap. The authors suggested a method using Merkle trees with special encodings, where the prover is 'challenged' on certain blocks of the file. In the preprocessing phase the server is sent the file, computes a Merkle tree and stores the root. To prove ownership, the server asks the client to present the sibling paths of $t$ random leaves; the client computes them and the server authenticates them against the stored root. This solution is bandwidth-efficient and space-efficient on the server side, but the client has to do quite a lot of computation. A client that knows a large proportion of the file will likely be able to 'cheat', so we need a way to 'spread' the entropy across the file and rule out scenarios like the salary attack. Using an erasure code is a good way of doing this (cheating probability $2^{-t}$), but such constructions are fairly computationally expensive. An alternative approach was taken by Di Pietro and Sorniotti (AsiaCCS 2012), which is considerably more efficient on the client side but worse on the server side, and whose challenge values have to be recomputed once they are exhausted.
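A rough sketch of the Merkle-tree proof of ownership, under my own simplifying assumptions (no special encoding of the file, a power-of-two number of blocks, plain SHA-256): the server stores only the root, challenges $t$ random leaves, and checks each returned block against its sibling path.

```python
# Merkle-tree proof of ownership (simplified sketch).
import hashlib, os, random

def H(x): return hashlib.sha256(x).digest()

def build_tree(blocks):
    level = [H(b) for b in blocks]
    tree = [level]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        tree.append(level)
    return tree                                 # tree[-1][0] is the root

def sibling_path(tree, index):
    path = []
    for level in tree[:-1]:
        sib = index ^ 1                         # sibling index at this level
        path.append((sib & 1, level[sib]))      # (sibling is the right child?, hash)
        index //= 2
    return path

def verify(root, block, index, path):
    h = H(block)
    for is_right, sib in path:
        h = H(h + sib) if is_right else H(sib + h)
    return h == root

blocks = [os.urandom(64) for _ in range(8)]     # a toy file split into 8 blocks
root = build_tree(blocks)[-1][0]                # the server stores only this

client_tree = build_tree(blocks)                # rebuilt by the client from its copy
for i in random.sample(range(len(blocks)), 3):  # challenge t = 3 random leaves
    assert verify(root, blocks[i], i, sibling_path(client_tree, i))
print("ownership proof accepted")
```

Note how the balance of work matches the description above: the server stores and checks very little, while the client has to rebuild the tree over the whole file.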

Proofs of Retrievability
Alice outsources a file and wants to know that it is retrievable, meaning not only that the server still holds the file but also that the server hasn't modified it. There is a trivial solution: just download the file on a regular basis. A better approach is to use a keyed hash function and store $H(k,F)$; to verify, Alice sends $k$ to the server S, asks S to compute $H(k,F)$ and compares the result with her stored value. This is storage-efficient for Alice, but S needs to read the entire file and each key allows only a single verification. Even better is to use 'sentinels' (short, random strings): Alice embeds sentinels at random positions in $F$, encrypts the result block-wise, sends this file $F'$ to the server and keeps the sentinels. To verify, Alice just asks for sentinel $s_i$ and checks that it is correct. Alice then doesn't need to store all of $F$ and can detect large erasures (which would remove more than one sentinel), but she has to store the sentinels and cannot detect small erasures. One can improve this by computing $\mathrm{MAC}(k,s_i)$ for each sentinel and appending these values to the file (the server doesn't know to which sentinels the MACs correspond), so that Alice only needs to store $k$: she no longer stores any of $F$ and can still detect large erasures, but small erasures remain undetectable. The 'small erasure' problem can be solved with an error-correcting code, although this makes the scheme more expensive.
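A toy sketch of the sentinel idea (names and parameters are mine): Alice plants random sentinels at random positions before outsourcing, keeps them, and later audits one at a time. The file is left unencrypted here for readability; in the real scheme it is encrypted block-wise so the server cannot tell sentinels from data.

```python
# Sentinel-based proof of retrievability (toy sketch).
import os, random

SENTINEL_LEN = 16

def prepare(file_blocks, k):
    # Choose k of the n + k output slots to hold sentinels.
    n = len(file_blocks)
    slots = set(random.sample(range(n + k), k))
    out, sentinels, blocks = [], {}, iter(file_blocks)
    for i in range(n + k):
        if i in slots:
            sentinels[i] = os.urandom(SENTINEL_LEN)
            out.append(sentinels[i])
        else:
            out.append(next(blocks))
    return out, sentinels            # 'out' goes to the server; Alice keeps 'sentinels'

def audit(server_blocks, sentinels):
    pos, expected = random.choice(list(sentinels.items()))
    return pos < len(server_blocks) and server_blocks[pos] == expected

file_blocks = [os.urandom(SENTINEL_LEN) for _ in range(64)]
stored, kept = prepare(file_blocks, k=8)
assert audit(stored, kept)                           # an honest server always passes
damaged = stored[: len(stored) // 2]                 # server silently lost half the file
print(sum(not audit(damaged, kept) for _ in range(20)), "failed audits out of 20")
```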

Both PoWs and PoRs are tangential to the goal of keeping files confidential from an untrusted server. If two clients upload the same file, encrypted under their own keys, then we'd expect the two ciphertexts to be distinct and (assuming a strong method of encryption) the server shouldn't be able to learn that they correspond to the same file, which rules out deduplication. Douceur et al. gave an initial solution to this problem: hash the file and use the value $H(F)$ as the encryption key; this was generalised by Bellare et al. (Eurocrypt 2013). Since encryption is deterministic we can only expect some sort of security if files have high entropy, and indeed this approach allows offline brute-force attacks: Eve is given a challenge ciphertext $C^*$, and if the message space is small she simply hashes each candidate file, creates the corresponding ciphertext and compares it with $C^*$ until she finds a match. The authors of the EC13 paper gave a solution to this problem: their system DupLESS uses an independent key server (KS), and a user engages in an oblivious PRF with the KS to obtain an encryption key (the key server then has to enforce a per-client rate-limiting strategy to stop brute-force attacks). At CCS next month Liu, Asokan and Pinkas will present a paper that removes the need for the key server by distributing its role among the clients using PAKE (Asokan gave a talk on this paper earlier in the workshop).
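A toy sketch of convergent encryption, the Douceur et al. idea in its simplest form: the key is $H(F)$, so identical files produce identical ciphertexts and can be deduplicated, but the same determinism enables the offline brute-force attack on low-entropy files. The SHA-256 counter-mode keystream below is just a stand-in for a real deterministic encryption scheme.

```python
# Convergent encryption and the offline brute-force attack (toy sketch).
import hashlib

def keystream(key, length):
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def convergent_encrypt(file_bytes):
    key = hashlib.sha256(file_bytes).digest()       # key derived from the file itself
    ct = bytes(a ^ b for a, b in zip(file_bytes, keystream(key, len(file_bytes))))
    return key, ct

# Two clients encrypt the same file: the ciphertexts match, so the server can
# deduplicate without ever seeing the plaintext or the key.
_, c1 = convergent_encrypt(b"same movie file")
_, c2 = convergent_encrypt(b"same movie file")
assert c1 == c2

# Offline brute force over a small message space (think payslips):
challenge = convergent_encrypt(b"salary = 42000")[1]
guess = next(m for m in (b"salary = %d" % s for s in range(100000))
             if convergent_encrypt(m)[1] == challenge)
print(guess)                                        # b'salary = 42000'
```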

This presentation complemented Asokan's talk, Florian Kerschbaum's talk about computing on encrypted data, and the talk given by Marc Lacoste from Orange Labs, who discussed the goals of the Super Cloud project and the security challenges involved in the development of 5G communication standards and the IoT.

Monday, September 28, 2015

52 Things: Number 49: Describe the basic ideas behind IPSec and TLS.

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. This week we discuss the basic ideas behind IPSec and TLS.

Internet Protocol Security (IPsec) and Transport Layer Security (TLS) both aim to create a secure communication channel between two parties over an insecure network. In general, both use some mechanism to establish a private session key (either pre-shared or via a key negotiation protocol) and use symmetric key cryptography for the bulk of the communication. There are further details concerning authentication, but I'll skip over those. Although the two have similar goals, they differ considerably in their implementation.

IPSec sits at the network layer of the OSI model and aims to provide integrity, authenticity and confidentiality between two end points. Because it sits at the network layer, it blindly encrypts, MACs and packages up the data from the layers above before sending it down the line. This effectively creates a virtual network link between the two end points without the need to ensure that the end-point application has secured the data appropriately. It is often deployed for enterprise VPN solutions, as it is a fast way to provide remote access to an enterprise network. The downside, however, is that once a connection is up, it is tricky to restrict which applications can use it.

TLS, on the other hand, establishes a secure connection at the application layer of the OSI model. We see TLS heavily used for securing protocols such as HTTPS, STARTTLS etc., and as a consequence each connection/application negotiates and sets up its secure channel independently. From a security perspective this is quite attractive, as a single compromised channel *should* have no bearing on the remaining channels. Whilst TLS can be viewed as a more flexible approach, it does incur some overhead compared with IPSec when there are a large number of connections between two nodes.
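As a small illustration of "TLS at the application layer", here is how a Python client sets up its own secure channel with the standard ssl module; every application (indeed every connection) repeats this negotiation independently, which is exactly the property discussed above. The host name is just an example.

```python
# Application-level TLS: the application wraps its own socket and negotiates
# its own secure channel, independently of anything else on the machine.
import socket
import ssl

context = ssl.create_default_context()          # certificate and hostname checks enabled

with socket.create_connection(("example.org", 443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname="example.org") as tls_sock:
        print("negotiated:", tls_sock.version(), tls_sock.cipher())
        tls_sock.sendall(b"HEAD / HTTP/1.1\r\nHost: example.org\r\nConnection: close\r\n\r\n")
        print(tls_sock.recv(200))
```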

It's easy to get into very fine details but I think that should cover the 'basic' ideas of the two.

Saturday, September 26, 2015

52 Things: Number 48: What is the purpose and use of a TPM?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year.

Before examining the point of this question (namely what the purpose and use of a TPM is), it's worth trying to understand the problem a TPM is designed to overcome. The problem is really one of trust. Trusting what? Well, primarily the memory and software running on a computer. These can be directly accessed by the operating system, so secret information (such as cryptographic keys) can be accessed by an attacker who has access to the machine at the operating-system level. If keys are stored directly in memory and accessed by software, it can be fairly trivial for such an attacker to read off the memory locations where the keys are stored and thereby compromise security.

One way around this problem is to make sure that keys are never stored directly in the computer's memory, where software can read them. But the keys are needed by secure applications, so at some point they must be made available in a form the software can use; how can both be true? One answer is to protect the secret keys held in memory by wrapping them under a key that the software never sees. A separate piece of hardware, for instance, can have a key burned into it and be able to perform certain cryptographic operations with that key. The software can then ask this hardware to do things with the burned-in key -- such as wrapping keys before they are stored in memory -- without ever having access to the burned-in key itself.

This is essentially what a TPM does. A TPM has an RSA key pair called the Storage Root Key (SRK). The private part of this key is kept secret from everything and everyone. Using this key, other keys (the ones software uses) can be wrapped (often called "binding") under the SRK, protecting them from disclosure. In addition to simply wrapping keys, a TPM can also wrap keys and tie them to certain platform measurements; such a key can only be unwrapped when those platform measurements have the same values they had when the key was created, a process known as "sealing". TPMs can also be used for cryptographic key generation and for other cryptographic tasks, one of which is remote attestation: the TPM creates a hash summary of the hardware and software configuration, allowing a third party to verify that the software has not been changed.
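A rough sketch of the binding idea follows; this is not the TPM's actual API, just ordinary RSA-OAEP (via the Python cryptography library) standing in for it. A key pair whose private half never leaves the device wraps application keys, so only the wrapped blob ever sits in the computer's memory or on disk.

```python
# Binding, sketched with RSA-OAEP standing in for the TPM's Storage Root Key.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Stand-in for the SRK: generated and kept inside the "TPM".
srk_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
srk_public = srk_private.public_key()

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

app_key = os.urandom(32)                        # a symmetric key some application needs

wrapped = srk_public.encrypt(app_key, OAEP)     # "bind": only the TPM can recover it
# 'wrapped' can safely sit in ordinary memory or on disk...

unwrapped = srk_private.decrypt(wrapped, OAEP)  # ...and is only ever unwrapped by the TPM
assert unwrapped == app_key
```

Sealing adds one extra condition to the unwrap step: the TPM refuses to decrypt unless the current platform measurements match those recorded when the key was wrapped.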

The real point to understand here is that by pushing security down to the hardware level and giving it over to a separate piece of hardware that has its own firmware and circuits which can't be altered from the outside, the system is far less exposed to software vulnerabilities and is therefore more trustworthy.

So what is the purpose of a TPM? To overcome the problem of trusting (or rather not trusting) software to be completely reliable.

What is the use of a TPM? We mentioned a number of uses. The first was binding, which essentially wraps a key using the private key of the SRK. The second was sealing, which additionally ties the wrapped key to particular platform measurements. Thirdly, we looked at remote attestation, and noted that TPMs can also be used for other cryptographic functions such as key generation.

Distance bounding protocols and physical layer security

Yesterday at the Summer School on Secure and Trustworthy Computing, Srdjan Capkun gave an interesting talk on physical layer security and distance bounding protocols. I'll try to give some insight into why I found this talk particularly interesting.

Bounding the distance between two devices is becoming an increasingly important problem: think of your contactless card or your car key. Crypto alone cannot answer this question, because proximity is a property of the underlying physical space, not of abstract data or identities. To address it, distance bounding protocols use time-of-flight measurements (mixed with some more classical crypto). The intuition is that if you know a signal can travel from A to B in time less than $t$, then A and B are at distance at most $t \cdot c$, where $c$ is the speed of light. The talk wasn't so much about how to study such protocols formally (see [BCSS11] for an example of such an analysis), but about the physical properties the underlying systems need in order for these protocols to actually work as intended.

The main assumption is that an adversary cannot transmit information faster than the speed of light. This looks like quite a reasonable assumption, as the theory of relativity seems to be a hard problem to break [1]. However, as the talk made quite clear, the situation is more complex. Outputting a bit on a channel is not an instantaneous operation, and time matters a lot here: light travels about 30cm in a nanosecond (so a nanosecond of round-trip timing error corresponds to roughly 15cm of distance), yet transmitting a bit usually takes microseconds or milliseconds. This can be exploited in practical attacks where the adversary guesses the bit you're trying to send based on only the first part of the transmission and answers early. As a consequence, it becomes essential for the transmission time of a bit in this setting to be as short as possible. The technical details were somewhat lost on me but, according to the speaker, it is possible to reduce this time to a few nanoseconds. This brings the distance uncertainty due to time of flight down to less than a metre (under the assumption that faster-than-light travel is not possible), which is pretty good.

The second unusual problem tackled by the talk is the following: assume the machine that wants to prove its proximity to you is adversarial; can we still bound the distance? Distance measurement between A and B works by A sending a signal to B and B sending back a processed version of that signal. A then computes the distance as $(t_{\mathrm{total}} - t_{\mathrm{processing}}) \cdot c / 2$. An adversarial machine cannot cheat on the time-of-flight part, but it can cheat on the processing time. So in the end, not only do we need short bits, we also need extremely short processing times. This means that for the approach to become practical, we need to build systems that receive, process and send a signal within a few nanoseconds. The talk gave an example of such an extremely efficient, completely analog computing/transmitting node. Interestingly, this also means that the kind of computation you can do in this framework is quite limited, which makes the problem extra interesting.
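The distance computation itself is trivial; a small numerical sketch (illustrative figures only) shows the bound and why a prover that responds earlier than its agreed processing time appears closer than it really is, which is exactly why the processing time has to be pinned down to nanoseconds.

```python
# Verifier-side distance bound from round-trip timing (illustrative numbers).
C = 299_792_458.0                     # speed of light, m/s

def distance_bound(t_total_ns, t_processing_ns):
    """Upper bound on the prover's distance from the measured round trip."""
    time_of_flight = (t_total_ns - t_processing_ns) * 1e-9
    return time_of_flight * C / 2

round_trip_30m_ns = 2 * 30 / C * 1e9  # ~200 ns of pure time of flight for 30 m

# Honest prover 30 m away, agreed processing time 5 ns:
print(distance_bound(round_trip_30m_ns + 5, 5))      # ~30 m

# A slow prover (100 ns of real processing, 5 ns claimed) only looks *further*
# away, which is the safe direction for the verifier:
print(distance_bound(round_trip_30m_ns + 100, 5))    # ~44 m

# The dangerous case: the agreed processing time is 100 ns but the prover
# answers after only 5 ns, so it appears roughly 14 m closer than it is.
print(distance_bound(round_trip_30m_ns + 5, 100))    # ~16 m
```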

The talk concluded with a fun use case: secure positioning. In a world where drones and self-driving cars are becoming more and more common, the ability for someone to make your system believe it is in the wrong location may well become a real problem. GPS positioning is far from secure, and there is little hope that any similar non-interactive system will ever yield the integrity guarantees we need. This nicely justifies the need for secure distance bounding protocols with the appropriate architecture.

All in all, after this talk my impression is that the interplay between the physical layer and the protocol layer may very well be the key to future developments in secure distance bounding.

[1] I guess that if you break it, distance bounding will anyway be the least of your problems.

Thursday, September 24, 2015

Intel SGX: The Death of MPC?

An earlier version of this post contained errors about the signing of binaries. Thanks to Guillaume Scerri for pointing them out.

On the first day of the summer school on secure and trustworthy computing, considerable time was dedicated to speakers from Intel talking about security-related CPU features. SGX is one such upcoming feature. Even though the introduction seemed to be directed more towards programmers than towards security researchers or cryptographers, I was left with the following insights.

The core idea of SGX is to obscure the memory of an application from the operating system. Motivated by possible flaws in the OS, the applications themselves are separated from the OS. This is achieved by encrypting the application's memory using a key generated by the CPU, which never leaves the CPU. It is unclear to me whether SGX will hide the memory access patterns of the application; however, this can probably be achieved using oblivious RAM within the application. Furthermore, I/O will still be controlled by the OS, which leaves open the possibility of keylogging and the like.

In order to convince a remote party that the binary is indeed running on SGX, processors will contain a private key to sign the binary being executed, with the corresponding public key being provided by the manufacturer. It is not clear whether the private key will be global or per CPU. A private key in the CPU would make it easy to also have the binaries encrypted, but it seems that this is not planned at the moment.

With SGX, Intel obviously targets cloud applications in an attempt to restrict the cloud provider's access to data in the cloud. Since there are no processors with SGX yet, it is hard to estimate the cost and the feasibility for SGX on mobile processors. However, if SGX eventually comes to mobile processors, rooting or jail-breaking becomes somewhat useless because the power of the OS will be restricted. This could be seen as another step of an evolution that went from proprietary software computing locally on personal data to personal data being held in the cloud. With SGX available on all processors, an app provider could exercise almost complete control over all cloud and mobile instances of the app, given the trustworthiness of the processor manufacturer. As a result, apps might effectively become black boxes with all interfaces defined by the app provider.

The second talk on SGX suggested its use as a replacement for multiparty computation as follows: The parties agree on a piece of software executing the computation. The software is then run using SGX and generates a public key, which the parties use to encrypt their inputs. After running the computation, the software only outputs the previously agreed results to the parties. One can argue that the security should be similar to the security of MPC because one has to trust the processor to execute the local computation correctly and without leakage to the adversary. Therefore, one might as well trust the processor with SGX to execute the software correctly while preserving confidentiality by encrypting the memory. However, this scheme introduces the CPU manufacturer as a third party possibly knowing the private key, which is used to confirm that the software runs on SGX. The manufacturer is inherently trusted not to collude with any party. More concisely, one might ask: You trust your CPU. Do you also trust the manufacturer?
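A minimal sketch of that flow, with all names hypothetical and an ordinary in-process object standing in for the enclave (real SGX would additionally attest that exactly this code is running and that the key pair was generated inside it): the parties encrypt their inputs under the enclave's public key and learn only the agreed output.

```python
# SGX-as-MPC flow, sketched with a plain object standing in for the enclave.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

class Enclave:
    """Stands in for the agreed piece of software running under SGX."""
    def __init__(self):
        self._sk = rsa.generate_private_key(public_exponent=65537, key_size=2048)

    def public_key(self):
        # In reality this key would be covered by the SGX attestation/signature.
        return self._sk.public_key()

    def run(self, encrypted_inputs):
        # Decrypt inside the "enclave", compute, and release only the agreed output.
        values = [int(self._sk.decrypt(c, OAEP).decode()) for c in encrypted_inputs]
        return sum(values)                      # say the parties agreed on a sum

enclave = Enclave()
pk = enclave.public_key()

# Each party encrypts its private input under the enclave's public key.
ciphertexts = [pk.encrypt(str(x).encode(), OAEP) for x in (13, 29, 8)]
print(enclave.run(ciphertexts))                 # 50: only the agreed result is revealed
```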

Tuesday, September 15, 2015

CHES 2015: Evaluation and Improvement of Generic-Emulating DPA Attacks

On Tuesday afternoon at the ‘Palais Grand Large’ (or, as my long-unrehearsed school-level French childishly insists on translating it, the ‘Big Fat Palace’), Weijia Wang presented ‘Evaluation and Improvement of Generic-Emulating DPA Attacks’. This paper (joint work with Yu Yu, Junrong Liu, Zheng Guo, François-Xavier Standaert, Dawu Gu, Sen Xu and Rong Fu) is of particular interest to me as it builds on work done in part at Bristol and presented at CT-RSA 2014 ([WOS2014]; ePrint).

‘Generic DPA’ attacks -- methods which succeed against an arbitrary physical implementation without the need to meaningfully characterise the power model -- have been hotly pursued in the side-channel literature. However, it has been shown ([WOS2014]) that all such proposals rely on noninjectivity of the target function (among other properties) to succeed. Attacks against injective targets necessarily fail unless provided with some compressing mapping corresponding meaningfully to something about the true leakage of the specific device. 

We introduced the notion of ‘generic-emulating DPA’ to enable key recovery attacks against injective functions such as the AES S-Box requiring only some minimal intuition about the form of the leakage. The instantiation we presented used stepwise linear regression to find the key hypothesis producing the most ‘simply explained’ model for the measured leakages (see [WOS2014] for details).

Wang et al. experiment further with our proof-of-concept example and show that the stepwise approach is unstable in small samples. They suggest two alternatives -- ridge-based and lasso-based regression -- which perform better in practice. Against low degree leakage functions, the best difference-of-means (DoM) attacks outperform all three tested generic-emulating methods. However, against complex leakages (for example, a polynomial in eight target bits comprised only of terms of degree 5 and above) their proposals demonstrate an advantage.
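For readers who like to see the recipe concretely, here is a rough sketch (simplified, and not the authors' code) of the generic-emulating idea: for each key guess, regress the measured leakage onto the bits of the hypothesised S-box output and rank guesses by how well the fitted model explains the traces. Ridge regression plays the role of the ridge-based variant (swapping in sklearn's Lasso gives the lasso-based one), and the simulated leakage here is deliberately simple.

```python
# Generic-emulating DPA recipe, heavily simplified.
import numpy as np
from sklearn.linear_model import Ridge

SBOX = np.random.RandomState(0).permutation(256)    # stand-in for the AES S-box

def bits(z):
    return np.unpackbits(np.array([z], dtype=np.uint8)).astype(float)  # 8 bit features

def rank_keys(plaintexts, traces):
    scores = []
    for k in range(256):
        X = np.array([bits(SBOX[p ^ k]) for p in plaintexts])
        model = Ridge(alpha=1.0).fit(X, traces)
        scores.append(model.score(X, traces))        # R^2 of the fitted model
    return np.argsort(scores)[::-1]                  # best-explained key guess first

# Simulated device: leakage is a weighted sum of S-box output bits plus noise.
rng = np.random.RandomState(1)
true_key = 0x3A
plaintexts = rng.randint(0, 256, size=2000)
weights = rng.uniform(0.5, 1.5, size=8)
traces = np.array([bits(SBOX[p ^ true_key]) @ weights for p in plaintexts])
traces += rng.normal(0, 1.0, size=traces.shape)

print(hex(rank_keys(plaintexts, traces)[0]))         # ideally 0x3a
```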

They also incorporate cross-validation into their strategies, which is shown to further improve outcomes so that even against typical smartcard leakages the ridge-based and lasso-based approaches begin to rival the best DoM attacks. The question and answer session drew attention to the computational cost of such methods relative to (cheap) DoM, but the authors were keen to stress that this was not prohibitive. Whilst no thoroughly convincing application scenario has yet been discovered for generic-emulating DPA, it is interesting to see further progress from our basic suggestion into more practical territory.

[WOS2014]: The Myth of Generic DPA … and the Magic of Learning, Carolyn Whitnall, Elisabeth Oswald, François-Xavier Standaert, CT-RSA 2014, vol 8366 LNCS pp 183-205.