Wednesday, February 27, 2013

RSA Conference 2013 - Cryptographer's Panel

The RSA Conference is being held this week in San Francisco. This morning there were several keynote talks as well as the Cryptographer's Panel. This year's panel consisted of Whit Diffie, Ron Rivest, Adi Shamir and Dan Boneh, and was chaired by Ari Juels.

Ari began the discussion with a retrospective question: at the advent of modern cryptography there were restrictions on the export of cryptography, whereas today there is a great deal more openness in the crypto community. All panelists were fully in support of this shift. Dan Boneh in particular highlighted the importance of education and how this can only be good for ensuring our systems are secure. The online crypto course offered by Stanford has had over 150,000 students worldwide since it began. Students have even been able to participate from some of the remotest parts of the world.

The second topic of discussion was what each panelist felt was the most significant attack seen in the past year. Here the focus moved to Certification Authorities and the recent problems with these. As a result of this discussion, Ari Juels asked if the importance of cryptography was diminishing, to which Adi Shamir agreed. He gave the example that even the most isolated systems have now suffered from attacks. Shamir felt that we need to rethink how we protect ourselves and assume threats are already within our systems. Ron Rivest stressed that he felt that crypto was still essential. As a conclusion Dan Boneh gave what he described as the "killer argument". Here he referred to work on the security analysis of medical equipment such as pacemakers, where cryptography and security really do become something of a life or death matter.

The panelists were then asked what they are working on now, and whether it is crypto. Whit Diffie stated that he always tries to work on as much crypto as he can. Ron Rivest's two main areas of interest are currently secure voting and creating more robust and flexible PKIs. Adi Shamir's focus is currently on the analysis of the SHA-3 algorithm, KECCAK. Up until one year ago the best attack was on only 2 rounds of KECCAK. Soon, in joint work with Orr Dunkelman and Itai Dinur, Shamir will publish an attack on 5 rounds, but there is still a significant way to go to a full attack since KECCAK has 24 rounds in total. Shamir stated that he felt KECCAK was a good choice for SHA-3 as it has a solid design and is reasonably fast. Finally, Dan Boneh described two of his most recent works, looking at leakage from smartphone accelerometers and an efficient scheme for key rotation within encrypted cloud storage.

The final topic of discussion was the importance of post-quantum cryptography, to which Whit Diffie's immediate riposte was "I think we should wait for the Physicist's Panel". The general feeling of the panel was that we don't really know what is possible yet in terms of our ability to build a quantum computer. Dan Boneh feels that to be safe it is only logical to start implementing quantum-resistant schemes now. With the update cycle of algorithms taking 10-20 years, this would mean that quantum-resistant algorithms would be ready to use if a quantum computer were built. Adi Shamir argued that, unlike traditional attacks, quantum computers will appear very gradually, and that we should be just as worried about new kinds of attack. In response Dan Boneh said this gave further reason to have more diversity in our algorithm choices, highlighting that there are still only two families of public-key algorithms: RSA-based and Diffie-Hellman-based. The crypto community has developed many more families but these are not yet implemented. Whit Diffie urged caution in switching to these new families, since if one fails this would make people wary of the others. Finally, Dan Boneh described the recent advances in solving the discrete log problem by Antoine Joux (see our recent blog), which give further reason to start implementing other families of algorithms as a fallback.

Friday, February 22, 2013

Discrete Logarithms

Two quite surprising results have appeared in the last week on discrete logarithms in finite fields of small characteristic. Before discussing the results, let me first set the mathematical scene.

A finite field is given by a prime number p, called the characteristic, and a positive integer n. The finite field is then the set of all polynomials with coefficients in the set {0,...,p-1} of degree strictly less than n. We can add elements in the set by just adding them as polynomials and reducing the coefficients of the resulting polynomial modulo p, so they lie back in the range {0,...,p-1}. Multiplication is a bit more tricky: we need to find an irreducible polynomial f of degree n modulo p; then multiplication is given by multiplication modulo f and p. It turns out that the choice of f does not really matter, so we can think of the field as being given by p and n.
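To make the arithmetic above concrete, here is a minimal Python sketch. The function names, the coefficient-list representation (lowest degree first) and the example field with p = 2, n = 3 and f = x^3 + x + 1 are all choices made for illustration; they are not taken from any particular library.

```python
def poly_add(a, b, p):
    """Add two field elements given as length-n coefficient lists (lowest degree first)."""
    return [(x + y) % p for x, y in zip(a, b)]


def poly_mul(a, b, f, p):
    """Multiply a and b modulo the monic irreducible polynomial f and the prime p.

    f is a coefficient list of length n + 1 (so f[n] == 1); a and b have length n.
    """
    n = len(f) - 1
    # Schoolbook polynomial product with coefficients reduced modulo p.
    prod = [0] * (2 * n - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    # Reduce modulo f: cancel each high-degree term, from the top down.
    for d in range(len(prod) - 1, n - 1, -1):
        c = prod[d]
        if c:
            for k in range(n + 1):
                prod[d - n + k] = (prod[d - n + k] - c * f[k]) % p
    return prod[:n]


# Illustrative field with 2^3 = 8 elements: p = 2, f = x^3 + x + 1.
P = 2
F = [1, 1, 0, 1]
x_squared = [0, 0, 1]
x_to_4 = poly_mul(x_squared, x_squared, F, P)  # x^4 reduced modulo f and p
```

In this small field x^4 reduces to x^2 + x, because x^3 = x + 1 modulo f.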

So that's finite fields in a paragraph; now what about discrete logarithms? To define a discrete logarithm one picks an element g in the field, then picks a secret random integer x and computes $h = g^x$ in the field. The discrete logarithm problem is: given g and h, find x.
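For a feel of the problem, here is a small sketch that solves it in the simplest case n = 1, i.e. the integers modulo a prime p. It uses the generic baby-step giant-step method, which is not one of the sieve algorithms discussed in this post and takes roughly sqrt(p) steps, so it is only usable for toy sizes. The function name and the example values (p = 65537, g = 3) are chosen purely for illustration.

```python
from math import isqrt


def discrete_log(g, h, p):
    """Return some x with pow(g, x, p) == h, or None if there is none.

    Baby-step giant-step in the multiplicative group modulo a prime p.
    Assumes p is prime and g is not divisible by p.
    """
    m = isqrt(p - 1) + 1
    # Baby steps: remember g^j for 0 <= j < m (smallest j wins).
    baby = {}
    cur = 1
    for j in range(m):
        baby.setdefault(cur, j)
        cur = cur * g % p
    # g^(-m) via Fermat's little theorem, valid because p is prime.
    factor = pow(pow(g, m, p), p - 2, p)
    # Giant steps: look for h * g^(-i*m) among the baby steps.
    gamma = h % p
    for i in range(m):
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * factor % p
    return None


p, g = 65537, 3
secret_x = 12345
h = pow(g, secret_x, p)
recovered = discrete_log(g, h, p)  # satisfies pow(g, recovered, p) == h
```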

So how hard is this problem? For many years the best known algorithm has been either the Number Field Sieve (for large p and small n) or the Function Field Sieve (for small p and large n). If we let $q = p^n$ denote the size of the field, then the important measure of difficulty is the so-called L function, which is roughly
$$L(a) = O(\exp((\log q)^a (\log \log q)^{1-a}))$$
To get a feel for L, we note that L(1) is the run time of a fully exponential algorithm and L(0) is the run time of a polynomial time algorithm. When public key cryptography started in the mid 1970's the run time of both the best factoring and the best discrete logarithm algorithm was essentially L(1), although this was quickly reduced to L(1/2) for both problems with the advent of the quadratic sieve. For factoring we replace q by N (the number to be factored) in the expression above.
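To see how much the argument of L matters, the following sketch evaluates the exponent of the L function in bits for a field size q of a given bit length. It ignores the hidden constants entirely (the scaling constant c = 1 is an assumption made here; real algorithms have their own constants), so the numbers are only a rough guide to the shape of the function, not key-size recommendations.

```python
import math


def log2_L(bits, a, c=1.0):
    """log2 of exp(c * (ln q)^a * (ln ln q)^(1 - a)) for a q of the given bit length.

    c stands in for the hidden constant and is set to 1 purely for illustration.
    """
    ln_q = bits * math.log(2)
    return c * ln_q ** a * math.log(ln_q) ** (1 - a) / math.log(2)


if __name__ == "__main__":
    for a in (1, 1 / 2, 1 / 3, 1 / 4, 0):
        print(f"a = {a:.3f}: about 2^{log2_L(1024, a):.1f} operations for a 1024-bit q")
```

With a = 1 the exponent is just the bit length of q (fully exponential), while with a = 0 it shrinks to the logarithm of log q (polynomial time).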

The next big advance was the invention of the number field sieve, which showed that factoring and discrete logarithms (for large p) could be solved in time L(1/3). Around the same time the function field sieve was invented, which could solve discrete logarithms for small characteristic finite fields in time L(1/3) as well. Notice how factoring and discrete logarithms are linked? An advance in one field usually leads to an advance in the other, and the two complexities are roughly always "in step".

So for the last 20 years we have picked key sizes for factoring and discrete logarithm systems based on the fact that the fastest algorithm runs in time L(1/3). Although small characteristic finite fields were always considered easier, the extra speed essentially came from the hidden constants rather than the argument of the L function.

But in the last week we have seen two results. The first announcement, by Göloğlu, Granger (an ex-PhD student from Bristol), McGuire and Zumbrägel, has demonstrated that one can find discrete logarithms in the finite field of $2^{1971}$ elements in practice. Note that this is over twice the bit-length of the best record for factoring or discrete logarithms in large characteristic fields. The second announcement was by Antoine Joux, of a (related) method to solve discrete logarithms in small characteristic fields which runs in time L(1/4).

So apart from being two amazing breakthroughs in implementation and analysis of algorithms, what does this really mean for cryptography?

Firstly, the discrete logarithm problem in small characteristic finite fields should not be used for cryptography. We kind of knew this already. So nothing surprising there.

Secondly, the two announcements focus on the case of p=2, but the results should pass over to p=3 quite easily. This should signal the death knell for pairing based cryptography on Type-1 curves in small characteristic. Whilst such curves have not been used in commercial systems for a long time, they have been the subject of many academic papers. Hopefully, the above announcements will force the academics working on this topic to move to more useful areas of research.

Thirdly, and this is pure speculation: history has shown that an advance in discrete logarithms is often followed by an advance in factoring. There is no reason to expect such an advance to follow this time; but then again there is no reason not to expect it either. If we can find an algorithm for factoring which runs in time L(1/4) then this will mean that many of the systems we rely on for security will need to be upgraded. This upgrading, to systems which do not depend on factoring, is already being conducted by many organizations.

If we cannot rely on systems based on factoring large numbers, what can we rely on? It turns out that we have known the answer for around 20 years already: the answer lies in elliptic curves. If you want to find out more about elliptic curves then see the two classic textbooks in the area. Elliptic curves are already deployed in a number of places: in your phone, in the Playstation and XBox, in your browser and many other places.

Tuesday, February 12, 2013

Steganography detection schemes

This week's study group on steganography detection schemes was given by Panos Andriotis and Georgios Oikonomou. While in the physical world it is fairly easy to hide messages beneath or within other, less compromising messages (the ancient Greeks were fairly creative in finding such hiding places), in the digital world it is not quite that simple, as you can't simply put one layer of bits onto another. One common trick is to manipulate image data, for example in JPEG files: these encode colours in YCbCr format, where Y represents the luma component, Cb is the chroma difference for blue and Cr is the chroma difference for red. However, the human eye is better at detecting small differences in the luma component than in the two chroma components, so steganography libraries such as JPHS, OutGuess and VSL use these components to embed data into images.

At SPIE'07, Fu, Shi and Su presented in their paper "A Generalized Benford's Law for JPEG Coefficients and Its Applications in Image Forensics" [FSS07] a method to detect steganography in black and white JPEG images based on Benford's Law. For many sets of numbers gathered in real life, for example the height of buildings, Benford's law gives the probability of the value of the first digit. For example, the first digit is, for many of those number sets, far more likely to be a 1 than a 2. This can be used as a statistical check for applications as different as accounting fraud detection, election manipulation detection and - thanks to Fu, Shi and Su - steganography detection as well.
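As a small illustration of the basic (non-generalized) law: Benford's law says a first digit d occurs with probability log10(1 + 1/d). The sketch below compares that with the observed first digits of the powers of two, a sequence well known to follow the law closely. The helper names and the choice of data set are made for illustration only; this is not the detection method of [FSS07].

```python
import math
from collections import Counter


def benford(d):
    """Benford's law: probability that the first significant digit equals d (1..9)."""
    return math.log10(1 + 1 / d)


def first_digit_freqs(values):
    """Observed relative frequency of each first digit among non-zero integers."""
    digits = [int(str(abs(v))[0]) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


observed = first_digit_freqs(2 ** k for k in range(1000))

if __name__ == "__main__":
    for d in range(1, 10):
        print(f"digit {d}: expected {benford(d):.3f}, observed {observed[d]:.3f}")
```

A detector along these lines would flag a data set whose observed frequencies sit far from the expected ones.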

But first a little more on the JPEG compression algorithm: After the original RGB data is (loss-free) translated into YCbCr data, a discrete cosine transform (DCT, loss-free) is applied to each 8-by-8 pixel block, resulting in DCT coefficients. The DCT coefficients are then quantized (this is lossy, i.e. irreversible) before a further loss-free Huffman encoding is applied. In [FSS07] it was shown that Benford's law applies to the DCT coefficients of normal black-and-white pictures as well, while pictures that had steganography applied to them follow different probability distributions. For the quantized DCT coefficients, however, a generalized version of Benford's law was needed and, depending on the quality factor of the quantization, suitable parameters for the generalized law are given in [FSS07].
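The difference between the loss-free DCT step and the lossy quantization step can be seen on a single 8-by-8 block. The sketch below is a simplification under stated assumptions: it uses an orthonormal DCT and one uniform quantization step (STEP = 16, chosen arbitrarily), whereas real JPEG encoders use a per-frequency quantization table derived from the quality factor. The test block is synthetic.

```python
import math

N = 8       # JPEG block size
STEP = 16   # single uniform quantization step (an assumption, see above)

# Orthonormal DCT-II basis matrix, C[u][x].
C = [[math.sqrt((1 if u == 0 else 2) / N) * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2D DCT of an 8x8 block: C * block * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2D DCT: C^T * coeffs * C (C is orthogonal)."""
    return matmul(matmul(transpose(C), coeffs), C)


def quantize(coeffs, step):
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    return [[v * step for v in row] for row in levels]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Synthetic test block with a smooth gradient.
block = [[7 * x + 13 * y + x * y for x in range(N)] for y in range(N)]
coeffs = dct2(block)
exact = idct2(coeffs)                                     # DCT alone is reversible
lossy = idct2(dequantize(quantize(coeffs, STEP), STEP))   # quantization is not
```

Up to floating-point rounding, `exact` matches `block`, while `lossy` does not: the rounding inside `quantize` discards information that cannot be recovered.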

Panos, Georgios and Theo were now able to show in a recently accepted paper of theirs that this also holds for colour JPEGs, and developed tools which can very efficiently detect JPEG files that potentially contain steganography, with high accuracy (i.e. a low rate of false negatives) and at high throughput; this is useful to limit the number of files that more precise but considerably slower machine-learning based tests have to analyse. (OutGuess, VSL and JPHS were used to apply steganography to the images and the success rates vary; JPHS fared better than the other two.)

P.S. This was actually the study group of January 29th but I must have saved the blog entry instead of publishing it. Sorry for the delay...

Thursday, February 7, 2013

Study Group: Secret Sharing and Access Structures

The study group presented by Enrique and Emmanuela concerned two papers about secret sharing. For the uninitiated, secret sharing schemes ensure that a secret value is distributed into shares among a set of n users, and only authorised sets of users can reconstruct the secret from their shares, whereas any forbidden set cannot obtain any information at all about the secret.

The first paper, "Linear threshold multisecret sharing schemes" by Farràs et al. (ICITS '09), investigates linear threshold multisecret sharing schemes and multithreshold access structures. In a multisecret sharing scheme, many secret values are distributed among a set of users, and there may be a different access structure for each secret. In a multithreshold access structure, for every subset P of k users, there is a secret key that can only be computed when at least t of them combine their secret information. Qualified sets for the secret associated with P are those with at least t users in P, while every set with at most w users, fewer than t of them in P, is a forbidden set. Schemes are referred to as (w,t,k,n) schemes to reflect these definitions. The case k=n corresponds to threshold secret sharing schemes introduced independently by Shamir and Blakley.
Using linear algebraic techniques, the paper gives a new general lower bound for the randomness of threshold multisecret sharing schemes; please refer to the paper for full details of the bounds and proofs. Building on previous work, the authors present a linear construction of optimal threshold multisecret sharing schemes for the case t=2, k=3 and 1 ≤ w ≤ n-2. They also present a general construction of multithreshold schemes for all possible values of the system parameters (w,t,k,n), which in general are not optimal.
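For readers who have not seen the single-secret threshold case mentioned above, here is a minimal sketch of Shamir's t-out-of-n scheme. It is not one of the multisecret constructions from the paper. The function names and the choice of field (integers modulo the Mersenne prime 2^61 - 1) are made for illustration.

```python
import secrets

PRIME = 2 ** 61 - 1  # field size; any prime larger than the secret and n would do


def make_shares(secret, t, n):
    """Split secret into n shares such that any t of them reconstruct it."""
    # Random polynomial of degree t - 1 with constant term equal to the secret.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Lagrange interpolation at 0 from at least t distinct shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # Division by den is multiplication by its inverse modulo PRIME.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


shares = make_shares(42, t=3, n=5)
```

Any three of the five shares give back 42 via `reconstruct`, while two shares are consistent with every possible secret.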

The second paper, "Secret Sharing Schemes for Very Dense Graphs" by Beimel et al. (CRYPTO '12), considers access structures based on graphs. If the users are represented by the vertices of a graph G, a secret sharing scheme realises G if a subset of participants can compute the secret whenever it contains an edge of G, and otherwise gets no information about the secret. Dense graphs are graphs that have $\binom{n}{2} - e$ edges for $e \ll n^2$ (where n is the number of vertices).
Every graph access structure can be realized by a (linear) scheme in which the total share size is $O(n^2 / \log n)$. The best lower bound for the total share size required to realise a graph access structure by a general secret sharing scheme is $\Omega(n \log n)$, and for linear schemes this is $\Omega(n^{3/2})$. "Hard" graphs require total share size of $\Omega(n^2 / \mathrm{polylog}\, n)$, and the paper shows that it is not possible for "hard" graphs to be very dense. The main result is that if a graph has $\binom{n}{2} - n^{1+\beta}$ edges for some 0 ≤ β ≤ 1, then it can be realized by a secret sharing scheme in which the total share size is $\tilde{O}(n^{5/4 + 3\beta/4})$ (where the $\tilde{O}$ notation ignores polylog factors), and this scheme is linear. If β is a constant smaller than 1, the total share size is $\ll n^2$, and as a result these are not "hard" graphs. If β < 1/3 then the share size is $o(n^{3/2})$, and as a result these graphs are easier than the (sparse) graphs used to prove lower bounds in previous work.
Other previous work showed that a connected graph has an ideal scheme (where the total share size is n times the size of the secret) iff the graph is a complete multipartite graph. In order to construct schemes realising very dense graphs, the authors first realize (give shares) and remove certain subsets of the vertices using stars and bipartite subgraphs, until the remaining vertices can be covered by a sequence of subgraphs, $G_1, \ldots, G_r$, where each is a disjoint union of complete multipartite graphs (cliques specifically). For every i the secret is shared in $G_i$ using ideal secret sharing schemes (Shamir) in each clique of $G_i$. The techniques used in the paper require considerable introduction, so the interested reader is referred to the full papers for details.
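The ideal scheme for a complete multipartite graph can be sketched in a few lines: run a 2-out-of-m Shamir scheme with one share per part, and give every vertex in a part that same share. Two vertices joined by an edge lie in different parts and so hold two distinct points on the line, while vertices in the same part (no edge) hold only one. This is a sketch of that standard building block, not of the paper's construction for very dense graphs; the names and the field are chosen for illustration.

```python
import secrets

PRIME = 2 ** 61 - 1  # must exceed the number of parts


def share_multipartite(secret, parts):
    """parts: list of lists of vertex names. Returns a dict vertex -> (x, y)."""
    slope = secrets.randbelow(PRIME)  # random line y = secret + slope * x
    shares = {}
    for x, part in enumerate(parts, start=1):
        y = (secret + slope * x) % PRIME
        for vertex in part:
            shares[vertex] = (x, y)  # every vertex in a part gets the same point
    return shares


def recover(share_u, share_v):
    """Recover the secret from two shares, or None if they come from the same part."""
    (x1, y1), (x2, y2) = share_u, share_v
    if x1 == x2:
        return None  # no edge between the vertices: a single point does not fix the line
    inv = pow((x2 - x1) % PRIME, PRIME - 2, PRIME)
    slope = (y2 - y1) * inv % PRIME
    return (y1 - slope * x1) % PRIME  # value of the line at x = 0


# Illustrative complete multipartite graph with parts {a, b}, {c}, {d, e}.
shares = share_multipartite(7, [["a", "b"], ["c"], ["d", "e"]])
```

Each vertex holds a single field element besides its public part index, so the share is the same size as the secret, which is what makes the scheme ideal.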