Friday, January 23, 2015

52 Things: Number 16: Describe the key generation, signature and verification algorithms for DSA, Schnorr and RSA-FDH.

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. This week we describe the key generation, signing and verification algorithms of DSA, Schnorr and RSA-FDH.


1. DSA
The Digital Signature Algorithm (DSA), specified in the Digital Signature Standard (DSS), was proposed by the National Institute of Standards and Technology (NIST) in 1991 [1]. The security of DSA relies on the difficulty of computing discrete logarithms, but there is no known proof of its security under a standard assumption (such as DL), even in the random oracle model.

Domain Parameter Generation
  1. Select a prime number $p$, where $2^{L-1}<p<2^L$ and $L$ is a multiple of 64 and $512 \leq L \leq 1024$.
  2. Select a prime divisor $q$ of $p-1$, where $2^{159}<q<2^{160}$.
  3. Compute a generator $g$ of the subgroup of order $q$: choose a random integer $r$ with $1<r<p-1$, set $g=r^{(p-1)/q} \ mod \ p$, and repeat with a new $r$ if $g = 1$.
Key Generation
  1. Select a random integer $x$, where $0<x<q$.
  2. Compute $y = g^x \ mod \ p$.
Then the public key is $y$ and the private key is $x$. 

Signing
  1. Select a random integer $k$, where $0<k<q$.
  2. Compute $r = (g^k \ mod \ p)\  mod \ q$.
  3. Compute $s = (h(m)+x\cdot r)\cdot k^{-1}\ mod \ q$, where $h(m)$ is the hash of $m$ computed using SHA-1.
The signature on $m$ is the pair $(r, s)$.

Verification
  1. Compute $u_1 = h(m) \cdot s^{-1}\ mod \ q$.
  2. Compute $u_2 = r \cdot s^{-1}\ mod \ q$.
  3. Compute $v = (g^{u_1} \cdot y^{u_2}\ mod \ p)\ mod \ q$.
  4. Output $1$ if $v = r$, otherwise output $0$.
Correctness
If $(r,s)$ is a valid signature on $m$, then we have
$g^{u_1}\cdot y^{u_2}=g^{h(m)\cdot (h(m)+x\cdot r)^{-1}\cdot k}\cdot {g^{x \cdot r \cdot (h(m)+x\cdot r)^{-1}\cdot k}} \ mod \ p$
$= g^{(h(m)+x\cdot r)\cdot (h(m)+x \cdot r)^{-1}\cdot k}\ mod \ p$
$= g^k \ mod \ p$
(the exponent arithmetic is taken modulo $q$, the order of $g$), and therefore $v = (g^k\ mod \ p)\ mod \ q = r$. Thus the verification succeeds.
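
As a concrete illustration of all three algorithms, here is a minimal Python sketch over deliberately tiny, assumed toy parameters ($p=607$, $q=101$) with SHA-1 reduced modulo $q$ as the hash; none of these choices offers any security, they simply keep the example short and runnable (modular inverses via pow(x, -1, q) need Python 3.8+).

```python
# Toy DSA sketch: parameters and hash are illustrative assumptions only, NOT secure.
import hashlib
import secrets

p, q = 607, 101              # toy primes with q | p - 1 (real DSA uses e.g. |p| = 1024, |q| = 160)
g = pow(2, (p - 1) // q, p)  # generator of the order-q subgroup (r = 2 gives g != 1)

def h(m: bytes) -> int:
    # stand-in for SHA-1 output reduced into Z_q
    return int.from_bytes(hashlib.sha1(m).digest(), 'big') % q

def keygen():
    x = secrets.randbelow(q - 1) + 1      # private key, 0 < x < q
    return pow(g, x, p), x                # public key y = g^x mod p, private key x

def sign(m: bytes, x: int):
    while True:
        k = secrets.randbelow(q - 1) + 1
        r = pow(g, k, p) % q              # r = (g^k mod p) mod q
        s = (h(m) + x * r) * pow(k, -1, q) % q
        if r != 0 and s != 0:             # retry in the (unlikely) degenerate cases
            return r, s

def verify(m: bytes, sig, y: int) -> bool:
    r, s = sig
    if not (0 < r < q and 0 < s < q):
        return False
    w = pow(s, -1, q)
    u1, u2 = h(m) * w % q, r * w % q
    v = pow(g, u1, p) * pow(y, u2, p) % p % q   # (g^u1 * y^u2 mod p) mod q
    return v == r

y, x = keygen()
assert verify(b"52 things", sign(b"52 things", x), y)
```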

2. Schnorr
The Schnorr signature is an important DLP-based signature scheme. It works in any prime-order group and its security is proven in the random oracle model under the DL assumption [2].

Domain Parameter Generation
  1. Select a prime number $p$.
  2. Select a prime divisor $q$ of $p-1$.
  3. Select a generator $g$ of the subgroup of order $q$.
Key Generation
  1. Select a random integer $x$, where $0<x<q$.
  2. Compute $y=g^x \ mod \ p$.
The public key is $y$ and the private key is $x$.

Signing
  1. Select a random integer $k$, where $0<k<q$.
  2. Compute $a = g^k \ mod \ p$.
  3. Compute $r = h(m \| a)$, where $m$ is the message to be signed and $h:\{0,1\}^* \rightarrow \mathcal{Z}_{q}$ is a hash function. 
  4. Compute $s = (k + r\cdot x)\ mod \ q$
The signature on $m$ is the pair $(r,s)$.

Verification
  1. Compute $v = g^s \cdot y^{-r}\ mod \ p$.
  2. Output $1$ if $h(m \| v) = r$, otherwise output $0$.
Correctness
If $(r,s)$ is a valid signature on $m$, then we have
$v=g^s\cdot y^{-r}=g^{k+r\cdot x}\cdot g^{-r\cdot x}=g^k=a$, and hence $h(m \| v) = h(m \| a) = r$.
Thus the verification succeeds.
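
The same toy group can be reused for a minimal Schnorr sketch; again the 607/101 parameters and the SHA-256-mod-$q$ hash are assumptions made only for illustration.

```python
# Toy Schnorr signature sketch over the same illustrative 607/101 group; not secure.
import hashlib
import secrets

p, q = 607, 101
g = pow(2, (p - 1) // q, p)

def h(m: bytes, a: int) -> int:
    # challenge hash h(m || a) reduced into Z_q; encoding of a is an assumption
    return int.from_bytes(hashlib.sha256(m + a.to_bytes(2, 'big')).digest(), 'big') % q

def keygen():
    x = secrets.randbelow(q - 1) + 1
    return pow(g, x, p), x               # public y = g^x mod p, private x

def sign(m: bytes, x: int):
    k = secrets.randbelow(q - 1) + 1
    a = pow(g, k, p)                     # commitment a = g^k mod p
    r = h(m, a)                          # challenge r = h(m || a)
    s = (k + r * x) % q                  # response
    return r, s

def verify(m: bytes, sig, y: int) -> bool:
    r, s = sig
    v = pow(g, s, p) * pow(y, -r % q, p) % p   # v = g^s * y^{-r} mod p
    return h(m, v) == r

y, x = keygen()
assert verify(b"52 things", sign(b"52 things", x), y)
```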

3. RSA-FDH
The RSA-FDH (full domain hash) scheme was introduced by Bellare and Rogaway in [3]. It is an RSA-based signature scheme that follows the hash-then-sign paradigm. It uses a hash function whose range is the full domain $\mathcal{Z}_N$ of the RSA function (hence "full domain hash") to produce random-looking input to the plain RSA signing operation. This prevents the algebraic attacks on the plain RSA signature scheme and allows messages of arbitrary length to be signed. However, such a hash function is hard to construct in practice. RSA-FDH can be proven EUF-CMA secure in the random oracle model.

Key Generation
  1. Select two large primes $p$ and $q$. 
  2. Compute $N=p\cdot q$.
  3. Select a random integer $e$, where $1<e<\phi(N)$, such that $gcd(e,\phi(N))=1$.
  4. Compute the integer $d$, where $1<d<\phi(N)$, such that $e\cdot d = 1\ mod \ \phi(N)$.
The public key is $(N,e)$ and the private key is $(d,p,q)$.

Signing
  1. Compute $s=h(m)^d\ mod \ N$, where $m$ is the message to be signed and $h:\{0,1\}^* \rightarrow \mathcal{Z}_N$ is a hash function.
The signature on $m$ is $s$.

Verification
  1. Output $1$ if $s^e = h(m)\ mod \ N$, otherwise output $0$.
Correctness
If $s$ is a valid signature on $m$, then we have
$s^e=h(m)^{d\cdot e}\ mod \ N=h(m)\ mod \ N$
Thus the verification succeeds.
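
A minimal RSA-FDH sketch follows, with assumed toy primes (1009 and 1013) and SHA-256 reduced modulo $N$ standing in for a genuine full-domain hash; real parameters are far larger and building a true full-domain hash is itself non-trivial.

```python
# Toy RSA-FDH sketch; parameters and hash are illustrative assumptions only.
import hashlib
from math import gcd

p, q = 1009, 1013                 # toy primes
N, phi = p * q, (p - 1) * (q - 1)
e = 65537
assert gcd(e, phi) == 1
d = pow(e, -1, phi)               # e * d = 1 mod phi(N)

def h(m: bytes) -> int:
    # stand-in for a hash onto the full domain Z_N
    return int.from_bytes(hashlib.sha256(m).digest(), 'big') % N

def sign(m: bytes) -> int:
    return pow(h(m), d, N)        # s = h(m)^d mod N

def verify(m: bytes, s: int) -> bool:
    return pow(s, e, N) == h(m)   # check s^e = h(m) mod N

assert verify(b"52 things", sign(b"52 things"))
```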

[1] http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-4.pdf
[2] http://eprint.iacr.org/2012/029.pdf
[3] http://web.cs.ucdavis.edu/~rogaway/papers/exact.pdf

Friday, January 16, 2015

52 Things: Number 15: Key generation, encryption and decryption algorithms for RSA-OAEP and ECIES.

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. We come back to "more crypto" stuff by describing the key generation, encryption and decryption algorithms for RSA-OAEP and ECIES.


1. RSA-OAEP
RSA-OAEP combines the RSA encryption/decryption scheme with the OAEP padding scheme. The two are commonly used together in the real world.

1.1 RSA[1]
RSA is one of the earliest public key encryption schemes and has been deployed widely in the real world. It is based on the assumed hardness of the RSA problem, which was described in a previous blog post (here).

Key Generation:
  1.  Generate two large primes $p$, $q$, and compute the modulus $N = pq$.
  2.  Select a random number $e$ with $1 < e < \phi(N)$ s.t. $gcd(e,\phi(N))=1$, where $gcd$ stands for Greatest Common Divisor.
  3.  Since $e$ and $\phi(N)$ are co-prime, we can use XGCD (the extended Euclidean algorithm) to find the multiplicative inverse $d$ of $e$ modulo $\phi(N)$: $d = e^{-1} \mod \phi(N)$.
  4. We distribute $(N,e)$ as our public key and hold $(p,q,d)$ as our secret key.
Encryption:
  1. Parse the message to an integer $m \in \mathbb{Z}_N$.
  2. Compute $c = m^e \mod N$.
  3. Output $c$ as our ciphertext.
Decryption:
Before receiving the ciphertext, we precompute some values: $d \mod (p-1)$, $q^{-1} \mod p$, $d \mod (q-1)$ and $p^{-1} \mod q$.
Then upon receiving the ciphertext $c$, we
  1. Compute\begin{equation}m = ((c ^{d \mod p-1}\mod p)q({q^{-1} \mod p})\\ + (c^{d \mod q-1}\mod q)p({p^{-1} \mod q})) \mod N\end{equation}
  2. Output $m$ as our plaintext.
Notice that the computation of $m$ is in fact $m=c^d \mod N$ carried out using the CRT. The reason is that exponentiations modulo the small primes $p$ and $q$ are faster than a single exponentiation modulo the large $N$. The decryption works by applying Fermat's Little Theorem (once modulo each prime), giving $c^d = m^{ed} = m^{ed \mod \phi(N)} = m \ (\mod N)$ since $ed \equiv 1 \mod \phi(N)$.
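
To make the CRT decryption concrete, here is a short Python sketch with assumed toy primes (1009 and 1013) that mirrors the recombination formula above; real RSA moduli are of course vastly larger.

```python
# Sketch of textbook RSA with CRT decryption, mirroring the formula above; toy primes only.
p, q = 1009, 1013                      # assumed toy primes
N, phi = p * q, (p - 1) * (q - 1)
e = 65537                              # gcd(e, phi(N)) = 1
d = pow(e, -1, phi)                    # d = e^{-1} mod phi(N) via XGCD (Python 3.8+)

# precomputed CRT values
dp, dq = d % (p - 1), d % (q - 1)
q_inv_p, p_inv_q = pow(q, -1, p), pow(p, -1, q)

def encrypt(m: int) -> int:
    return pow(m, e, N)

def decrypt_crt(c: int) -> int:
    mp = pow(c, dp, p)                 # c^{d mod p-1} mod p
    mq = pow(c, dq, q)                 # c^{d mod q-1} mod q
    return (mp * q * q_inv_p + mq * p * p_inv_q) % N

c = encrypt(123456)
assert decrypt_crt(c) == pow(c, d, N) == 123456
```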

1.2 OAEP[2]
OAEP stands for Optimal Asymmetric Encryption Padding. It is a padding scheme used together with asymmetric encryption (usually RSA). It adds randomness to an otherwise deterministic encryption scheme. When used with RSA, the combined scheme is proven IND-CCA secure in the random oracle model.

Let
  • $f$ be a $k$-bit trapdoor one-way permutation. $f:\{0,1\}^k \rightarrow \{0,1\}^k$
  • $m$ be the $n$-bit message
  • $G$, $H$ be two pseudorandom functions: $G: \{0,1\}^s \rightarrow \{0,1\}^{n+t}$ and $H:\{0,1\}^{n+t} \rightarrow \{0,1\}^s$, where $k = n + t + s$
  • $R$ be an $s$-bit random number: $R \leftarrow \{0,1\}^s$
Encryption:
We compute the $k$-bit ciphertext as follows:
\begin{equation}
Encrypt(m) = f_{pk}(\{(m||0^t) \oplus G(R) \}||\{R \oplus H((m||0^t) \oplus G(R))\})
\end{equation}

Decryption:
By using the trapdoor, we can recover the following value:
\begin{equation}
f_{sk}(c) =  \{(m||0^t) \oplus G(R) \}||\{R \oplus H((m||0^t) \oplus G(R))\}
\end{equation}

Then
  1.  Let the first $n+t$ bits be $T$: $T = (m||0^t) \oplus G(R)$, and the other $s$ bits be $S$: $S = R \oplus H((m||0^t) \oplus G(R))$
  2.  Compute $R$ as $R=H(T) \oplus S$
  3.  Compute $m||0^t = T \oplus G(R)$
  4. Verify that exactly $t$ zero bits follow the $n$-bit message $m$. If the check passes, remove the $t$ zero bits and output $m$.
In practice, we replace $f_{pk}$ and $f_{sk}$ by the RSA encryption and decryption functions respectively.
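
Below is a byte-oriented Python sketch of just the OAEP transform (the trapdoor permutation $f_{pk}$/$f_{sk}$ is omitted); the sizes $n=16$, $t=8$, $s=8$ bytes and the SHA-256-based instantiations of $G$ and $H$ (with ad-hoc domain-separation prefixes) are assumptions made purely for illustration.

```python
# OAEP pad/unpad sketch; sizes and hash instantiations are illustrative assumptions.
import hashlib
import secrets

N_LEN, T_LEN, S_LEN = 16, 8, 8               # n-byte message, t zero bytes, s-byte randomness

def G(r: bytes) -> bytes:                    # G: {0,1}^s -> {0,1}^{n+t}
    return hashlib.sha256(b'G' + r).digest()[:N_LEN + T_LEN]

def H(x: bytes) -> bytes:                    # H: {0,1}^{n+t} -> {0,1}^s
    return hashlib.sha256(b'H' + x).digest()[:S_LEN]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def oaep_pad(m: bytes) -> bytes:
    assert len(m) == N_LEN
    r = secrets.token_bytes(S_LEN)
    t_part = xor(m + bytes(T_LEN), G(r))     # (m || 0^t) xor G(R)
    s_part = xor(r, H(t_part))               # R xor H((m || 0^t) xor G(R))
    return t_part + s_part                   # k = n + t + s bytes, fed to f_pk

def oaep_unpad(block: bytes) -> bytes:
    t_part, s_part = block[:N_LEN + T_LEN], block[N_LEN + T_LEN:]
    r = xor(H(t_part), s_part)               # recover R
    padded = xor(t_part, G(r))               # recover m || 0^t
    m, zeros = padded[:N_LEN], padded[N_LEN:]
    assert zeros == bytes(T_LEN), "padding check failed"
    return m

m = b'sixteen byte msg'
assert oaep_unpad(oaep_pad(m)) == m
```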

2. ECIES

Elliptic Curve Integrated Encryption Scheme is a variation of ElGamal public key encryption scheme based on Elliptic Curve Cryptography (click here to find more about elliptic curve).

For simplicity, we define an Elliptic Curve in the form:

\begin{equation}
E: y^2 = x^3 + ax +b
\end{equation}

To further simplify our problem, we only discuss a curve $E$ over a prime field $\mathbb{F}_q$ with a base point $P$ of prime order $n$. Then we can define the simplified domain parameters $D = (q, a, b, P, n)$, where:
  • $q$ is the prime field order. i.e. $q$ is a prime and $x, y, a, b$ are reduced to $\{0, 1, 2, ..., q-1\}$
  • $a,b$ are the coefficients of the curve.
  • $P$ is a point on the curve.
  • $n$ is the prime order of $P$, i.e. $n$ is prime and repeatedly adding $P$ yields exactly $n$ distinct points (including the point at infinity).
The domain parameters are made public.

ECIES is always used together with a symmetric encryption scheme and a MAC scheme. We denote them by $\{Enc_k(m)=c, Dec_k(c)=m\}$ and $\{MAC_k(m)=t, Verify(t,m)=T/F\}$ respectively.

We also denote by $KDF(s_1, s_2) = (k_{enc}, k_{MAC})$ the Key Derivation Function, which takes two seeds $s_1, s_2$ and outputs a symmetric encryption key together with a MAC key.

Then we describe the scheme as:

Key Generation:
  1. Pick a random integer $d \in [1, n - 1]$.
  2. Compute a new point $Q = dP$.
  3.  Output $Q$ as the public key and $d$ as the secret key.
Then the encryption of a message $m$ is done as follows:

Encryption:  
  1.  Pick a random integer $k \in [1, n-1]$.
  2.  Compute $R=kP, Z=kQ$. If $Z=\infty$ then we restart the process and pick a different $k$.
  3.  Generate $(k_1, k_2) = KDF(x_Z, R)$ where $x_Z$ is the $x$-coordinate of $Z$.
  4. Compute $c = Enc_{k_1}(m)$ and $t=MAC_{k_2}(c)$.
  5. Output $(R, c, t)$ as the ciphertext.
On receiving a ciphertext $(R,c,t)$,

Decryption:
  1.  Verify that $R$ is a valid point. This can easily be done by substituting $R$ into the curve equation.
  2. Compute $Z'=dR$.
  3. Generate $(k'_1, k'_2) = KDF(x_{Z'}, R)$, where $x_{Z'}$ is the $x$-coordinate of $Z'$.
  4. Verify the MAC by calling $Verify(t, c)$.
  5. Decrypt $m' = Dec_{k'_1}(c)$.
  6. Output $m'$ as the plaintext.
Since $Z' = dR = dkP = kQ = Z$, the seeds fed to the KDF are in fact the same on both sides. Hence the receiver derives the same keys as the sender and can decrypt the message.
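
Putting the pieces together, here is a self-contained Python sketch of ECIES over the tiny textbook curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with base point $P=(5,1)$ of prime order 19; the curve, the hash-based KDF, the XOR-based Enc and the HMAC MAC are all toy stand-ins chosen so the example runs, and none of them is secure.

```python
# ECIES sketch over a toy curve; every building block here is an illustrative assumption.
import hashlib
import hmac
import secrets

P_FIELD, A, B = 17, 2, 2                        # y^2 = x^3 + 2x + 2 over F_17
G, ORDER = (5, 1), 19                           # base point of prime order 19
INF = None                                      # point at infinity

def add(p1, p2):
    # standard affine point addition
    if p1 is INF: return p2
    if p2 is INF: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return INF
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_FIELD) % P_FIELD
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P_FIELD) % P_FIELD
    x3 = (lam * lam - x1 - x2) % P_FIELD
    return x3, (lam * (x1 - x3) - y1) % P_FIELD

def mul(k, pt):                                 # double-and-add scalar multiplication
    acc = INF
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt, k = add(pt, pt), k >> 1
    return acc

def kdf(x_z, r_pt):                             # KDF(x_Z, R) -> (k_enc, k_mac)
    k = hashlib.sha256(repr((x_z, r_pt)).encode()).digest()
    return k[:16], k[16:]

def enc(key, m):                                # toy "stream cipher": XOR with hash of key
    pad = hashlib.sha256(key).digest()
    return bytes(a ^ b for a, b in zip(m, pad))

def mac(key, c):
    return hmac.new(key, c, hashlib.sha256).digest()

# key generation
d = secrets.randbelow(ORDER - 1) + 1            # private key d
Q = mul(d, G)                                   # public key Q = dP

# encryption of a short message under Q
m = b'hi'
while True:
    k = secrets.randbelow(ORDER - 1) + 1
    R, Z = mul(k, G), mul(k, Q)                 # R = kP, Z = kQ
    if Z is not INF:
        break
k1, k2 = kdf(Z[0], R)
c, t = enc(k1, m), None
t = mac(k2, c)
ciphertext = (R, c, t)

# decryption with d
R2, c2, t2 = ciphertext
Z2 = mul(d, R2)                                 # Z' = dR = kQ = Z
k1d, k2d = kdf(Z2[0], R2)
assert hmac.compare_digest(t2, mac(k2d, c2))    # MAC check
assert enc(k1d, c2) == m                        # XOR stream: Dec is the same as Enc
```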

However, my knowledge of ECC is very limited. For those who are interested, you can find more in [4].

References:
[1] http://people.csail.mit.edu/rivest/Rsapaper.pdf
[2] http://tools.ietf.org/html/rfc2437
[3] http://www.shoup.net/papers/iso-2_1.pdf
[4] http://www.springer.com/computer/security+and+cryptology/book/978-0-387-95273-4

Sunday, January 11, 2015

Real World Crypto 2015: Are you there Bob? It's me, Alice.

Social media are all about the three F's: friends, fans and followers. But what if you want to keep your list of F-buddies private? On the final day of the Real World Crypto workshop Ian Goldberg spoke about how we can advertise our presence online to our friends without revealing our relationship graph.

In the 90s, people would fire up ICQ instant messenger and an ear-splitting foghorn sound effect would announce their presence online to everyone on their street. Friends further away relied on notification from the ICQ server, with the server knowing when users were online and who their friends were. In today's more privacy-sensitive climate this is undesirable, as the service operator may be legally compelled to surrender this data to governments on request.

The Dagstuhl Privacy Preserving Presence Protocol P (DP5 - the extra P is for patter) allows a service operator to provide this online presence information without learning the information itself, and so freely comply with search warrants without compromising user privacy.

The DP5 protocol keeps a friendship database and a presence database, and assumes that two parties wishing to communicate already hold a shared key. Time is divided into long-term epochs T over which friendship data can be updated, and short-term epochs t over which status data can be updated.

In each long-term epoch T, Alice evaluates two PRFs at the point T under the key she shares with Bob in order to generate a public identifier ID and an epoch key K. She generates a public key P and encrypts it under the epoch key. This ciphertext C and the ID are stored in the friendship database.

In each short-term epoch t, Alice evaluates a third PRF at the point t under a key derived through hashing her public key P, generating an encryption key k. She then encrypts a status message under key k. This ciphertext c and an identifier derived from P are stored in the presence database.

When Bob wants to know if Alice is online, he evaluates the appropriate PRFs at point T under their shared key to recover the epoch key K and identifier ID. He pulls the corresponding record from the friendship database and decrypts the ciphertext C to recover Alice's public key P. He then computes from P the key k and the identifier for the presence database, pulling the corresponding record and decrypting the ciphertext under k. If decryption is successful then Alice is online and her status message is revealed, otherwise he concludes that Alice is offline.

This is a simplified explanation of the DP5 protocol, which makes use of Private Information Retrieval to ensure the server is unaware of the nature of the database queries. PIR is a beast, so the protocol doesn't scale too well, but it does provide a private way to tell your friends you're online and let them know your status.
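
The description above maps fairly directly onto code. Below is a very rough Python sketch of the simplified flow (ignoring the PIR layer entirely); HMAC-SHA256 stands in for the PRFs, a hash-based one-time pad stands in for the encryption, and all labels and sizes are assumptions made only so that the example runs end to end.

```python
# Rough sketch of the simplified DP5 flow; PRF, encryption and labels are toy assumptions.
import hashlib
import hmac

def prf(key: bytes, label: bytes) -> bytes:
    return hmac.new(key, label, hashlib.sha256).digest()

def otp(key: bytes, m: bytes) -> bytes:          # toy "encryption": XOR with hash of key
    pad = hashlib.sha256(key).digest()
    return bytes(a ^ b for a, b in zip(m, pad))

shared_ab = b'32-byte key Alice shares with Bob'[:32]
T, t = b'epoch-2015-01', b'slot-42'

# Alice, long-term epoch T: publish (ID, C) to the friendship database
ID = prf(shared_ab, b'id' + T)
K  = prf(shared_ab, b'key' + T)
P  = hashlib.sha256(b"Alice's per-epoch public key").digest()   # placeholder "public key"
friendship_db = {ID: otp(K, P)}

# Alice, short-term epoch t: publish (id_P, c) to the presence database
k = prf(hashlib.sha256(P).digest(), t)
presence_db = {prf(P, b'presence-id' + t): otp(k, b'online: at RWC 2015 :)'.ljust(32))}

# Bob, looking Alice up
ID_b, K_b = prf(shared_ab, b'id' + T), prf(shared_ab, b'key' + T)
P_b = otp(K_b, friendship_db[ID_b])              # recover Alice's per-epoch public key
k_b = prf(hashlib.sha256(P_b).digest(), t)
status = otp(k_b, presence_db[prf(P_b, b'presence-id' + t)])
print(status.rstrip())   # Alice is online; a missing/undecryptable record means offline
```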

***** X_Br1sT0L_B10gGeR_X is offline *****

Friday, January 9, 2015

Real World Crypto 2015: “Designers: ask not what your implementer can do for you, but what you can do for your implementer” Dan Bernstein, Real World Crypto 2015

Last year at Real World Crypto you may remember my blog post about how the conference went about the business of lamenting the downfall of theoretically secure cryptographic primitives due to practical implementation issues. Well, this year at Real World Crypto things have gone in a similar direction. An example of this is Bernstein's talk on Wednesday entitled "Error-prone cryptographic designs." Much of the talk has already been described in a previous post by David B here, which helpfully gives us the "Bernstein principle" that I'll follow up on, perhaps from a more applied point of view.


As the title suggests the talk was about error-prone cryptography, but not as you might expect. It was not a frustrated outburst at how the practical implementation side of things was letting the side down and needed to get its act together, a tone you may be familiar with from other talks with similar titles. The talk did contain many of the usual ingredients: for instance, a look at how the problems with cryptosystems always seem to come from the implementation of secure primitives rather than the primitives themselves, backed up by a number of horror stories. But this talk had one major twist.


The main point of the presentation was made near the beginning, in one of the horror stories (see blog post), when he referred to the breaking of the encryption used by Sony's PlayStation. Relating to this he observed how the blame for the attack got distributed. The attack was a practical attack, and so naturally the designers of the primitives remarked how their secure designs had been made vulnerable by the terrible implementers who seem to keep getting things wrong and making all their hard work meaningless. But wait a minute, asked the implementers: if we keep getting your designs wrong, perhaps the problem is with you? Perhaps the designs of the primitives are so hard to implement properly that the designers are really to blame for not taking implementation practicalities seriously enough in their design choices.
 
So who was to blame for this and the many other security slip-ups that have occurred through implementation errors? The answer remains unresolved, but the fact that the discussion is taking place throws a different kind of light on the practice of designing cryptosystems. The current arrangement, "designer: make something secure; implementer: implement the design securely", may well be where many of the problems are coming from, and should perhaps be replaced by something more like "designer: make something that's secure and easy to implement; implementer: ... you can't really go wrong!"


Although unlikely to have drawn the agreement of the whole audience, the talk didn't have the feel of a dig at designers but an appeal to them to wake up to their responsibility to the implementers by making designs that are easy to implement. Nor did it have the air of trying to get implementers off the hook; rather, it asked why, in a world where secure systems break because of practical attacks against implementations rather than against the primitives themselves, the primitives are never singled out for blame for being hard to implement, instead of the people given the task of implementing them.
 
It's true, designers have a hard problem on their hands making sure what they design is secure, but, often coming from a more mathematical, theoretical background, have they tended to overlook the practical aspects of what they are concocting? Many designers are highly skilled in the art of theoretical design, but is this at the expense of not knowing the real engineering world in which they work as well as they should?
 
The talk examined these and other similar questions and was summarised with an appeal to designers to “think of the children, think of the implementers” and ended with the words given as the title for this post.

Real World Crypto 2015: CAESAR came, you see, but who will win?

Cryptographers are humans, and humans are competitive, which might explain the popularity of cryptographic competitions. The most visible competition running at this point in time is CAESAR, which hopes to recommend a (portfolio of) authenticated encryption scheme(s). Elena Andreeva gave a wonderful talk giving a very brief overview of modern thinking regarding what security and functionality to aim for, and how different schemes try to achieve it.

The traditional lesson that encryption should be probabilistic to be secure has been replaced by a nonce-based view, with further granularity depending on the security offered when nonces are somehow reused. Even prior to CAESAR an increasing number of dedicated, nonce-based schemes were being suggested, several of which were entered into the CAESAR competition. Taking nonces into account also changes how one would go about using a MAC to add authenticity to a mode of operation such as CBC. This problem, known as generic composition, has recently been revisited by Namprempre, Rogaway, and Shrimpton.

There has also been increased interest in the security ramifications of using online encryption and decryption. For efficiency purposes (e.g. to enable a single pass over the data), it can be beneficial to output plaintext as part of the decryption process before the ciphertext has been verified. At last year's Asiacrypt, Elena and her coauthors established a framework to capture the release of unverified plaintext (RUP). Only a handful of CAESAR candidates achieves this security notion.

Finally, Elena went into more detail of the current state of play of the CAESAR competition. Given the large number of candidates (57) and the large number of criteria based on which one can classify an authenticated encryption scheme, the information provided by the CAESAR zoo can be hard to interpret. Luckily for us, Elena has created a wonderful, interactive visualization tool to zoom in on whichever property takes our current fancy (for instance, geographical spread of submissions). The tool is available from her home page so have a go at it yourself!

The first round of CAESAR is drawing to a close and it is expected that next month DJB (J for Julius?) will announce which candidates the CAESAR committee has progressed to round 2. Eventually the winner or winners will be crowned in another two years' time. After her presentation, there was a question from the audience whether this winner will be fast tracked into the TLS standard.

Thursday, January 8, 2015

52 Things: Number 14: What is a cryptographic pairing?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. We build on the previous few weeks by introducing the notion of a pairing.

Pairing definition: Given three cyclic groups $\mathbb{G}_1,\mathbb{G}_2,\mathbb{G}_3$ of order $q$ with generators $g_1,g_2,g_3$ respectively, we say a function $e:\mathbb{G}_1\times\mathbb{G}_2\rightarrow\mathbb{G}_3$ is a pairing if the following hold:

  1. [bilinearity] $\forall A,B\in\mathbb{G}_1,C,D\in\mathbb{G}_2$: $e(A+B,C)=e(A,C)\cdot e(B,C)$ and $e(A,C+D)=e(A,C)\cdot e(A,D)$
  2. [non-degeneracy] $e(g_1,g_2)\neq 1$
  3. [efficiency] $e$ is efficiently computable


Types of pairing: There are three types of pairings, described below:

  1. $\mathbb{G}_1=\mathbb{G}_2$
  2. $\mathbb{G}_1\neq\mathbb{G}_2$ but there is an efficiently computable isomorphism from $\mathbb{G}_2$ to $\mathbb{G}_1$ which maps the generator $g_2$ to $g_1$
  3. $\mathbb{G}_1\neq\mathbb{G}_2$ and no efficiently computable isomorphism between the two is known


The last two are asymmetric pairings while the first is a symmetric pairing.

A warning on pairings: It feels like I am always having a warning section in each of my blog posts, but these are important and I feel should be included. For type 1 pairings (and similarly for type 2; this does not mean type 3 pairings are automatically safe), the DDH problem (given $g,g^x,g^y,g^z$, decide whether $z=x\cdot y$) is easy, since you can simply check whether $e(g^x,g^y)=e(g^z,g)$. Another thing to be careful of is that it is easy to assume a pairing that does everything you want it to; whether such a pairing can actually be instantiated is another matter.
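
To spell out why that DDH check works, using nothing but the bilinearity property from the definition above:
\begin{equation}
e(g^x,g^y) = e(g,g)^{x\cdot y} \quad \text{and} \quad e(g^z,g) = e(g,g)^{z},
\end{equation}
so the two pairing values are equal exactly when $z = x\cdot y \bmod q$, which answers the DDH question directly.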

Uses of pairings: Pairings have a wide range of uses, including cryptanalysis, Identity Based Encryption, Attribute Based Encryption and Leakage Resilient Cryptography.

Instantiation of pairings: The only way we know how to instantiate pairings is over elliptic curves (see the last few blogs in the 52 things series), and this is another reason why elliptic curves have become so desirable in cryptography. More recently, multilinear maps have appeared in the literature, which work over different groups. However, that is a story for another time...

Real World Crypto 2015: Error-prone cryptographic designs (djb)

Some of the really important ideas can be summarised in one sentence. I'll coin the name "Bernstein principle" for the following:

Do not blame the (crypto) implementor for a mistake that the designer could have avoided in the first place.

Dan Bernstein gave a talk on several crypto horror stories, including DSA and HTTPS, which are often given as examples of "bad implementations of otherwise good crypto". Instead, we should see designs that have a tendency to be hard to implement or use simply as bad crypto. Nowadays, constant-time and side-channel-resistant ciphers are state of the art; Bernstein told us how some of the problems were identified much earlier but ignored, with the inevitable mistakes being blamed on the implementations.

Another important point: a primitive design tends to get a lot of review (think AES competition), but a primitive implementation less so, a protocol design usually even less (reviewing security "proofs" is quite officially out of scope for most conference program committees these days), and once we get to a protocol implementation, there's very little interest around. Put another way, to take Bernstein's example, which of these got the least review: (a) discrete logarithms, (b) ECDSA using discrete logarithms, (c) Sony's ECDSA implementation for PlayStation code-signing? And which one was horribly broken?

I would personally take the Bernstein principle further and say we should never, ever blame a crypto user for anything. Yes, someone who reuses passwords, and picks weak ones in the first place, isn't going to get much security, but take a deep breath: do we really expect the average person to memorise two dozen distinct high-security passwords (and change them every six months)? I'd argue that if any security-related design is widely misused by users, even in "stupid" ways, then that is enough evidence to call it a bad design.