Friday, November 14, 2014

52 Things: Number 6: How can we interpret NP as the set of theorems whose proofs can be checked in polynomial time?

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know' to do Cryptography: a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. We continue the Complexity Theory section with an alternative definition of NP...

This question is very much a follow up question to the previous one. Last week Guy answered the question of What is meant by the complexity class $\mathbf{NP}$?'', while this week I will be answering the related question of How can we interpret $\mathbf{NP}$ as the set of theorems whose proofs can be checked in polynomial time?''.

Now to me this is a more intuitive definition of what it means for a problem to be in $\mathbf{NP}$. Not only is it a more intuitive definition but it should (hopefully) also be clearer as to why this is a complexity class that is useful both for cryptography and the wider world. Now before we go into what we can use the class of problems for, the definition is as follows:

$\mathbf{NP}$ is the class of languages that have polynomial time verifiers.

OK but what does this actually mean? Basically if we have an element $x$ and we want to know if $x\in L$ (where $L$ is some $\mathbf{NP}$ language) we have a prover $P$ which given $x$ outputs a witness $w$, this may take exponential time to find $w$ given $x$. Then if we give $x$ and $w$ to our verifier $V$, $V$ can tell if $x\in L$ in polynomial time.

This definition seems very different from the one given last week but they are in fact equivalent. Informally they are equivalent because the witness $w$ can just be the sequence of decisions the NDT made at each branching point to show that $x\in L$. For a (slightly) more formal proof of their equalence [1] is a reasonable online source (with references to textbooks).

So why might this be useful in cryptography? Well essentially we have a class of languages which can take exponential time to check if you do not know a witness but with a witness it can be done in polynomial time. This has the feel'' of a lot of cryptographic algorithms - take Encryption (formalisation to follow in future weeks' blogs) for example; you want it to be exponentially hard to get the message from ciphertext without the key (similar to a witness) but with the key you want it to be efficient (polynomial time) to extract the message.

A warning: While it sounds like a good move to use an $\mathbf{NP}$ problem for cryptography it may not be as simple as it first appears. This is because languages are in $\mathbf{NP}$ based on the worse case complexity where as for encryption we need it to be hard on average. For example we may have an $\mathbf{NP}$ language which has one element that takes exponential time to solve but all others are really efficient to solve - this will not make a good basis for an encryption scheme because we want encryption to be secure for all messages not just one.

Now I am aware that integer factorization isn't known to be $\mathbf{NP}$-complete and isn't known to be in $\mathbf{P}$ (See Ryan's blog here for a description) but it makes for a nice example of the point I am trying to make about choosing your $\mathbf{NP}$ instances carefully. In general finding a factor of a number is easy (half of them are divisible by two!) but if we choose something sensible we can make it hard to factor. Let us focus on numbers of the form $N=p\cdot q$ for $p,q$ prime (a.k.a numbers with only two proper factors). Now if either of these numbers is small then it is again easy, so we want the numbers to be of equal size (this is what we do for RSA). From this it is possible to build an encryption scheme over the top.

[1] - http://en.wikipedia.org/wiki/NP_(complexity)#Equivalence_of_definitions