When it comes to cryptography, block ciphers like AES are pretty important. Block ciphers are a fundamental building block – and they do a lot more than just encrypt. Many random number generators, and even hash functions like SHA-2 rely on block ciphers for their security. So, let’s take a look at what block ciphers do.

But back up a bit – let’s start with something simpler. Many newspapers have these puzzles called “cryptograms”. A cryptogram puzzle has a message encrypted with a simple substitution cipher (a design that was deprecated, like, 500 years ago). Substitution ciphers swap each letter of the alphabet with some other letter. An encoding/decoding chart might look like this:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
P G B K V Y W F J E U I H L X D Q S N Z T A O R M C

Encrypting with a substitution cipher means processing the message one letter at a time, swapping each letter out for its counter part. In theory, this means that only people with access to the decoding chart would be able to decrypt the message. But nah, basic letter frequency analysis and a bit of logic is enough to get the plaintext.

What’s interesting, though, is that this basic idea of substitution is actually still pretty important in modern cryptography – we’ve just made a few upgrades. Think about how a substitution cipher might be implemented:

for (int i=0; i<inputString.length; i++) {
	outputString[i] = encryptLetter(inputString[i]);
}

Basically, just take each letter and pass it through a function called encryptLetter (functional programming fans, I agree with you: pretend I used map). encryptLetter is a simple function that looks up the corresponding “encrypted” letter using the encoding/decoding table. Here’s how the function looks in action:

encryptLetter('A') = 'P'
encryptLetter('S') = 'N'
encryptLetter('Q') = 'Q'

This type of function has a name in cryptography: it’s called a random permutation.

Imagine you have some list of things, maybe a list of the 26 letters. Now, duplicate that list and shuffle the new list’s contents into a random order. Pair up the elements from the two lists: 1st elements together, 2nd elements together, etc. Congrats! You’ve got yourself a random permutation.

That’s a pretty good mental model for how random permutations work, but modern cryptography makes things a bit more complicated. Instead of using random permutations on the alphabet, modern cryptography typically works with the set of 128-bit integers. Since there are 340,282,366,920,938,463,463,374,607,431,768,211,456 (i.e. 2^128) different 128-bit integers, that creates some practical problems. Problems like “how do we shuffle a list that contains more data than the human race has ever created?”.

Fortunately, cryptographers have a pretty good solution to this problem. Instead of using truly random permutations, we can just pretend. If we can make a function that walks, swims, and quacks like a random permutation – to the point that even a skilled attacker couldn’t tell the difference – then it won’t really matter that it’s not truly random. We call these pseudorandom permutations or “PRPs”.

But PRPs don’t exist in isolation. PRPs come in a family – a collection of functions. Each PRP in a PRP function family is identified by a unique “key” (hint hint!), which is typically a 128- or 256-bit integer. The basic security guarantee that PRP families provide is that nobody can tell the difference between a PRP and a true random permutation – unless they have the PRP’s key (or impossibly enormous computational resources).

Now, back to block ciphers. A block cipher is an algorithm that dreams of being a PRP function family. There are many ways to design these algorithms. Threefish, a recent block cipher design from the SHA-3 competition, uses integer addition (with overflow wrapping), bitwise rotation, and bitwise XOR to “mix up” the inputs in a way that is difficult for an attacker to predict (without knowledge of the key). Typically, this mixing process is repeated many times to ensure security (these are called “rounds”). In future posts, I’ll explore the actual details of various block cipher designs.

The weirdest thing about block ciphers is that we have no mathematical proof that any of the block ciphers in use today are actually secure. In other words, none of them have been proven to be a PRP family. Even worse, there’s no proof that PRPs actually exist at all. On the other hand, many block ciphers (like AES) have been extensively analyzed, and no one has proven that they are not a PRP family. That doesn’t eliminate the possibility that some future work will show them to be broken, but it’s a pretty good sign, and many prominent cryptographers suspect AES may never be broken.