Cryptography Primer
The uproar caused by the recent leaks from Edward Snowden hasn't died down yet. There are various stances on privacy and government snooping floating around the internet. Some say they have nothing to hide. Others argue that if the government can snoop on you, and blackhat hackers can snoop on the government, then we're royally screwed, because anybody with money can pay the blackhats (or, gasp!, pay some corrupt government officials) to get data about anybody.
Politics aside, security must be taken seriously every time we use the internet. In this post I would like to talk about the basics of security in general computing today.
Basic Cryptography
In general, cryptography is used to transform a string of text into convoluted gibberish that bears no resemblance to the original text for anyone without access to the key. The original string of text is called plaintext, and the resulting gibberish is called cyphertext.
Based on the keys used for the encryption, there are two types of encryption: symmetric-key and asymmetric-key encryption.
Symmetric-Key Encryption
In symmetric-key encryption, the key used to encrypt the plaintext into cyphertext can also be used to reverse the operation. Decrypting the cyphertext with the same key yields the original plaintext. So, if two people share an encryption key, they can communicate securely using that key. A simple example is sharing an encrypted rar file with your partner: your partner must know your encryption key (the password) in order to decrypt the rar file. So, you can go meet your partner and tell him your password.
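To make the idea concrete, here is a toy sketch in Python. It uses a repeating-key XOR, which is not secure and is not what rar does; the function name `xor_cipher` and the sample key are made up for illustration. The only point is that the same key, applied the same way, both encrypts and decrypts.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the (repeating) key.

    Toy example only: repeating-key XOR is trivially breakable.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"correct horse"          # hypothetical shared secret
plaintext = b"meet me at noon"

cyphertext = xor_cipher(plaintext, key)    # encrypt
recovered = xor_cipher(cyphertext, key)    # decrypt with the same key

print(cyphertext.hex())
print(recovered.decode())
```

Anyone who learns the key can run the second call too, which is exactly why getting the key to your partner safely is the hard part.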
What if it's impossible for you to meet your partner physically? Suppose the secrecy of the data you want to transmit is so important that the balance of the world would be disrupted if it were compromised. How can you tell him your password with a strong guarantee that nobody snoops on it in transit? Using email? Email is generally transmitted in plain text, so anybody on the network between you and your partner can read it. Using Google chat? Then Google can read it whenever it pleases. Using a text message? Then the cellphone operator can read it. Also, GSM security is considered broken, and anyone with the right equipment can intercept your message. It's time to use asymmetric-key encryption.
Asymmetric-Key Encryption
In asymmetric-key encryption, you have two keys instead of one: the private key (which you should guard with your life) and the public key (which you should share with anyone and their pets). So, why does it require two keys?
Anybody who knows your public key can encrypt a plaintext that only you can decrypt (using your private key). Consider our example above (in which you need to transmit super secret data to your partner). You can encrypt the password with your partner's public key and send him the resulting cyphertext. Using his private key, he'll decrypt the cyphertext and get the password. Then you can send him your password-protected rar archive.
Wait, why use rar? Why not simply encrypt the data using the public key instead?
Well, actually you can use anything that properly encrypts the data. 7zip supports AES encryption (one of the most commonly used symmetric-key algorithms) too, just like rar. The reason we didn't encrypt the whole data (presumably we have 1.3GB of it for the example) using asymmetric-key encryption is that it's really slow compared to symmetric-key encryption. It is much more efficient (and faster) to encrypt the data using symmetric-key encryption and send the key/password through asymmetric-key encryption than to encrypt the data wholesale using asymmetric-key encryption. Advances in cryptography may narrow that speed gap over time, but for now this hybrid approach is the norm.
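Here is a rough Python sketch of that two-step idea. Everything in it is a toy assumption: the RSA numbers are tiny textbook values (61, 53, 17), the "symmetric cipher" is an XOR with a SHA-256 keystream, and the function names are invented for this post. Real tools use proper key sizes, padding, and ciphers such as AES. What it does show is the division of labor: the slow asymmetric step only touches a small session key, while the bulk data goes through the fast symmetric step.

```python
import hashlib
import secrets

# Toy RSA key pair for the recipient (tiny textbook numbers, NOT secure).
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def keystream(session_key: int, length: int) -> bytes:
    """Stretch the session key into `length` bytes using SHA-256 (toy stream cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(f"{session_key}:{counter}".encode()).digest()
        counter += 1
    return out[:length]


def symmetric_crypt(data: bytes, session_key: int) -> bytes:
    """Fast symmetric step: XOR with the keystream. The same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(session_key, len(data))))


def hybrid_encrypt(data: bytes):
    """Return (wrapped session key, encrypted data) for the owner of (N, E)."""
    session_key = secrets.randbelow(N - 2) + 2      # random key in [2, N - 1]
    wrapped_key = pow(session_key, E, N)            # slow asymmetric step, tiny input
    return wrapped_key, symmetric_crypt(data, session_key)


def hybrid_decrypt(wrapped_key: int, encrypted: bytes) -> bytes:
    session_key = pow(wrapped_key, D, N)            # only the private key can unwrap it
    return symmetric_crypt(encrypted, session_key)


message = b"pretend this is 1.3GB of secret data"
wrapped, encrypted = hybrid_encrypt(message)
print(hybrid_decrypt(wrapped, encrypted).decode())
```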
Can We Break The Encryption?
Random Number Generator
Cryptography relies heavily on random numbers. For example, you don't want your keys to be easily guessable by anyone; you want them to be random enough that the odds of successfully guessing your key are so low that nobody even tries (that's why you should favor a randomly generated key over a simple memorable string).
But how do we obtain a truly random number from inside a computer? A computer by itself cannot generate a truly random number. A computer is a state machine, so in theory, if you know the machine state at the time the secret random number was generated, you might be able to guess that number.
A simple way to generate random numbers is the Middle Square Method. First, pick a 5-digit starting value as a seed, for example: 12345. Next, compute its square: 12345 * 12345 = 152399025. Next, pick the middle five digits of the result: 23990. That's our random number. To get the next random number, just repeat the process using the previously generated number as the new seed: 23990 * 23990 = 575520100 -> 55201, and so on.
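The steps above can be sketched in a few lines of Python. One detail is my own assumption: the square is zero-padded to 10 digits and digits 4 to 8 are kept, which reproduces the two numbers worked out above. The function names are made up for this example.

```python
def middle_square(seed: int) -> int:
    """Return the next 5-digit number of the Middle Square sequence.

    Assumption: the square is zero-padded to 10 digits and digits 4 to 8
    are kept, which matches the worked example in the text.
    """
    squared = str(seed * seed).zfill(10)
    return int(squared[3:8])


def middle_square_sequence(seed: int, count: int) -> list:
    """Generate `count` numbers, feeding each result back in as the next seed."""
    numbers = []
    for _ in range(count):
        seed = middle_square(seed)
        numbers.append(seed)
    return numbers


print(middle_square_sequence(12345, 2))   # [23990, 55201]
```

Run it twice with the same seed and you get exactly the same "random" numbers, which is the whole problem.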
Notice that the value of each generated number depends on the previous number. If you know the original number that was used as the seed, you can easily predict every number generated using this method. This kind of random number generator is called a Pseudo-Random Number Generator (PRNG) because it doesn't actually generate truly random numbers. Note that the Middle Square method is very simple and not used in modern systems anymore, but the concept is still the same: a pseudo-random number generator depends on a seed and its internal state to generate a number. If the seed and the internal state are known, then anyone can easily predict the next random number. (In the Middle Square method, there is only one internal state variable, and it's always set to the previously generated number.)
A pseudo-random number generator whose output cannot be predicted in practice is called a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG). One requirement for a CSPRNG is that nobody should be able to predict the next number by analyzing the previous numbers. Since we can predict the next number generated by the Middle Square method from the previous one, the Middle Square method is clearly not cryptographically secure and should not be used in real-life applications.
A Cryptographically Secure Pseudo-Random Number Generator must be fed with multiple sources of randomness to ensure its unpredictability. Those sources could be network interface activity, human input, hard drive head position, ambient city noise, or even cosmic rays. The more sources of randomness available, the better.
Starting with the Ivy Bridge processors, Intel includes an on-chip random number generator along with a new instruction to make use of it, RDRAND. It can generate a random bit for each clock cycle. Modern operating systems use the instruction as an additional source of randomness for their built-in random number generators.
There are concerns regarding the use of RDRAND in the Linux kernel. Since RDRAND uses an on-chip random number generator, somebody at the processor manufacturing plant could replace the chip with one that has a faulty random number generator, which would potentially compromise security on Linux systems.
Should we worry that an exploit against the random number generator could compromise our security? Probably not, but keep it in mind.
Recently, a bitcoin exchange got hacked because the attacker could take advantage of its dedicated server provider's password reset link. Apparently, the password reset link was not random enough and could be guessed by the attacker.
Tip: if your application is running under Linux, always use /dev/urandom to get your random numbers. Some people avoid it because it's slower than a plain PRNG, but keep in mind that /dev/urandom is cryptographically secure. If you're getting random numbers from a built-in function in your programming language/library/framework of choice, be sure to check the documentation to see if it's cryptographically secure. Not all languages/frameworks pull their random numbers from /dev/urandom.
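As an example of checking what your language gives you, here is how it looks in Python. `os.urandom` and the `secrets` module draw from the operating system's CSPRNG, while the documentation of the plain `random` module warns against using it for security purposes. The reset URL below is a made-up placeholder, not a real endpoint.

```python
import os
import secrets

# 32 random bytes straight from the operating system's CSPRNG.
raw_key = os.urandom(32)

# URL-safe random token, for example for a password reset link.
# The URL is a made-up placeholder.
reset_token = secrets.token_urlsafe(32)
reset_link = f"https://example.com/reset?token={reset_token}"

# Random integer in [0, 1000000) drawn from the CSPRNG.
pin = secrets.randbelow(1_000_000)

print(raw_key.hex())
print(reset_link)
print(f"{pin:06d}")
```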
See also:
 The Factoring Dead: Preparing for the Cryptopocalypse
 Crypto experts issue a call to arms to avert the cryptopocalypse
 The code monkey's guide to cryptographic hashes for content-based addressing
 A Stick Figure Guide to the Advanced Encryption Standard (AES)
 random.org
 Random Number Bug in Debian Linux
 Petition to Linus Torvalds: Remove RdRand from /dev/random, discussion
 Why secure systems require random numbers
 Javascript Cryptography Considered Harmful
Quantum Computing
Quantum computing has gained a lot of buzz recently, especially the D-Wave stuff. But what is quantum computing, and what are the implications for our daily (internet) life?
Quantum Superposition
Remember Schrödinger's cat? It's often used to illustrate quantum superposition. The cat is in a box, together with poison and a radioactive trigger that has a 50/50 chance of releasing the poison on the poor cat. In the end, is the cat alive or dead? It's not that simple. The poor cat is in a superposition of states: both alive and dead. The moment we take a peek to see how the poor cat's doing, the quantum superposition collapses and the cat falls into one of the two possible states: alive or dead.
That cat analogy doesn't make any sense, right? How could the cat be both alive and dead at the same time before we take a look at it? But it highlights an important feature of quantum superposition: the superposition collapses into one of the possible states the moment we measure it.
Qubit
Ok, the quantum superposition is neat because it can represent multiple states simultaneously. But what's that got to do with quantum computing?
The building block of traditional computing, as we all know very well, is the bit. A bit can represent two states: either 1 or 0. In quantum computing, the building block is the qubit (quantum bit). Because a qubit has the quantum superposition property, it can be in multiple states at the same time; it can contain both 1 and 0 until the moment you try to measure it, at which point it collapses into either 1 or 0. This is truly mind-blowing.
Each qubit can hold both 0 and 1 simultaneously, and each state has its own probability coefficient. To describe a qubit, we need two numbers to store the probability coefficients for 0 and 1. To describe two qubits, we need 4 numbers, and so on: n qubits need 2^n numbers. This illustrates the strength of a quantum computer: we would need a traditional computer capable of storing 2^100 numbers to represent a 100-qubit quantum computer. That means we need millions of yottabytes just to represent a mere 100-qubit quantum computer!
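A quick back-of-the-envelope check in Python shows how fast this blows up. The count of numbers doubles with every added qubit. The storage figure assumes 16 bytes per coefficient (a complex number stored as two 64-bit floats), which is my assumption and not something a real simulator is obliged to use.

```python
def amplitudes_needed(qubits: int) -> int:
    """Number of coefficients needed to describe `qubits` qubits: 2 ** qubits."""
    return 2 ** qubits


BYTES_PER_AMPLITUDE = 16        # assumption: one complex number as two 64-bit floats
YOTTABYTE = 10 ** 24            # bytes

for n in (1, 2, 10, 100):
    count = amplitudes_needed(n)
    size_yb = count * BYTES_PER_AMPLITUDE / YOTTABYTE
    print(f"{n:>3} qubits -> {count} numbers, about {size_yb:.3g} yottabytes")
```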
Quantum Entanglement
Another bizarre phenomenon is quantum entanglement. After a pair of particles interact with each other and are separated, their states remain linked: measuring one particle immediately tells you what a matching measurement on the other will show, no matter how far apart they are. It is as if each particle can sense what happens to its pal and react accordingly, just like in horror movies.
Again, what's that got to do with quantum computing?
In 1994, Peter W. Shor, a researcher at AT&T, found a way to use these quantum effects to find the prime factors of an integer. It turns out to be much faster than anything a traditional computer can do. It is now known as Shor's Algorithm.
In asymmetric-key encryption, the public and private keys must be mathematically related for the encryption algorithm to work. In RSA, for example, the private key can be recovered from the public key by factoring a large number, except that doing so is computationally hard, and virtually impossible (it would take too much time, like billions of years) if the key is sufficiently long.
With a sufficiently big quantum computer, recovering the private key from the public key becomes feasible using Shor's algorithm. A task that could take billions of years could then be accomplished in a couple of years, for instance. That's why cryptography researchers are now scrambling to produce new algorithms and methods in case quantum computers finally become big enough to pose a threat to the cryptography world.
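To see why factoring matters, here is a toy Python sketch that recovers an RSA private key from a public key. It does not implement Shor's algorithm; it uses plain trial division on a classical computer, which only works because the key is laughably small (n = 61 * 53, a textbook example of my choosing). The function names are invented for this post. With a real key, the loop would run for billions of years, and that loop is the part a quantum computer would shortcut.

```python
def smallest_factor(n: int) -> int:
    """Find the smallest prime factor of n by trial division (classical and slow)."""
    if n % 2 == 0:
        return 2
    candidate = 3
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate
        candidate += 2
    return n  # n itself is prime


def recover_private_exponent(n: int, e: int) -> int:
    """Rebuild the RSA private exponent d from the public key (n, e)."""
    p = smallest_factor(n)
    q = n // p
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)      # modular inverse, needs Python 3.8+


# Hypothetical toy public key: n = 61 * 53, e = 17.
n, e = 3233, 17
d = recover_private_exponent(n, e)

cyphertext = pow(65, e, n)          # someone encrypts the number 65 for the key owner
print(d, pow(cyphertext, d, n))     # the attacker's d decrypts it back to 65
```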
Forward Secrecy
Later
Securing Email
There is no doubt that email plays an important role in our internet life. We can't even register for a new account on many websites without email! But how does email actually work?
Simple Mail Transfer Protocol
SMTP (Simple Mail Transfer Protocol) is used by email servers to exchange emails with each other. In fact, OSX and most Linux distributions ship with sendmail, a mail transfer agent. If you're on a Mac or Linux, open a terminal and type the following code to send yourself an email (replace my email address with yours):
Soon, you'll receive an email from <username>@<hostname>. If the command runs successfully but you never receive any email, chances are that:
 Your ISP blocks communication to port 25 to stop spambots
 Your email provider (gmail, yahoo, etc) bans your IP range (possibly due to spambots, damn spambots!)
If you did receive the email, you might not be able to reply to it unless you have configured your hostname properly.
In the above example, my computer acts as an email server and communicates directly with sainsmograf.com's mail server. Note the email server part. If you're using Outlook or Thunderbird to connect to an SMTP server, your computer acts as a user of that server. Here, sendmail acts as an email server delivering email from its user (me) to another email server (sainsmograf.com).
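If you'd rather poke at SMTP from a script, Python's standard library can play the "user of the server" role described above. This is only a sketch: the addresses are placeholders, the helper names are mine, and the sending function assumes a mail server is already listening on localhost port 25 (for example a local sendmail), so the actual send is left commented out.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_local_mta(msg: EmailMessage, host: str = "localhost", port: int = 25) -> None:
    """Hand the message to an SMTP server; assumes one is listening on host:port."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


# Placeholder addresses; replace them with real ones before sending.
message = build_message("me@example.com", "you@example.com", "Hello", "Just testing SMTP.")
print(message)
# send_via_local_mta(message)   # uncomment on a machine with a running mail server
```

Printing the message shows exactly what goes over the wire: a few headers and the body, all in readable plain text.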
Can Somebody Snoop My Email?
By default, sendmail uses an unencrypted protocol. The email sent using the sendmail command in the previous section is not encrypted, and anybody between you and your destination server can easily read it. The good news is that sendmail does support SSL and can encrypt your email messages during transmission. However, sendmail's SSL encryption won't protect your email if the recipient accesses his mailbox via an unencrypted connection (for example, plain old POP3 without SSL).
You may need to encrypt your email message yourself to guarantee that nobody snoops on your email, but how?
Pretty Good Privacy (PGP)
A popular way to encrypt your email messages is using PGP. A widely used implementation of PGP is GNU Privacy Guard (GnuPG, or GPG). If somebody asks you to use PGP/GPG, don't be confused. What he means is that you should use GnuPG to exchange PGP-encrypted messages with him.
After getting GPG installed (OSX, Linux, Windows), let's use it to encrypt our email!
Generate a Private/Public Key Pair
PGP uses asymmetric encryption (discussed above), so the first logical step to encrypt your email is generating your public and private keys. Run the following command from your terminal to generate your key pair:
You'll see the following output:
Select the default by entering 1. You'll be prompted with more questions. Just answer them accordingly. Eventually, GPG will ask you to create a passphrase:
Now, enter a long passphrase, but don't forget it! If you forget your passphrase, there is no way to recover it, and you will never be able to decrypt any message that has been encrypted with your public key. Consider using a good password manager to store your complicated passphrase.
Now that you have your own private and public keys, you should share your public key with everyone! To print your public key, use this command (replace my email address with yours):
And the result:
There, you have my public key. Save that key into a file called public.key and import it using the following command:
Now it's time to send myself a secret email! To encrypt a message, first save the message in a file called plain.txt. Then encrypt the file using the following command:
Replace the sender's email address with your own and the recipient's email address with your recipient's. Make sure to import your recipient's public key first!
The encrypted message will be stored in the plain.txt.asc file:
Now send that file via email to your recipient. Here is how to send the encrypted message using sendmail:
To decrypt the message, save the message content into a file called cipher.txt and run:
It will print out the decrypted message:
You can encrypt binary data too, or use it to encrypt your offsite backup data. Check the man page (man gpg) for the details.
That's just an overview of how PGP works. If you use an email client or install an email client plugin that supports PGP, the process is mostly automatic after you create your public/private key pair. No need to go back and forth to the command line interface!
S/MIME (Secure/Multipurpose Internet Mail Extensions)
S/MIME is almost the same as PGP, except you don't generate your public/private key pair yourself. Instead, you obtain it in the form of a digital certificate from a certificate authority.
When you send an email signed with S/MIME, your recipient automatically gets your public key. Also, if you obtain your certificate from a trusted authority, such as VeriSign, your recipient's email client will automatically validate your message with no manual process involved. And unlike PGP, S/MIME is supported by most email clients.
Simply obtain a certificate from a trusted certificate authority, install it, and you're ready to go! The drawback is that you need to pay to get a certificate. You can get a certificate with one year of validity from VeriSign (about $20).
Next:
Verifying Website Security
Is the website you visit frequently actually secure? Are you sure the website you visit is actually the real website, not some hacker's rig impersonating it?
 EFF has a nice diagram about connection privacy. Let's discuss it!
 Anything without HTTPS is insecure. Don't submit important information over plain HTTP!
How SSL (HTTPS) Works
 The initial handshake uses asymmetric encryption to exchange symmetric keys. Therefore HTTPS requires two round trips to the server. The SPDY protocol solves this (but it's Chrome-only).
 Validation: The connection might be encrypted, but how can you be sure that the guys on the other side of the cable are not impostors? Someone I trust should confirm that I'm indeed not talking to an impostor.
 Forward Secrecy
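As a small illustration of the validation point, here is a hedged Python sketch using the standard `ssl` module. `ssl.create_default_context()` returns a context that checks the server's certificate chain against the system's trusted authorities and also checks the hostname. The helper names are mine, and the network call is commented out since it needs internet access; "example.com" is just a placeholder host.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Context that checks the certificate chain and the hostname."""
    return ssl.create_default_context()


def fetch_certificate_subject(hostname: str, port: int = 443) -> dict:
    """Connect over TLS; raises ssl.SSLCertVerificationError if validation fails."""
    context = make_verifying_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            # "subject" is a tuple of RDNs, each holding (name, value) pairs.
            return dict(rdn[0] for rdn in tls.getpeercert()["subject"])


context = make_verifying_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
# print(fetch_certificate_subject("example.com"))   # needs network access
```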
Deep Net
 We need to go deeper.