Feb 23, 2021 - Category: Cybersecurity

An Introduction To Cryptography

Cryptography is embedded almost everywhere in our daily lives, even if we are not aware of it. It is like a secret agent that works non-stop to protect our data from being exposed, leaked or simply stolen. It protects, for example, the messages we send on most messenger apps and the personal data stored at our bank, and it even helps us connect to the internet. Life would go on without cryptography, but what you ordered for lunch yesterday via your favourite delivery app would certainly no longer be a secret, and neither would how much money you have in your bank account, if any of it were left after a hack.

Before diving deeper into this topic, we need to clarify that cryptocurrency and cryptography are two very different things. A cryptocurrency is a digital currency built on technology that relies on cryptography. Without cryptography, cryptocurrencies would be neither safe nor workable, because they rely on public-key cryptography: a user has a public key (from which the wallet address, normally a long sequence of letters and numbers, is derived) that others use to send money, and a private key that is used to authorize spending it. Cryptography itself, in contrast, is an ancient art. According to Simon Singh, some of the earliest accounts of secret writing come from Herodotus, who chronicled the conflicts between Greece and the Persia of Xerxes roughly 2,500 years ago. The idea of hiding something in plain sight is old, but as science and the mathematics behind it evolved, so did cryptography. It evolved from extremely simple techniques (like writing a message on the shaved head of a messenger and waiting for the hair to grow back to obscure it) to mathematically heavy schemes built on elliptic curves and on encryption that depends on strong randomness.

Today we can generally differentiate between three categories of “secret writing”:

  1. Hashing
  2. Encoding
  3. Encrypting

They differ in their functionality and are each used for very specific cases. Unfortunately, people who lack the necessary knowledge often mix them up or simply use them in the wrong way. Therefore, let's take a look at the main differences between the three:

Encoding is the simplest of the three methods. Its main idea is to transform data so that it can be used by other systems. It is important to say that encoding does not guarantee the safety of a message. To get the original text back, an algorithm needs only the encoded text, because there is no key involved that could prevent the encoding from being reversed. Popular encoding schemes include ASCII, Unicode, URL encoding and Base64. Base64 is often mistaken for a type of encryption, but it is only an encoding and should not be used for security purposes. Here is the Base64 alphabet, excerpted from RFC 4648:

body1.webp
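To make the point about missing keys concrete, here is a minimal Python sketch using the standard library's `base64` module. The function names `encode_text` and `decode_text` are made up for this illustration. Note that decoding needs nothing except the encoded text itself.

```python
import base64


def encode_text(text: str) -> str:
    """Encode text as Base64. No key is involved at any point."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_text(encoded: str) -> str:
    """Reverse the encoding. Anyone who sees the encoded text can do this."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    encoded = encode_text("Hello, Bob!")
    print(encoded)               # looks scrambled, but is not protected
    print(decode_text(encoded))  # the original text, recovered without any secret
```

Because the round trip works for anyone, Base64 is suited to transporting data, not to hiding it.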

Hashing, on the other hand, is a one-way street: a hash function is a mathematical function whose output is always the same for a given input. Unlike an encoding, a hash cannot practically be reversed to recover the original message. In other words, if two messages contain the same data, their hashes will be the same, but if even one character of the message differs, the hash function will generate a totally different output. Nowadays, hashing is widely used to verify the integrity of data. Take the example of Alice, who wants to send a message to Bob. She wants Bob to be able to tell whether the message has been tampered with, so she runs it through a hash function, which generates a hash code, and sends that code along. Once Bob receives the message, he computes the hash himself and checks whether the two hash codes are the same; if they are, the message is the original one. This only protects against deliberate tampering if the hash code itself reaches Bob in a way an attacker cannot alter.
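The Alice and Bob check can be sketched in a few lines of Python with the standard library's `hashlib`. The helper names `sha256_hex` and `is_unmodified` are hypothetical, and the sketch assumes the hash code reaches Bob unaltered.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 hash code of a text message as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def is_unmodified(received_message: str, expected_hash: str) -> bool:
    """Bob's side: hash the received message and compare with Alice's hash code."""
    return sha256_hex(received_message) == expected_hash


if __name__ == "__main__":
    alice_message = "Meet me at noon"
    alice_hash = sha256_hex(alice_message)  # sent along with the message

    print(is_unmodified("Meet me at noon", alice_hash))  # True
    print(is_unmodified("Meet me at nine", alice_hash))  # False
```

Changing a single word produces a completely different hash code, which is what lets Bob notice the modification.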

Another common use for hashing is to store highly sensitive information that doesn't need to be read back in its original form. One scenario is hashing user passwords so that the passwords themselves never have to be stored. When the user enters a password, it is hashed and compared with the hash stored in the database. Because the same input always generates the same output, only the right password should match. Ideally, every input would have its own unique output. However, because a hash function maps inputs of any length to an output of fixed length, two different inputs can produce the same output, creating a so-called collision. That's why the most common attacks performed on hash functions are collision attacks. If an attacker can find a collision, it means that two passwords can authenticate the same user, or that a message can be modified and still generate the same hash. Hash functions that are designed so that finding a collision is computationally infeasible are called cryptographic hash functions.
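A rough sketch of the password scenario in Python follows. It goes a little beyond the text: instead of a single plain hash, it assumes a random per-user salt and the PBKDF2 function from `hashlib`, which is a common way to make stored password hashes harder to guess. The function names and the iteration count are illustrative choices, not recommendations from this post.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative value only; assumption, not a recommendation


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Only these two values would be stored."""
    salt = secrets.token_bytes(16)  # random salt, different for every user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Hash the entered password the same way and compare with the stored digest."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(candidate, stored_digest)


if __name__ == "__main__":
    salt, stored = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, stored))  # True
    print(verify_password("wrong guess", salt, stored))                   # False
```

The database never sees the password itself, only the salt and the digest, and the login check still works because the same input always yields the same output.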

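The best-known hashing algorithms are MD5 and the SHA family (SHA-1, SHA-2, SHA-3). HMAC, which is often mentioned alongside them, is not a hash function itself but a way of combining a hash function with a secret key. SHA-1 and MD5 are no longer recommended, as practical collision attacks against both have been demonstrated. Here is an example of how hashing works using SHA-2: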

body2.webp
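Since HMAC comes up next to the SHA family, here is a small, hedged sketch of how it differs from a plain hash: HMAC combines a hash function (SHA-256 here) with a secret key, so only someone who knows the key can produce a matching tag. The names `make_tag` and `verify_tag` are invented for this example; the calls into `hmac` and `hashlib` are from Python's standard library.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for the message under the given secret key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


if __name__ == "__main__":
    shared_key = b"key known only to Alice and Bob"  # placeholder value
    message = b"Transfer 10 coins to Bob"

    tag = make_tag(shared_key, message)
    print(verify_tag(shared_key, message, tag))                      # True
    print(verify_tag(shared_key, b"Transfer 99 coins to Bob", tag))  # False
    print(verify_tag(b"some other key", message, tag))               # False
```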

**Encryption**, on the flip side, is a two-way street. It is composed of a message, a key, an encryption function and a decryption function. The encryption function turns the message into an encrypted message, and the decryption function takes the encrypted message and turns it back into its original form. The fundamental idea of encryption is to provide confidentiality: using a key guarantees that only the key owner has access to the message.

Going one level deeper, there are two main categories of encryption: symmetric and asymmetric. The difference is that symmetric encryption uses the same key to encrypt and decrypt, while asymmetric encryption uses a public key to encrypt and a private key to decrypt the message. The most common algorithms are AES for symmetric and RSA for asymmetric encryption.
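To illustrate the public/private split, here is a toy "textbook RSA" sketch in Python. It is an assumption-laden teaching example only: the primes are tiny, there is no padding, and the function names are made up, so it must never be used to protect real data. Real systems rely on vetted libraries and keys of 2048 bits or more.

```python
# Toy textbook RSA with tiny, well-known example primes (insecure by design).
p, q = 61, 53
n = p * q                 # public modulus, part of both keys
phi = (p - 1) * (q - 1)   # used only while generating the keys
e = 17                    # public exponent: (e, n) is the public key
d = pow(e, -1, phi)       # private exponent: (d, n) is the private key (Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone can encrypt with the public key (e, n)."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key (d, n) can decrypt."""
    return pow(ciphertext, d, n)


if __name__ == "__main__":
    message = 65  # messages are numbers smaller than n in this toy setup
    ciphertext = encrypt(message)
    print(ciphertext)
    print(decrypt(ciphertext))  # 65
```

The key used to encrypt is different from the key used to decrypt, which is exactly what separates asymmetric from symmetric encryption.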

Here is how symmetric encryption works:

body3.webp
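The same-key idea can also be sketched in code. Python's standard library does not ship AES, so the following swaps in a different technique, a toy one-time pad based on XOR, purely to show one key being used for both directions. The helper names are hypothetical, the key must be random, as long as the message, and never reused, and a real application should use AES through a vetted library instead.

```python
import secrets


def generate_key(length: int) -> bytes:
    """Create a random key with as many bytes as the message."""
    return secrets.token_bytes(length)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte.

    The same function both encrypts and decrypts, because XOR-ing
    twice with the same key restores the original bytes.
    """
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


if __name__ == "__main__":
    plaintext = b"Meet me at noon"
    key = generate_key(len(plaintext))

    ciphertext = xor_bytes(plaintext, key)  # encrypt with the key
    recovered = xor_bytes(ciphertext, key)  # decrypt with the same key

    print(ciphertext.hex())
    print(recovered)  # b'Meet me at noon'
```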

Independently of the algorithm, though, keeping keys safe and making sure they are strong and unique are two of the main goals of modern cryptography. Tools such as key derivation and key management help with this, as do other ways of generating keys, for example via elliptic curves.

But these are topics for future posts, one step at a time.