We all depend on cryptography at least indirectly: it protects online banking activity, credit card information, passwords, and other sensitive data. It is one of the most crucial facets of cybersecurity, but it is also a complex topic, and navigating it as a beginner can be unpleasant.
This guide is for people who want to learn the basics of cryptography: what it is for, how to identify which kind a project may need, or simply what you’re working with. It is not intended to teach you how to implement cryptography.
Cryptography is the science of securing data, most visibly by converting it into unreadable formats. However, it isn’t always used to protect private data. It is also often used to generate unique codes that identify records and to verify the authenticity of devices and apps, among other uses. There are many different algorithms and implementations. Two of the most widely used techniques are hashing and encryption.
Encryption
Encryption is the act of converting data into an unreadable format with the intent to convert it back into a readable format at some point. It is used to protect stored data, such as databases, from prying eyes such as credential thieves. It is also used to protect data in transit.
For example, your credit card number is encrypted before it is sent to an online store you’re ordering from. The encrypted form is useless to anyone who intercepts it, and it is decrypted only when it arrives at its destination.
Common encryption algorithms include, but are not limited to:
- AES-128.
- AES-192 (longer key, stronger).
- AES-256 (longest key, strongest of the three).
- RSA-2048 (often used for encrypting keys instead of user data itself).
Encryption is commonly used to protect:
- E-mails.
- Instant messages.
- Databases.
- Passwords and other credentials such as card and account numbers.
- KYC data such as ID numbers, IDs themselves, and more.
- Wi-Fi connections.
- Transmissions over the Internet in general (TLS, the successor to SSL, although not all websites use it).
- VPN connections.
Encryption can use a single key shared by both parties (symmetric encryption, as with AES) or a pair of keys (as with RSA). In a key pair, one is a private key and the other is a public key. When the public key is used to encrypt data and the private key is required to decrypt it, this is called public-key encryption. The private key should only be in the possession of its owner, while the public key can be revealed. You should never share a private key with anyone, because doing so compromises security.
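To make the key-pair idea concrete, here is a toy sketch of textbook RSA arithmetic in Python. The tiny primes, the function names, and the absence of padding are illustrative assumptions. Real systems use keys of 2048 bits or more through a vetted library, never code like this.

```python
from typing import Tuple

# Toy textbook RSA with tiny primes: for illustration only, NOT secure.
p, q = 61, 53                # secret primes (real ones are hundreds of digits long)
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)      # only needed while generating the key pair
e = 17                       # public exponent, chosen coprime to phi
d = pow(e, -1, phi)          # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)          # safe to reveal
private_key = (d, n)         # must stay with the owner


def encrypt(message: int, key: Tuple[int, int]) -> int:
    """Encrypt an integer smaller than n using the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: Tuple[int, int]) -> int:
    """Recover the original integer using the matching private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


ciphertext = encrypt(65, public_key)
recovered = decrypt(ciphertext, private_key)
print(ciphertext != 65, recovered == 65)
```

Anyone can run `encrypt` with the public key, but only the holder of `private_key` can undo it, which is the property the paragraph above describes.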
Encryption can also be used to verify the authenticity of a sender (of an e-mail, IM, or HTTP request). For example, if only an app user and the server have the user’s key, then the server can only decrypt a request successfully if it was encrypted with that key, which means the legitimate owner must have sent it.
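One common way to check that a request came from someone holding a shared secret is a keyed hash (HMAC), which is a different technique from encryption but relies on the same "only the two of us have the key" idea. The sketch below uses Python's standard `hmac` module. The key handling, the request body, and the function names are assumptions made for illustration, not a description of any particular app or server.

```python
import hashlib
import hmac
import secrets

# Hypothetical shared secret known only to the app user and the server.
shared_key = secrets.token_bytes(32)


def sign_request(key: bytes, body: bytes) -> str:
    """Return a hex HMAC-SHA256 tag that the sender attaches to the request."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_request(key: bytes, body: bytes, tag: str) -> bool:
    """Recompute the tag on the receiving side and compare in constant time."""
    expected = sign_request(key, body)
    return hmac.compare_digest(expected, tag)


# Hypothetical request body, used only to exercise the two functions.
body = b'{"action": "transfer", "amount": 25}'
tag = sign_request(shared_key, body)
print(verify_request(shared_key, body, tag))
```

A request that was altered in transit, or signed with a different key, would fail `verify_request`.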
Hashing
Hashing is a cryptographic technique that generates a fixed-length string of unreadable characters from the data provided. That string is called a hash and is often used for checksums and to verify the authenticity of data or apps. Hashing differs from encryption in that hashes are not meant to be decrypted.
The point of a hash in the case of checksums is to ensure data integrity. Common hashing algorithms such as SHA-256 use one-way compression functions to generate a string of characters that is, for practical purposes, unique to the data provided. If they were reversible, a thief who obtained the hash of sensitive information could recover that information (not good!).
A common example: you download a file and the source provides an MD5 or SHA-256 checksum so you can verify that the file was not corrupted during the download. You do this by running the same hashing algorithm on the file you downloaded and checking that the result matches the published checksum. A hash may also be used to verify the authenticity of an HTTP request or even a person’s KYC data.
Hashing may also be used to protect passwords stored in databases. A ‘salt’ (a random value) may be added to a string before hashing it, so that identical passwords produce different hashes. This process (salting) makes it harder to guess the original string using precomputed tables of common passwords. Note that not all algorithms are suitable for hashing passwords! Fast general-purpose algorithms such as MD5 and SHA-256 are poor choices on their own, while deliberately slow ones such as scrypt are designed for this purpose.
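The download check described above can be sketched with Python's standard `hashlib` module. The function names and the chunk size are assumptions chosen for illustration. In practice the published checksum is whatever the download source lists.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so a large download need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare the local digest with the checksum published by the source."""
    local = sha256_of_file(path)
    return hmac.compare_digest(local, published_hex.strip().lower())
```

If the function returns `False`, the local copy differs from the file the source hashed, whether through corruption or tampering.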
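As a minimal sketch of salted password hashing, the example below uses `hashlib.scrypt` from Python's standard library (available when Python is built against OpenSSL 1.1 or newer). The cost parameters, salt length, and function names are assumptions for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Illustrative scrypt cost parameters (assumed values, not a recommendation).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both alongside the user record."""
    salt = secrets.token_bytes(16)  # fresh random salt for every password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)
```

Because each call to `hash_password` draws a new salt, two users with the same password end up with different stored digests.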
Common hashing algorithms include:
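- MD5 (Message-Digest Algorithm 5; no longer considered secure against deliberate tampering).
- SHA-256 (Secure Hash Algorithm, 256-bit).
- SHA-512.
- scrypt (designed for hashing passwords).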
- BLAKE2b.
Example of a hash: a person’s info was stored in a text file, and the file was hashed using SHA-256 (in this case, with the ‘sha256sum’ command in Linux).
Name: Jane Doe.
ID Number: 1456948.
Address: 42 Beetle Drive, New Hampshire.
The SHA-256 hash of that is: 619f46010f6f58efa985d90e37e8b5130e8a809459da63715228ac92da9920db
You can store that hash somewhere safe and then change or delete a portion of Jane’s information to simulate data corruption. If you run the hashing algorithm on that ‘corrupted’ data again (even if you changed only one letter), the resulting hash will be different from the first. This is why hashing is useful for determining whether someone tampered with a person’s records or account, or whether the data was corrupted by faulty software or equipment.
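The corruption experiment can be reproduced in a few lines of Python. The record text below is typed in by hand, so its exact bytes (line endings, trailing newline) are an assumption and the digests it prints need not equal the hash quoted above. The point is only that the two digests differ.

```python
import hashlib

# Hand-typed copy of the record; exact bytes of the original file are unknown.
record = (
    "Name: Jane Doe.\n"
    "ID Number: 1456948.\n"
    "Address: 42 Beetle Drive, New Hampshire.\n"
)
# Simulate corruption by changing a single digit of the ID number.
corrupted = record.replace("1456948", "1456949")


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original_hash = sha256_hex(record)
corrupted_hash = sha256_hex(corrupted)
print(original_hash)
print(corrupted_hash)
print("unchanged" if original_hash == corrupted_hash else "data changed")
```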
Generally, you can implement encryption or hashing fairly easily using libraries available for various programming and scripting languages. A few examples are:
- The built-in crypto module provides both encryption and hashing in Node.js.
- Libsodium provides both encryption and hashing in C, with bindings for C++, Node.js, and other languages.
- Libhydrogen provides both encryption and hashing in C (useful on embedded systems).
- PyNaCl provides encryption and hashing in Python.
- wolfSSL provides cryptography for embedded systems such as microcontrollers, routers, IoT devices, and more.