Showing posts with label Theoretical Computer Science. Show all posts
Showing posts with label Theoretical Computer Science. Show all posts

Sunday, 6 June 2021

Looking Deeper Into the World of Cryptography

Although the words cryptology, cryptography and cryptanalysis are used interchangeably — strictly speaking — they mean different things. Nowadays, practically we only use the word cryptography for everything. Today’s post not only aims to point out the differences among them but also to show their connections to each other.

To begin with, cryptology is the mathematics, algorithms, and the applications of formulas that underpins cryptography and cryptanalysis. The world of cryptology goes from basic foundations in cryptography (code-making) to modern algebraic cryptanalysis (code-breaking). So, cryptology is clearly divided into two major parts: cryptography and cryptanalysis, and there are strong connections to each other. Basically, the most important topics related to each other include cryptographic applications, types of cryptography and their algorithms, code-breaking techniques, information theory, number theory and mathematical applications to encrypt data and also break ciphers.
Looking Deeper Into the World of Cryptography

To illustrate, some main cryptography topics are about legal and ethical issues, types of cryptographic algorithms, cryptosystem components, cryptographic applications, classical cryptographic algorithms and their methods of breaking, number theory and information theory; whereas cryptanalysis include code-breaking algorithms for block ciphers, stream ciphers, hash functions and message authentication codes, and finally goes a bit deeper into algebraic cryptanalysis on modern approach such as algebraic coding and optimization, algebraic attacks, polynomial systems and satisfiability, Boolean functions, and algorithms for solving polynomial systems.

Note that there a lot of research being undertaken out there about post-quantum computing and its use for code-breaking. By using post-quantum computing will be highly likely possible to break any modern cypher, but there is no proof of that yet. The best is yet to come!

Cryptography has raised many human rights issues, ethical issues and also legal issues due to the fact that it can be used for good purposes but also for bad ones. You can read more about this on this post: Cryptography is Everywhere.

With respect to classical algorithms, some can be done by hand allowing to get a a better understanding of what is involved, such as shift cipher, substitution cipher, Vigenère cipher and permutation cipher. Most of them are outdated, however, many ideas are still used as components of modern ciphers and also many of methods of breaking these ciphers are valid for modern ciphers as well, for instance, exhaustive key search, letter frequency and other statistical properties of the text, and known/guessed plaintext.

It is widely agreed that cryptographic algorithms are basically classified into two big categories depending on the number of keys that are used for encryption and decryption, that is, symmetric cryptography and public-key (asymmetric) cryptography. The first one uses only one secret key for encryption and decryption, whereas the latter uses two keys, a public key for encryption and a private key for decryption. This aspect is particularly important when it comes to working on breaking cryptographic algorithms because it is assumed some things in terms of what is available to the attacker such as the algorithm in use and whether the key is the same or not.

Similarly, symmetric cryptography can be classified into two parts: block ciphers and stream ciphers. They mostly recommend using the block cipher AES, and explains the uses of stream cipher in different applications. For example, RC4 for WEP, WAP, SSL, TLS1.2, A5 for GSM mobile phones SNOW and ZUC for 4G LTE, ChaCha20 for TLS1.3. Some big names in cryptography such as Golbreich and Stallings clearly mention in their books the importance of having a good understanding of the different operation modes of symmetric cryptography such as ECB, CBC, OFB, CFB, CTR, and GCM.

Block ciphers and stream ciphers have to face multiple types of attacks such as differential attacks and algebraic attacks, respectively. Also, several works have been done to prevent differential, the cube and the AIDA attacks, many other types of attacks, the vast majority of which are painstakingly detailed in Antoine Joux’s book “Algorithm Cryptanalysis”.

Stream ciphers have strongly relationship with certain concepts such as one-time pad, perfect secrecy, pseudo-random numbers generators, linear feedback, theory about shift registers (LFSR), NLFSR, shrinking generator, T-function, and IV. There is a special book that presents these aspects in a very detailed manner such as Stallings’ book in which he goes much deeper (Stallings, 2017).

Besides, hash functions and message authentication codes (MAC) are also crucial in cryptography. They are used in tandem with symmetric and public-key cryptography to achieve confidentiality, authentication, integrity, non-repudiation. In fact, Douglas Stinson in his book “Cryptography: Theory and Practice” explains clearly and concisely about hash functions and MAC, giving theorems and algorithms along with other related-mathematical applications.

Talking of public-key cryptography, algorithms in this area are classified depending on the number theory they use. For instance, RSA uses integer factorization whereas Diffie Hellman (DH) uses discrete logarithm and Elliptic Curve (ECC) is based on elliptic curves over finite fields. Actually, Antoine Joux explains RSA, DH and ECC from the mathematical and algorithmic cryptanalysis perspective (Joux, 2009). In addition, it is vital to have a very good understanding of number theory when working with public-key cryptography because the implementation of this type of cryptosystem is entirely based number theory, so, for example, breaking RSA is all about finding solutions for integer factorization.

With regard to cryptanalysis, it is all about understanding cryptographic algorithms and finding solutions to break them. In this area, the most famous techniques to attack all the types of ciphers go from the very classical approach to the most sophisticated and modern techniques which make use of algebraic cryptanalysis and mostly based on Polynomial Systems (especially Multivariate Quadratic), Algebraic coding and optimization.

Essentially, algebraic attack is based on the main idea of finding and solving a system of multivariate polynomial equations over a finite field, for example, some algebraic attacks have been successfully carried out against various stream ciphers based on LFSRs. In addition, Gregory V. Bard in his book “Algebraic Cryptanalysis”, details all the mathematical aspects behind it, particularly topics such as Gröbner bases algorithms, linearization, the XL algorithm, complexity calculation, converting ANF to CNF, NP-Complete problem, MQ Problem, linear algebra over GF(2), Boolean matrices, and GF(2)-Matrix Operations.

It's all for now. I do hope to have cleared some things up. If you have any questions please do let me know in the comment section below. Have a good day!

Monday, 26 April 2021

Curious About Cryptographic Boolean Functions?

In order to have a good understanding of cryptographic Boolean functions, let's get started from scratch, that's, having a look at the very basic concepts. To begin with, all modern computers are composed of very basic logic circuits using very basic gates, called operators which only apply to binary numbers, in other words, 0 and 1. Each type of gate implements a Boolean operation. The finite field ${\mathbb F}_{2}=\{0,1\}$ is also called binary field, and it is of special interest because it is particularly efficient for implementation in hardware or on a binary computer. Using these gates, the rules of Boolean algebra may be applied to design circuits that perform a variety of tasks. For example, integrated circuits. Then, these circuits are all put together to build into more powerful modern computers. 

The logic gate exclusive-OR
Primarily, the minimum units of electronic devices are called cells. Each cell can have two status: high voltage and low voltage. High voltage stands for TRUE Boolean value and is represented by binary value 1, whereas low voltage stands for FALSE Boolean value and is represented by binary value 0. There are three fundamental logic gates or operations on binary numbers such as OR, AND and NOT, to connect the cells in terms of their assigned values and to create different sorts of logic circuits. Composite logic gates can be built and expressed in terms of the above-mentioned ones, and in this sense, there is a very particular composite logic gate that is heavily used in cryptography, this is the logic gate exclusive-or (XOR for short), denoted by $\oplus$, which is expressed as $A\oplus B=(A\lor B)\land(\bar{A} \lor \bar{B})$. 

Interestingly, the final outcome of $A \oplus B$ coincides with the modulo $2$ addition of the binary values of $A$ and $B$, over the finite field ${\mathbb F}_{2}=\{0,1\}$. Also, the logic gate OR also coincides with the multiplication operation over the finite field ${\mathbb F}_{2}$. Note that the basic logic gates $OR$ and $NOT$ can also be represented using the logic gates $XOR$ and $AND$, i.e., $A \lor B=A \oplus B \oplus (A \land B)$; $\bar{A}=A \oplus 1$. As a result, all these facts lead us to work using functions of logic circuits expressed in terms of $XOR$ and $AND$ operations, called Boolean functions, and the logic cells are represented by something called Boolean variables. Also, these two Boolean operations can be treated as operations over the binary field ${\mathbb F}_{2}$ and many of the results from finite fields can be used. 

Boolean functions are widely used in cryptography and cryptanalysis, as well as in many other areas. Particularly, Boolean functions are a core component in many stream ciphers, and any potential threat or attack to one of such models will often lead to the attacks to other models. For instance, they are used to develop pseudo-random generators in stream ciphers, S-boxes in block ciphers, and also to build error-correcting codes like Reed-Muller codes and Kerdock codes.

What exactly is a Boolean function?
Definition. A Boolean function $f$ in $n$ variables is a map from ${\mathbb F}_{2}^{n}$ to ${\mathbb F}_{2}$. The $(0,1)-sequence$ defined by $(f(x_0 ),f(x_1 ),… ,f(x_(2^n-1) )) $ is called the truth table of  $f$, where $x_0= (0,…,0,0),x_1  = (0,…,0,1),…,x_{2^n-1}  = (1,…,1,1)$, ordered by lexicographical order. The algebra of all Boolean functions on ${\mathbb F}_2^n$ will be denoted by $B_n$ (Stanica 2017).

The function $f:{\mathbb F}_2^n\rightarrow {\mathbb F}_2$ is called a Boolean function in $n$ variables, where ${\mathbb F}_2^n$ is the n-dimensional vector space over the binary field ${\mathbb F}_2$. The function $f$ is written as $f(x)=f(x_1,x_2,x_3,…,x_n )$, that is, $x$ is a vector $(x_1,x_2,x_3,…,x_n )$. The vector made of all the outputs of $f(x)$ is called the truth table of $f(x)$, which has dimension $2^n$.

The possible values in ${\mathbb F}_2^n$ for the binary vector $x$ can be ordered as the binary representation of an integer, in other words, $x=(x_1,x_2,x_3,…,x_n )=\sum_{i=1}^{n} x_i 2^{n-1}$, the integer goes from $0$ to $2^n-1$, making the corresponding vector goes through all the elements in ${\mathbb F}_2^n$. Note that Boolean variables $x=(x_1,x_2,x_3,…,x_n )$ can be treated as probabilistic variables which take random values from ${\mathbb F}_2^n$ equally likely with uniform probability distribution. As a result, each $x_i$ is an independent variable over $\{0,1\}$, and a Boolean function $f(x)$ is also treated as a function in random variables.

There are arithmetical operations on Boolean functions such as the addition of $f(x)\oplus g(x)$ which is the XOR operation of their corresponding outputs, and also the multiplication $f(x)g(x)$ which is the multiplication of their corresponding outputs.

The function $f(x)$ is called an affine function if there exist $a_0,a_1,…,a_n \in {\mathbb F}_2^n$ such that $f(x)=a_0\oplus a_1 x_1\oplus …\oplus a_n x_n$, where $\oplus$ means modulo $2$ addition. Particularly, if $a_0  = 0$, it is called a linear function. If the set of all Boolean functions in $n$ variables is denoted by $B_n$, the set of linear ones by $L_n$, and $A_n$ is the set of affine ones, then we say that $L_n\subset A_n \subset B_n$.

That's all for the time being, thanks for reading again!

Friday, 5 March 2021

Do You Want to Be a Cryptographer?

Alan Turing
Alan Turing
I've always been interested in information security since 2003, but it wasn't until I enrolled on the cryptography module — while studying for a Master in Advanced Computer Science in England in 2018— that I started getting more keen on the mathematical side of cryptography; perhaps it was in part because I cherish mathematics and had the right cryptography teacher. What's more, I admire Alan Turing since he strove to do good work in difficult conditions during World War II; his work saved millions of lives. All of these things together inspired me to immerse myself in the world of cryptography.