Thank you to everyone who played DownUnderCTF 2021. I hope you enjoyed the challenges :)

Here are writeups for the hard crypto challenges I wrote.

All of the challenge files and solve scripts can be found here.

Challenge	Tags	Solves
yadlp	`crypto`	14
power sign	`crypto`	14
1337crypt v2	`crypto`	3
Substitution Cipher III	`crypto`	1

yadlp

This challenge follows a similar structure to a commonly seen style of crypto CTF challenges that involve solving the discrete logarithm problem in some group other than the usual $(\mathbb{Z}/p\mathbb{Z})^\times$ . The general approach is to figure out some information about what the group elements are like, and then try to find an isomorphism from the group to a group where we can easily solve the discrete logarithm problem.

Group Structure Part I

We'll see that the group we are dealing with in the challenge is the set of points on some hyperbola over $\mathbb{F}_p$ . Although the operations defined in the handout code are called "addition" and "multiplication", it'll be helpful to use multiplication and exponentiation notation. When we say "elements", we usually mean points on the hyperbola, which are represented as tuples $(x, y) \in \mathbb{F}_p \times \mathbb{F}_p$ with addition operation defined as:

(x_1, y_1) \cdot (x_2, y_2) = (x_1 x_2 + D y_1 y_2, x_1 y_2 + x_2 y_1 + 2 y_1 y_2)

The rand_element function gives us even more information about elements in this group. In fact, we can use it to get an explicit relation that all elements satisfy. The function first generates a random $x \in \mathbb{F}_p$ , then computes $y$ as

y = \frac{x + \sqrt{x^2 (D+1) - D}}{D}

Rearranging, we get

\begin{aligned} (Dy - x)^2 &= x^2(D+1) - D \\ \implies D^2 y^2 - 2Dxy + x^2 &= Dx^2 + x^2 - D \\ \implies Dx^2 + 2Dxy - D^2 y^2 &= D \\ \implies x^2 + 2xy - Dy^2 &\equiv 1 \pmod p \qquad \text{since $D \neq 0$} \end{aligned}

It can be checked that the set of solutions to this equation, equipped with addition operation above indeed does form a group. We'll call this group $\mathcal{H}$ .

Diophantine equations of the form $x^2 - Dy^2 = 1$ are known as Pell equations and are well documented in the literature. We'll see later that the equation we found from the rand_element can be written as a Pell equation, and use that for analysing the group even more.

Challenge Analysis

For now we'll take a look at how the flag is hidden and what we'll need to recover it. The 48 byte flag is broken up into 6 blocks of 8 bytes each. These blocks are represented as the integers $m_1, \ldots, m_6$ . Then, for some randomly generated elements $g_1, \ldots, g_6$ , the value $c = g_1^{m_1} \cdots g_6^{m_6}$ is computed.

Given $c$ and the $g_i$ , the goal is to recover the $m_i$ . This is known as the t-multiple discrete logarithm problem, and in this case, we can solve it by solving the discrete logarithm problem in $\mathcal{H}$ , combined with lattice techniques.

Let $q$ be the order of $\mathcal{H}$ and let $g$ be a generator of $\mathcal{H}$ (we will see later that $\mathcal{H}$ is cyclic). Define the discrete logarithm to the base $g$ as $\log_g : \mathcal{H} \rightarrow \mathbb{Z}_q$ , that, for any $h \in \mathcal{H}$ , gives us a value $x = \log_g(h)$ such that $g^x = h$ . It can be shown that this map is a homomorphism, so $\log_g(h_1 h_2) = \log_g(h_1) + \log_g(h_2)$ . Therefore,

\begin{aligned} \log_g(c) &= \log_g(g_1^{m_1} \cdots g_6^{m_6}) \\ \log_g(c) &= \log_g(g_1^{m_1}) + \cdots + \log_g(g_6^{m_6}) \\ \log_g(c) &= m_1\log_g(g_1) + \cdots + m_6 \log_g(g_6) \\ \end{aligned}

Since we're working in $\mathbb{Z}_q$ , we can equivalently write

\log_g(c) \equiv m_1\log_g(g_1) + \cdots + m_6\log_g(g_6) \pmod q

But since the $m_i$ are relatively small (~62 bits) compared to $q$ (which we will determine later), we can solve for the $m_i$ using lattice techniques. Specifically, consider the lattice generated by the rows of the matrix

\begin{bmatrix} \log_g(g_1) & 1 \\ \log_g(g_2) & & 1 \\ \vdots & & & \ddots \\ \log_g(g_6) & & & & 1 \\ \log_g(c) & & & & & 1 \\ q \end{bmatrix}

Notice that $m_1$ times the first row, plus $m_2$ times the second row, etc. take $1$ times the seventh row, and an appropriate multiple of the last row, gives the short vector

(0, 2^{62}, 2^{62}, 2^{62}, 2^{62}, 2^{62}, 2^{62}, -1)

which can be found with lattice basis reduction algorithms such as LLL.

Group Structure Part II

Group Order

All we need left to solve the challenge is a way to solve the discrete logarithm problem in $\mathcal{H}$ . Determining the group order is a good first step as we could use it to try and see if there is an isomorphism between $\mathcal{H}$ and an additive group of integers of the same order where the discrete logarithm is trivially solved by division. However, it turns out that the order of $\mathcal{H}$ is $p+1$ and that $\mathcal{H}$ is isomorphic to the cyclic subgroup of $\mathbb{F}_{p^2}$ of order $p+1$ . We follow a similar approach to [1] for the proof.

Claim. Let $\mathcal{H} \subset \mathbb{F}_p \times \mathbb{F}_p$ be the set of solutions to the equation

x^2 + 2xy - Dy^2 \equiv 1 \pmod p

Assume that $\left(\dfrac{D+1}{p}\right) = -1$ (this is true for the challenge parameters). Then, $\mathcal{H} \cong S$ where $S \leq \mathbb{F}_{p^2}$ is the cyclic subgroup of $\mathbb{F}_{p^2}$ of order $p+1$ .

Proof. First, note that $(x+y)^2 - (D+1)y^2 = x^2 + 2xy - Dy^2$ , so it suffices to show that the set of solutions to the equation $(x+y)^2 - (D+1)y^2 \equiv 1 \pmod p$ (equipped with the addition operation as defined above) is isomorphic to $S$ . Let $f(W) = W^2 - (D+1)$ . Note that $f(W)$ is irreducible as it is of degree 2 and has no roots, since $D+1$ has no square in $\mathbb{F}_p$ by assumption. Therefore $\mathbb{F}_{p^2} \cong \mathbb{F}_p[W]/\langle f(W) \rangle$ . Now, if $\alpha \in S$ , then $\alpha^{p+1} = 1$ (since $S$ has order $p+1$ ). But we can write $\alpha = r + sW$ for $r,s \in \mathbb{F}_p$ . So,

\begin{aligned} \alpha^{p+1} &= (r + sW)^p (r + sW) \\ &= (r^p + s^pW^p)(r + sW) &&\quad (x+y)^p = x^p + y^p \text{ for all } x, y \in \mathbb{F}_p \\ &= (r + sW^p)(r + sW) &&\quad x^{p-1} = 1 \text{ for all } x \in \mathbb{F}_p \\ &= (r - sW)(r + sW) &&\quad W^p = W(W^2)^{\frac{p-1}{2}} = W(D+1)^{\frac{p-1}{2}} = -W \\ &= r^2 - s^2 W^2 \\ &= r^2 - (D+1)s^2 &&\quad W^2 = D+1 \end{aligned}

So, if we take $r = x+y$ and $s = y$ , then we see that $\alpha^{p+1} = 1 = (x+y)^2 - (D+1)y^2$ . Therefore, we have the bijection $\varphi : \mathcal{H} \rightarrow S, (x, y) \mapsto (x+y) + yW$ . To see that this is a homomorphism, let $(x_1, y_1), (x_2, y_2) \in \mathcal{H}$ . Then

\begin{aligned} \varphi((x_1, y_1) \cdot (x_2, y_2)) &= \varphi((x_1 x_2 + D y_1 y_2, x_1 y_2 + x_2 y_1 + 2 y_1 y_2)) \\ &= (x_1 x_2 + x_1 y_2 + x_2 y_1 + (D+2) y_1 y_2) + (x_1 y_2 + x_2 y_1 + 2 y_1 y_2) W \end{aligned}

But

\begin{aligned} \varphi((x_1, y_1)) \varphi((x_2, y_2)) &= ((x_1 + y_1) + y_1 W)((x_2 + y_2) + y_2 W) \\ &= (x_1 + y_1)(x_2 + y_2) + (x_1 + y_1) y_2 W + (x_2 + y_2) y_1 W + y_1 y_2 W^2 \\ &= (x_1 x_2 + x_1 y_2 + x_2 y_1 + y_1 y_2) + (x_1 y_2 + x_2 y_1 + 2 y_1 y_2) W + (D+1) y_1 y_2 \\ &= (x_1 x_2 + x_1 y_2 + x_2 y_1 + (D+2) y_1 y_2) + (x_1 y_2 + x_2 y_1 + 2 y_1 y_2) W \end{aligned}

So we have $\varphi((x_1, y_1) \cdot (x_2, y_2)) = \varphi((x_1, y_1)) \varphi((x_2, y_2))$ .

Therefore $\varphi$ is an isomorphism, and $\mathcal{H} \cong S$ .

If we take a look at the actual values in the challenge, we'll see that $p+1$ is smooth, with the largest factor being 32 bits! This is a hint that we're on the right path, since we'll be able to solve the discrete logarithm problem in a reasonable time with Pohlig-Hellman.

Inverse Elements

By this point, we've pretty much already solved the challenge, but as a bonus, we can find the inverse elements of $\mathcal{H}$ . Although the solution script doesn't do this, this has the advantage of speed over using the isomorphism found above since the operations are slightly faster (than operations in $\mathbb{F}_{p^2}$ ). Recall the addition formula

(x, y) \cdot (x', y') = (xx' + Dyy', xy' + x'y + 2yy')

The identity element is clearly $(1, 0)$ , so to find the inverse element of $(x, y)$ , we want to find $(x', y')$ such that $(x, y) \cdot (x', y') = (1, 0)$ . Let $(x', y') = (x + 2y, -y)$ . Then

\begin{aligned} (x, y) \cdot (x', y') &= (x, y) \cdot (x + 2y, -y) \\ &= (x(x + 2y) - Dy^2, -xy + (x + 2y)y - 2y^2) \\ &= (x^2 + 2xy - Dy^2, -xy + xy + 2y^2 - 2y^2) \\ &= (1, 0) \end{aligned}

So $(x, y)^{-1} = (x + 2y, -y)$ .

Another advantage of figuring out inverse elements is that you can solve the challenge without finding the isomorphism to $\mathbb{F}_{p^2}$ by guessing that the order is $p+1$ since it is smooth, and then implementing Pohlig-Hellman and BSGS to use the addition, multiplication and inverse operations.

References

[1] Elliptic Curve Public Key Cryptosystems - Menezes 93

power sign

Challenge Overview

We are given the code running on a server that implements a signature scheme. We can ask the server to sign one message, then we are challenged to forge a signature for a random message.

The signature scheme resembles the Rabin signature algorithm. The public key is an RSA modulus $N = pq$ whose prime factorisation is the private key. A message $M$ is signed by first computing a randomised hash $H(M, u)$ (where $u$ is random) of $M$ . If $H(M, u)$ has a square root modulo $N$ , a square root $x$ modulo $N$ is computed and the signature $(x, u)$ is outputted. Otherwise, we try again with a different $u$ . Note that computing square roots modulo $N$ , or even determining whether a number has a square root modulo $N$ is equivalent to factoring $N$ . To verify a signature $(x, u)$ for the message $M$ , we simply check that the equality $x^2 \equiv H(m, u) \pmod N$ holds.

In the challenge, the primes generated are of the form $p \equiv 3 \pmod 4$ . This is done because if $p$ is a prime such that $p \equiv 3 \pmod 4$ , then if a square root of $a \in \{ 0, \ldots, p-1 \}$ exists, it can be easily computed as $\pm a^{\frac{p+1}{4}} \mod p$ . To compute a square root of $c$ modulo $N$ , we compute square roots of $c$ modulo $p$ and modulo $q$ , then combine them with the Chinese Remainder Theorem.

Security of Rabin Signatures

Before we proceed, we'll take a look at an easy attack against a simplified variant of the Rabin signature algorithm. Specifically, we consider a variant that does not use a hash function at all, i.e. to sign a message $M$ the signer simply computes a square root of $M$ modulo $N$ . For simplicity, we assume that all the messages to be signed actually do have square roots modulo $N$ , though it does not really matter. There is one trivial attack; anyone can "forge" a valid signature for $M$ if $M$ is a perfect square. For example, if $M = 9$ , then $x = 3$ is a valid signature since $x^2 \equiv m \pmod N$ .

More interestingly however, it turns out that we can recover the private key given access to a signing oracle. We do this by choosing a random $x < N$ and ask the oracle to sign $x^2 \mod N$ . It returns a square root $y$ of $x^2$ modulo $N$ . Now, if $y \neq \pm x \pmod N$ , then we have

\begin{aligned} x^2 &\equiv y^2 \pmod N \\ \implies x^2 - y^2 &\equiv 0 \pmod N \\ \implies x^2 - y^2 &= kN \quad \text{for some $k$} \\ \implies (x-y)(x+y) &= kN \end{aligned}

so $\gcd(x-y, N)$ reveals a nontrivial factor of $N$ .

The Hash Function

This section and the next are somewhat algebra-heavy and basic results from algebra are used without proof for brevity. It may be worthwhile to read up on finite fields, field extensions and Galois theory (for later) if they are unfamiliar concepts.

We learned the importance of a good, randomised hash function in the previous section. And more importantly, we learned that a hash function which is simply the identity map (i.e. sends any input to itself), is completely insecure. We will now take a look at the hash function in the challenge.

Choose a composite integer $n$ and a proper divisor $m$ (in the challenge we have $n = 15, m = 3$ ). Let $r$ be the smallest prime number following $N$ , where $N$ is the signer's public key. We will work in an extension field $K = \mathbb{F}_{r^n} \cong \mathbb{F}_r[x]/(f)$ of $\mathbb{F}_r$ , where $f \in \mathbb{F}_r[x]$ is a public degree $n$ irreducible polynomial. Let $z = x + (f)$ . Then $z$ generates $K$ and $\{ 1, z, z^2, \ldots, z^{n-1} \}$ is basis for $K$ when viewed as an $n$ -dimensional $\mathbb{F}_r$ -vector space. That is, elements in $K$ can be written in the form $a_0 + a_1 z + \cdots + a_{n-1} z^{n-1}$ where $a_i \in \mathbb{F}_r$ .

The randomised hash function $H$ takes as input a message $M$ and an integer $u$ . Write $M$ in terms of powers of $r$ :

M = M_0 + M_1 r + M_2 r^2 + \cdots + M_k r^k

where $M_i < r$ . Then, $M$ is converted to an element $h$ in $K$ by computing

h = M_k + M_{k-1} z + M_{k-2} z^2 + \cdots + M_0 z^k

To obtain the hash, the function computes $(h + uz)^{r^m}$ which we write as

(h + uz)^{r^m} = a_0 + a_1 z + \cdots + a_{n-1} z^{n-1}

The output is $a_0$ .

Choosing a Message to be Signed

After the server provides us with its public key, it prompts us to send a message to be signed. Note that (in the sign function) $u$ isn't chosen randomly; it starts at $1$ and increments until $H(M, u)$ is a square. There is also a peculiar restriction on the message; it has to be larger than $N^m$ and smaller than $N^n$ .

If we were able to send messages of any size for the server to sign, we can easily find a message that will help us to recover the private key. Specifically, we would choose a random $s < N$ and send the message $(r - 1) + (s^2 \mod N) r$ . The hash function will convert this to $h = (s^2 \mod N) + (r - 1)z$ and output the constant term in $(h + uz)^{r^m}$ which happens to just be $s^2 \mod N$ since all elements $a \in \mathbb{F}_r$ satisfy $a^r = a$ by Fermat's Little Theorem. So when the server signs this, we have the exact same situation as the attack described two sections ago. However, the size check prevents this attack.

Recall that the identity map is insecure as a hash function for the reasons given two sections ago. It turns out that our particular hash function is the identity map on a specific subset, or rather, subfield of $K$ other than $\mathbb{F}_r$ . We have

H(M, u) = (h + uz)^{r^m}

where $h$ is the element in $K$ we obtain by converting $M$ . We can write this as a composition $H = f \circ g$ where

g(M, u) = h + uz \qquad f(x) = x^{r^m}

The goal will be to find fixed points of $H$ , which can be done by finding fixed points of $f$ since we can easily manipulate the result of $g(M, u)$ by carefully choosing $M$ . Fixed points of $f$ satisfy

f(x) = x \implies x^{r^m} = x \implies x^{r^m} - x = 0

Note that for a finite field $E$ of order $r^m$ , the elements of $E$ are given by the roots of $x^{r^m} - x$ . This follows from the fact that the multiplicative group $E^\times = E - \{ 0 \}$ is a cyclic group of order $r^m - 1$ , so if $\alpha \in E$ , then $\alpha^{r^m - 1} = 1$ and so $\alpha^{r^m} = \alpha$ . That $E^\times$ is cyclic of order $r^m - 1$ follows from the structure theorem for finite abelian groups which states that any finite abelian group is a direct product of cyclic groups. There are a lot of references online for these results and their proofs.

So, to solve for the roots of $x^{r^m} - x$ we simply need to look at elements in the finite field $E$ of order $r^m$ . Since $m$ divides $n$ , then this field is actually a subfield of $K$ because if $x$ satisfies $x^{r^m} = x$ , then it also satisfies

\begin{aligned} x^{r^n} &= x^{r^{km}} \\ &= x^{(r^m)^k} \\ &= x^{(r^m)(r^m)^{k-1}} \\ &= (x^{r^m})^{(r^m)^{k-1}} \\ &= x^{(r^m)^{k-1}} \\ &\quad \vdots \\ &= x \end{aligned}

(and it can also be checked that $E$ actually is a field). This is good for us as it means we can write the elements in terms of $z$ which is what the server will be expecting.

Let $z_E$ be a generator of $E$ . Because $E$ is a subfield of $K$ , then $z_E$ can be written as

z_E = e_0 + e_1 z + \cdots e_{n-1} z^{n-1}

where $e_i \in \mathbb{F}_r$ . Choose a random $s < N$ . We will want to find an element in $K$ of the form

(s^2 \mod N) + a_1 z + \cdots a_n z^{n-1}

such that when $H$ is applied to this element, the constant term remains as $s^2 \mod N$ which will be the output of the hash function. To do this, we will find an element in $E$ with $s^2 \mod N$ as its constant term, and then subtract $z$ from it to account for the randomising value $u$ which we can predict will be $1$ .

The element in $K$ we are interested in is obtained by computing

(s^2 \mod N) e_0^{-1} z_E - z = (s^2 \mod N) + a_1 z + \cdots + a_n z^{n-1}

To send this to the server, we encode it as an integer:

a_n + a_{n-1} r + \cdots + a_1 r^{n-2} + (s^2 \mod N) r^{n-1}

The server will compute for us a square root $y$ of $s^2 \mod N$ and if we have $y \neq \pm s$ , we can easily recover the private key using the technique described two sections ago.

Once we have the private key, we can use the provided functions in the handout code to sign the challenge message and capture the flag.

Easier Solution

I was made aware of this by S3v3ru5's solve during the CTF, but choosing the message to be signed can actually be quite simple (though fundamentally relies on most of the above theory); the idea is to shift the goal posts a bit and instead of trying to find a fixed message, we find a message whose hash is something we can control. We do this by noting that since $x^{r^n} = x$ for all $x \in K$ , then for any of our chosen $x \in K$ , if we send $x^{r^{n - m}} - zu$ , then after being hashed, the result is exactly $x$ . I imagine most teams would have solved this way instead of finding the subfield which is quite a bit more complicated. I obviously lacked the hindsight to spot this solution when writing the challenge, but it's pretty neat :)

Alternative Approach via Linearity Properties

This solution idea is due to rkm0959 who taught me this during the CTF after he solved it. Instead of looking at the fields involved, we can simply note that the hash function has some nice linearity properties. In particular

H(M, u) = H(M, 0) + u H(0, 1) \pmod r

So to forge a signature for any given $M$ , we simply let $x = 1$ and solve for $u$ :

\begin{aligned} x^2 &= H(M, u) \\ \implies x &= H(M, u) \qquad \text{since $x = 1$} \\ \implies x &= H(M, 0) + u H(0, 1) \\ \implies u &= H(0, 1)^{-1}(x - H(M, 0)) \end{aligned}

Then, $(x, u)$ is a valid signature for $M$ .

This attack doesn't need to use the signing oracle and shows that the signature scheme is completely broken when using this hash function. Pretty cool solution!

Alternative Approach via Galois Theory

Alternatively, one might recognise the resemblance of the function $f$ with the Frobenius map $\phi : K \rightarrow K, x \mapsto x^r$ which is an $\mathbb{F}_r$ -automorphism that generates the Galois group $G$ of $K/\mathbb{F}_r$ . Note that $G$ is cyclic and of order $n$ as $\phi^n : K \rightarrow K, x \mapsto x^{r^n}$ is the identity map. The Fundamental Theorem of Galois Theory tells us that there is a one-to-one correspondence between the subgroups of the Galois group of $K$ , and the intermediate fields of $K$ . Explicitly, for a given subgroup $H$ of $G$ , the corresponding intermediate field of $K$ is given by the fixed field $K^H$ , the set of all elements in $K$ which are fixed by all of the maps in $H$ . Another result, sometimes known as the Fixed Field Theorem, tells us that the order of $H$ is equal to the degree of $K$ as an extension of $K^H$ . We use this, along with the fact that the subfields of $K$ are given by $\mathbb{F}_{r^d}$ where $d$ divides $n$ , to find the fixed fields.

In the challenge, we have $n = 15$ , so the Galois group $G$ is isomorphic to $\mathbb{Z}/15\mathbb{Z}$ . The table below lists out the subgroups of $G$ and their corresponding intermediate fields ( $\mathrm{id}$ denotes the identity map):

Subgroup of $G$	Intermediate Field of $K$
$H_0 = \{ \mathrm{id} \}$	$K^{H_0} = K$ (all elements are fixed by $\mathrm{id}$ )
$H_1 = \{ \mathrm{id}, \phi^5, \phi^{10} \}$	$K^{H_1} = \mathbb{F}_{r^5}$
$H_2 = \{ \mathrm{id}, \phi^3, \phi^6, \phi^9, \phi^{12} \}$	$K^{H_2} = \mathbb{F}_{r^3}$
$G$	$K^G = \mathbb{F}_r$ (only $\mathbb{F}_r$ is fixed by all automorphisms)

From this, we can see that $\mathbb{F}_{r^3}$ is fixed by $\phi^3$ .

1337crypt v2

The challenge claims to be a more complex version of 1337crypt from last year's DUCTF. Reading the solution for 1337crypt may help a bit with some initial ideas as both challenges play with the idea of partial knowledge. In both challenges, we are given hints to recover the prime factors, but the hints themselves seem to omit some bits of information. Specifically, the hints are the integer parts of some values that should actually contain a fractional part as well. As we will see (and as you may suspect :)), lattice techniques will help us to recover this missing information and therefore the primes.

Flag Encryption

We start by looking at how the flag is encrypted so that we have a clear idea of what we need to recover it. The flag is encrypted in a way very similar to RSA. The only difference is that instead of using integers, we use Gaussian integers. If you aren't familiar with Gaussian integers, I highly recommend reading this excellent blog post by CryptoHack for some extra background.

We work in the ring $(\mathbb{Z}/n\mathbb{Z})[i]$ (we will just write this as $\mathbb{Z}_n[i]$ ) where $n = pq$ is the RSA modulus. The flag is encoded as an element $m$ , of $\mathbb{Z}_n[i]$ by choosing a random integer $r < n$ and computing $m = r + \mathrm{flag} \cdot i$ .

The ciphertext is obtained by raising $m$ to the $e$ th power. In the challenge, $e$ is 0x1337.

To decrypt the ciphertext, we must find the multiplicative inverse of $e$ modulo the order of $\mathbb{Z}_n[i]$ . (Note: in this context, when we say "the order of $\mathbb{Z}_n[i]$ " we mean the cardinality of the multiplicative group of $\mathbb{Z}_n[i]$ ) The order of $\mathbb{Z}_n[i]$ is $\varphi(n) = (p-1)(q-1)$ , so to compute it, we'll need to recover the primes.

Hints

The challenge generates 1337-bit primes $p$ and $q$ and gives us two hints, and the ciphertext. Using these hints, we need to recover $p$ and $q$ .

Hint 1

The first hint, which we will call $D$ , is $D = p^2 + q^2$ . This seems like a pretty big hint, and indeed if we were also given $n = pq$ we could easily recover the primes by finding the roots of a simple univariate polynomial. As far as I am aware, $D$ can't be used to directly recover $p$ and $q$ either; there are techniques such as factoring $p^2 + q^2 = (p + qi)(p - qi)$ over the Gaussian integers, but this requires factoring $D$ over the integers which may take a long time. That said, if you were able to find a solution using this hint alone I'd be interested in seeing it :)

Hint 2

There is a lot more going on in the second hint. But firstly, some background. Number field is just a fancy name for an (finite) extension of the rational numbers $\mathbb{Q}$ . The simplest number field is $\mathbb{Q}$ itself, and the rational complex numbers, $\mathbb{Q}(i)$ is also a number field.

We can see that a number field is constructed from $p$ and $q$ . In Sage, the first argument to NumberField is the defining polynomial. The roots of this polynomial specify the elements to be adjoined to $\mathbb{Q}$ to get the number field. We see that the defining polynomial for $K$ is $(x-p)^2 + q^2$ , which has roots $p \pm qi$ . Therefore, we define

K = \mathbb{Q}(p + qi)

(side note: we only need to adjoin one root since the other, it's conjugate, can be obtained from the root with usual field operations). In the handout code, $z = p + qi$ denotes this adjoined element.

Now, for the values we are given. We get two instances of values with the same form. For the $j$ th hint ( $j = 1, 2$ ), two 1337-bit numbers $a_j$ and $b_j$ are generated, as well as two $l$ -bit numbers $c_j$ and $d_j$ . Then, $x_j$ is computed as

x_j = (a_j + 2^{-l} c_j) + (b_j + 2^{-l} d_j) z

Lastly,

|x_j|^2 = x_j \overline{x_j}

is computed and we are given the three values $(\lfloor |x_j|^2 \rfloor, a_j, b_j)$ . Note that $\overline{x_j}$ denotes the complex conjugate of $x_j$ , so the value $x_j \overline{x_j}$ gives us the squared complex modulus of $x_j$ . For ease of reading, we will write $y$ instead of $\lfloor |x_j|^2 \rfloor$ .

Importantly, we note that $x_j$ is an element of $K$ and its components are rational numbers with fractional parts. Similarly, $|x_j|^2$ is a rational number with a fractional part. However, we are only given the integer parts of these values, so it seems like we might be missing some information.

Solution

Let's analyse the second hint in further detail and see if we can use it to recover the primes directly, or find a relation involving the primes that will help us to do so.

Analysing Hint 2

There will be some tedious algebra, so to make things a bit more readable we will drop the subscripts. Note that we use hint 1 here since $p^2 + q^2$ appears. We have

\begin{aligned} x &= (a + 2^{-l} c) + (b + 2^{-l} d) z \\ &= (a + 2^{-l} c) + (b + 2^{-l} d) (p + qi) \\ &= ((a + 2^{-l} c) + (b + 2^{-l} d) p) + (b + 2^{-l} d) qi \\ \implies |x|^2 &= ((a + 2^{-l} c) + (b + 2^{-l} d) p)^2 + ((b + 2^{-l} d)q)^2 \\ &= (a + 2^{-l} c)^2 + 2(a + 2^{-l} c)(b + 2^{-l} d) p \\ &\quad + (b + 2^{-l} d)^2 p^2 + (b + 2^{-l} d)^2 q^2 \\ &= (a + 2^{-l} c)^2 + 2(a + 2^{-l} c)(b + 2^{-l} d) p + (b + 2^{-l}d)^2 D \\ &= a^2 + 2 \cdot 2^{-l} ac + 2^{-2l} c^2 + 2 a b p + 2 \cdot 2^{-l} adp + 2 \cdot 2^{-l} bcp + 2 \cdot 2^{-2l} c d p \\ &\quad + (b^2 + 2 \cdot 2^{-l} bd + 2^{-2l} d^2) D \\ &= a^2 + 2 \cdot 2^{-l} ac + 2^{-2l} c^2 + 2 a b p + 2 \cdot 2^{-l} adp + 2 \cdot 2^{-l} bcp + 2 \cdot 2^{-2l} c d p \\ &\quad + b^2 D + 2 \cdot 2^{-l} bd D + 2^{-2l} d^2 D \end{aligned}

Now, this looks like a mess and it kinda is, but fortunately we can clean it up a bit. Recall that $a$ and $b$ are 1337-bit numbers, while $c$ and $d$ are 338-bit numbers ( $l = 338$ ). The following table gives the approximate size of each term (when considered as an integer), which we will use to reason with soon. Note that $D = p^2 + q^2$ is on the order of $2 \times 1337$ bits.

Term	Size (in bits)
$a^2$	$2 \times 1337$
$2 \cdot 2^{-l} ac$	$1337$
$2^{-2l}c^2$	$0$
$2abp$	$3 \times 1337$
$2 \cdot 2^{-l} adp$	$2 \times 1337$
$2 \cdot 2^{-l} bcp$	$2 \times 1337$
$2 \cdot 2^{-2l} cdp$	$1337$
$b^2 D$	$4 \times 1337$
$2 \cdot 2^{-l} bd D$	$3 \times 1337$
$2^{-2l} d^2 D$	$2 \times 1337$

Now, what we will do is divide (integer division) the entire expression by $2ab$ . Because $2ab$ is around $2 \times 1337$ bits in size, we can more or less throw away any term whose size is $2 \times 1337$ or less. This leaves us with

\begin{aligned} \frac{|x|^2}{2ab} = \frac{y}{2ab} &\approx p + \frac{(b^2 + 2 \cdot 2^{-l} bd) D}{2ab} \\ &\approx p + \frac{b^2D}{2ab} + \frac{2^{-l}dD}{a} \end{aligned}

This approximation itself is very accurate, and only differs by a few bits due to the $2 \times 1337$ terms. However, we don't know $d$ , so this equation isn't as useful for us. Instead, we write

\begin{aligned} \frac{y}{2ab} &\approx p + \frac{b^2 D}{2ab} + \left \lfloor \frac{2^{-l} D}{a} \right \rfloor d \\ &= p + \frac{b^2 D}{2ab} + \left \lfloor \frac{2^{-l} D}{a} \right \rfloor d + k \end{aligned}

where $|k| < 2^l$ . Note that the approximation is off by a term of size approximately $l$ bits because of integer division rounding. We will omit the $\lfloor \rfloor$ , but it should be understood that we are performing integer division.

Using Hint 2

Now let's bring back the subscripts, noting that we have two instances:

\begin{aligned} \begin{cases} \frac{y_1}{2a_1 b_1} &= p + \frac{b_1^2 D}{2 a_1 b_1} + \frac{2^{-l} D}{a_1} d_1 + k_1 \\ \frac{y_2}{2a_2 b_2} &= p + \frac{b_2^2 D}{2 a_2 b_2} + \frac{2^{-l} D}{a_2} d_2 + k_2 \\ \end{cases} \end{aligned}

Finally, let's combine these two equations to eliminate $p$ .

\frac{y_1 - b_1^2 D}{2 a_1 b_1} - \frac{2^{-l} D}{a_1} d_1 - k_1 = \frac{y_2 - b_2^2 D}{2 a_2 b_2}- \frac{2^{-l} D}{a_2} d_2 - k_2

Just so it's easier to read, let $t_j = \frac{y_j - b_j^2 D}{2 a_j b_j}$ and $s_j = \frac{2^{-l} D}{a_j}$ . Then, rewriting the above equation, we have:

t_1 - s_1 d_1 - k_1 = t_2 - s_2 d_2 - k_2

or to put it in a more exciting way:

f(d_1, d_2, k) = t_1 - s_1 d_1 - t_2 + s_2 d_2 - k = 0

Now, $f$ has "small" integer roots $d_1, d_2$ and $k$ . An algorithm for finding small roots of multivariate polyomials over the integers is described in this paper. Following a similar idea, defund's coppersmith implementation can also be used by working over an arbitrary ring $\mathbb{Z}/N\mathbb{Z}$ for some large $N$ .

Lattice Approach

However, we can also recover $d_1$ and $d_2$ with a very simple lattice. Consider the lattice generated by the rows of the following matrix:

\begin{bmatrix} s_1 & 1 & 0 \\ s_2 & 0 & 1 \\ t_1 - t_2 & 0 & 0 \end{bmatrix}

The short vector $(k, -d_1, d_2)$ is an element of this matrix, given by the linear combination of $-d_1$ times the first row, $d_2$ times the second row, and $1$ times the third row. LLL finds this vector.

Recovering the Primes

Now that we have $d$ , it is straightforward to recover $p$ . We use the approximation we found earlier:

\begin{aligned} \frac{y}{2ab} &\approx p + \frac{(b^2 + 2 \cdot 2^{-l} bd) D}{2ab} \\ \implies p &\approx \frac{y - (b^2 + 2 \cdot 2^{-l} bd) D}{2ab} \end{aligned}

This approximation is accurate to all but a few bits, which we can easily exhaust over, checking if $D - p^2$ is a square to know which candidate for $p$ is correct. We then compute $q$ as the square root of $D - p^2$ .

Getting the Flag

We've already discussed how to decrypt the ciphertext given that we have the prime factors. Compute $d \equiv e^{-1} \pmod{(p-1)(q-1)}$ as in regular RSA and raise $c$ to the $d$ th power. The flag is in the imaginary component of the result.

Substitution Cipher III

This challenge is a sequel to "Substitution Cipher II", though in terms of difficulty, it is a large step up. The basis of the challenge is Patarin's famous attack (though other attacks may work) on the Matsumoto-Imai cryptosystem which was hinted towards by the bold MI in the challenge's description (one could easily find useful resources online by searching "MI cryptosystem"). The large public key size is also a trait of multivariate public key cryptosystems which some people may have recognised.

Like with the other two substitution cipher challenges, we are given the code which has been used to encrypt the flag, as well as the ciphertext. In this challenge, we are also given a public key.

Matsumoto-Imai Cryptosystem

In this section we give a rough description of the Matsumoto-Imai-like cryptosystem as presented in the challenge. There are some differences from the original cryptosystem, so to hopefully make it easier to follow, we also draw comparisons between the mathematical notation and the handout code.

The general idea is that a composition of certain transformations is difficult to invert without knowledge of the individual transformations that make it up; so a private key is made up of some invertible transformations, and the corresponding public key is their composition. Additionally, solving a set of multivariate (even quadratic) polynomial equations over a finite field in general is proven to be NP-complete, so one shouldn't be able to easily recover the plaintext from a given ciphertext and the public key alone.

Parameters

To set the scene, we choose parameters $q$ and $n$ (in the challenge $q = 2$ and $n = 80$ ) and define some algebraic objects to work with:

\begin{aligned} K &= \mathbb{F}_q[x_1, \ldots, x_n] \\ E &= K[t]/(i(t)) \\ L &= E[x] \\ A &= \mathrm{Aff}(n, K) \end{aligned}

where $i(t)$ is an arbitrary degree $n$ irreducible polynomial.

In Sage, we need to write a bit more boilerplate to achieve this:

    q = 2
    F = GF(q)
    K = BooleanPolynomialRing(n, 'x')
    R.<t> = PolynomialRing(K)
    i = GF(q)[t].irreducible_element(n)
    I = Ideal(R(i))
    I.reduce = lambda f: f % i # dirty sage hack
    E.<tbar> = PolynomialRing(K, t).quo(I)
    L.<x> = PolynomialRing(E)
    A = AffineGroup(n, K)

BooleanPolynomialRing is a more efficient implementation of boolean polynomials; in most cases it can be used as a clean replacement for GF(2)[x1, ..., xn].
As for the I.reduce = ... line, this should probably not need to be done but I was having difficulties getting Sage to properly reduce elements of $E$ modulo $i(t)$ and this seemed like the only way to fix it.
The purpose of $L$ is to define polynomials whose variables are in $E$ (which we'll do next).
$A$ is defined so that we can easily work with affine transformations.

Private Key

Next, we'll generate the private key. We choose random $S, T \in A$ and $P = r x^{q + 1} \in L$ where $r \in E$ is randomly chosen. (Note that the construction of $P$ here is not in its most general form, and that this particular choice of $P$ is potentially weaker than other choices).

In Sage, we write:

    S = A(GL(n, F).random_element(), random_vector(F, n), check=False)
    r = sum(randint(0, 1) * tbar^i for i in range(n))
    P = r * x^(q + 1)
    T = A(GL(n, F).random_element(), random_vector(F, n), check=False)

A.random_element() wasn't working for some reason, so we simply generate a random invertible matrix and a random vector with elements in $\mathbb{F}_q$ which is enough to represent an affine transformation.

Public Key

To derive the public key $R$ from the private key, we simply take the composition $R = T \circ P \circ S$ .

Actually, that's not entirely correct but it captures the general idea. To be more explicit, $S$ and $T$ are affine transformations that map elements of $K^n$ to elements of $K^n$ , whereas $P$ is a function that takes an element in $E$ and outputs an element in $E$ . The way we glue together these components of different "types" is by introducing an extra function that converts between the two types; there is a natural bijection between $K^n$ and $E$ , so we define:

\begin{aligned} \varphi : K^n &\rightarrow E \\ (v_0, v_1, \ldots, v_{n-1}) &\mapsto v_0 + v_1 t + \cdots + v_{n-1} t^{n-1} \end{aligned}

So more accurately, we take the public key to be $R = T \circ \varphi^{-1} \circ P \circ \varphi \circ S$ .

$R$ itself can be represented by $n$ multivariate polynomials $r_1, r_2, \ldots, r_n$ ; these are what will be used for encryption.

In Sage, we can write this concisely as:

    B = S(K.gens()) * vector([tbar^i for i in range(n)])
    Q = P(B)
    R = T(Q.lift().coefficients())

Since $S$ and $T$ are elements of $A$ , we can perform the transformation they represent with S(x) and T(x) respectively, where x is an element of $K^n$ . Multiplication by the vector $(1, t, \ldots, t^{n-1})$ is equivalent to the conversion map $\varphi$ , and taking the coefficients of an element in $E$ is equivalent to $\varphi^{-1}$ . The reason for the .lift() is to consider Q as an element of $K[t]$ (as opposed to an element of $E$ ) since .coefficients() seems to be not implemented for this quotient ring.

Encryption

Encryption is easy but there is a slight caveat that might be confusing. The public polynomials $r_i : K^n \rightarrow K$ map elements of $K^n$ to an element of $K$ . We defined $K$ to be a boolean polynomial ring, but it should be noted that $\mathbb{F}_q$ is a subset of $K$ , and that $\mathbb{F}_q^n$ is a subset of $K^n$ . When we encrypt messages, we are really only concerned with the subset $\mathbb{F}_q$ and not with other elements of $K$ . (This is just for the reason that working with bits is easier; there shouldn't be any issues with encoding messages using other elements in $K$ )

To encrypt a message, we first encode the message as an element $m$ of $\mathbb{F}_q^n$ and compute $(r_1(m), r_2(m), \ldots, r_n(m))$ . where $r_i$ are the public polynomials. Since each of the $r_i$ sends elements of $\mathbb{F}_q^n$ to an element in $\mathbb{F}_q$ , the resulting ciphertext is an element of $\mathbb{F}_q^n \subset K^n$ .

Solution

We are given nothing but the public key and the ciphertexts. We could attempt to recover the private key, but it isn't necessary. We will follow Patarin's attack from Crypto'95 which allows us to recover the plaintext for any given ciphertext.

The attack relies on the fact that maps $f : E \rightarrow E$ of the form $x \mapsto x^{q^k}$ are linear for any integer $k$ . By this, we mean that if $z = z_0 + z_1 t + \cdots + z_{n-1} t^{n-1} \in E$ , then $f(z) = z_0' + z_1' t + \cdots + z_{n-1}' t^{n-1}$ is such that $z_i' = f_i(z_0, \ldots, z_{n-1})$ with $\deg f_i = 1$ . In words; this says that the coefficients of $f(z)$ can be written as a linear combination of the coefficients of $z$ .

I couldn't find a proof for this, (maybe because it's supposed to be obvious, but it certainly wasn't obvious to me), so here is my (possibly wrong) attempt at a justification (which I think might actually only work for $q = 2$ ):

Define $f : E \rightarrow E$ as $f(x) = x^{q^k}$ for any integer $k$ . Let $z = z_0 + z_1 t + \cdots + z_{n-1} t^{n-1} \in E$ . Then

\begin{aligned} f(z) &= (z_0 + z_1 t + \cdots + z_{n-1} t^{n-1})^{q^k} \\ &= z_0^{q^k} + (z_1 t)^{q^k} + \cdots + (z_{n-1} t^{n-1})^{q^k} \quad \text{since } \mathrm{char}(E) = q \\ &= z_0 + z_1 t^{q^k} + \cdots + z_{n-1} t^{(n-1)q^k} \quad \text{since } x = x^q \text{ for all } x \in K \end{aligned}

It should now follow that, after being reduced modulo the irreducible polynomial $i(t)$ , the coefficients of $f(z)$ are linear combinations of the $z_i$ .

Combined with some other clever insights, this eventually leads to a linear expression relating all plaintext and ciphertext pairs as we will soon see.

Finding the Relation

This is the core of the attack. To begin, recall that a plaintext message $x \in K^n$ is encrypted by computing its ciphertext $y = (T \circ \varphi^{-1} \circ P \circ \varphi \circ S)(x) \in K^n$ .

Let $a = \varphi(S(x))$ and $b = \varphi(T^{-1}(y))$ . Note that $a, b \in E$ and furthermore that $b = P(a)$ . So we have

b = ra^{q + 1}

Now, applying $g : E \rightarrow E, x \mapsto x^{q-1}$ to both sides of the equation, we get

\begin{aligned} g(b) &= g(ra^{q+1}) \\ b^{q-1} &= r^{q-1} a^{q^2 - 1} \end{aligned}

Multiplying both sides by $ab$ , we get

ab^q = r^{q-1} a^{q^2} b \tag{1}

By the definition of $a$ and $b$ , we have

a = c_1x + c_2 \qquad b = c_3 y + c_4

where the $c_i$ are elements of $E$ which we do not particularly care about. This follows because $a$ and $b$ are simply affine transformations of $x$ and $y$ respectively.

Combining this with the fact that the maps $b \mapsto b^q$ and $a \mapsto a^{q^2}$ are linear maps, equation $(1)$ effectively gives us an equation relating $x$ and $y$ where $x$ , $y$ and $xy$ appear to a power of no greater than $1$ .

So, if we let $x = (x_0, x_1, \ldots, x_{n-1})$ and $y = (y_0, y_1, \ldots, y_{n-1})$ , then the information given by equation $(1)$ can be rewritten as:

\sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \gamma_{i, j} x_i y_j + \sum_{i=0}^{n-1} \alpha_i x_i + \sum_{j=0}^{n-1} \beta_j y_j + \delta = 0 \tag{2}

And the amazing thing about this relation is that it holds for all plaintext/ciphertext pairs $(x, y)$ since we made no assumptions about them to begin with! That means it holds for the flag and the ciphertext we've been given too :)

Recovering the Constants

Equation $(2)$ gives us a relation that holds for all plaintext/ciphertext pairs, but we don't know the $\gamma_{i, j}, \alpha_i, \beta_j$ and $\delta$ . Fortunately, since encryption uses the public key (which we have), we can generate our own plaintexts and encrypt them to get valid plaintext/ciphertext pairs.

With a set of plaintext/ciphertext pairs, we can view equation $(2)$ as a system of linear equations in the unknowns $\gamma_{i, j}, \alpha_i, \beta_j$ and $\delta$ and use linear algebra techniques to solve for them. There are $n^2 + n + n + 1 = (n+1)^2$ unknowns, so we'll need $(n+1)^2$ plaintext/ciphertext pairs to successfully recover the constants.

Generating the plaintext/ciphertext pairs may take a while (though, not that long if you're patient), so an easy optimisation we can do is choose sparse plaintext vectors; that is, plaintexts which have a lot of zero bits. Doing this speeds up polynomial evaluation which is the most expensive operation for encryption.

Recovering the Flag

The hard work is mostly done. If we have the relation and the constants, then to recover a plaintext given a ciphertext all we need to do is plug in the $y_j$ values, and solve for the $x_i$ values. Again, this is done using linear algebra techniques, but in this case instead of solving a system of linear equations in the unknowns $\gamma_{i, j}, \alpha_i, \beta_j$ and $\delta$ , we solve the system for the unknowns $x_0, x_1, \ldots, x_{n-1}$ .

yadlp

Group Structure Part I

Challenge Analysis

Group Structure Part II

Group Order

Inverse Elements

References

power sign

Challenge Overview

Security of Rabin Signatures

The Hash Function

Choosing a Message to be Signed

Easier Solution

Alternative Approach via Linearity Properties

Alternative Approach via Galois Theory

1337crypt v2

Flag Encryption

Hints

Hint 1

Hint 2

Solution

Analysing Hint 2

Using Hint 2

Lattice Approach

Recovering the Primes

Getting the Flag

Substitution Cipher III

Matsumoto-Imai Cryptosystem

Parameters

Private Key

Public Key

Encryption

Solution

Finding the Relation

Recovering the Constants

Recovering the Flag

References/Extra Reading