DownUnderCTF 2021 Writeups

Thank you to everyone who played DownUnderCTF 2021. I hope you enjoyed the challenges :)

Here are writeups for the hard crypto challenges I wrote.

All of the challenge files and solve scripts can be found here.

ChallengeTagsSolves
yadlpcrypto14
power signcrypto14
1337crypt v2crypto3
Substitution Cipher IIIcrypto1

This challenge follows a similar structure to a commonly seen style of crypto CTF challenges that involve solving the discrete logarithm problem in some group other than the usual $(\mathbb{Z}/p\mathbb{Z})^\times$. The general approach is to figure out some information about what the group elements are like, and then try to find an isomorphism from the group to a group where we can easily solve the discrete logarithm problem.

## Group Structure Part I

We'll see that the group we are dealing with in the challenge is the set of points on some hyperbola over $\mathbb{F}_p$. Although the operations defined in the handout code are called "addition" and "multiplication", it'll be helpful to use multiplication and exponentiation notation. When we say "elements", we usually mean points on the hyperbola, which are represented as tuples $(x, y) \in \mathbb{F}_p \times \mathbb{F}_p$ with addition operation defined as:

$(x_1, y_1) \cdot (x_2, y_2) = (x_1 x_2 + D y_1 y_2, x_1 y_2 + x_2 y_1 + 2 y_1 y_2)$

The rand_element function gives us even more information about elements in this group. In fact, we can use it to get an explicit relation that all elements satisfy. The function first generates a random $x \in \mathbb{F}_p$, then computes $y$ as

$y = \frac{x + \sqrt{x^2 (D+1) - D}}{D}$

Rearranging, we get

\begin{aligned} (Dy - x)^2 &= x^2(D+1) - D \\ \implies D^2 y^2 - 2Dxy + x^2 &= Dx^2 + x^2 - D \\ \implies Dx^2 + 2Dxy - D^2 y^2 &= D \\ \implies x^2 + 2xy - Dy^2 &\equiv 1 \pmod p \qquad \text{since D \neq 0} \end{aligned}

It can be checked that the set of solutions to this equation, equipped with addition operation above indeed does form a group. We'll call this group $\mathcal{H}$.

Diophantine equations of the form $x^2 - Dy^2 = 1$ are known as Pell equations and are well documented in the literature. We'll see later that the equation we found from the rand_element can be written as a Pell equation, and use that for analysing the group even more.

## Challenge Analysis

For now we'll take a look at how the flag is hidden and what we'll need to recover it. The 48 byte flag is broken up into 6 blocks of 8 bytes each. These blocks are represented as the integers $m_1, \ldots, m_6$. Then, for some randomly generated elements $g_1, \ldots, g_6$, the value $c = g_1^{m_1} \cdots g_6^{m_6}$ is computed.

Given $c$ and the $g_i$, the goal is to recover the $m_i$. This is known as the t-multiple discrete logarithm problem, and in this case, we can solve it by solving the discrete logarithm problem in $\mathcal{H}$, combined with lattice techniques.

Let $q$ be the order of $\mathcal{H}$ and let $g$ be a generator of $\mathcal{H}$ (we will see later that $\mathcal{H}$ is cyclic). Define the discrete logarithm to the base $g$ as $\log_g : \mathcal{H} \rightarrow \mathbb{Z}_q$, that, for any $h \in \mathcal{H}$, gives us a value $x = \log_g(h)$ such that $g^x = h$. It can be shown that this map is a homomorphism, so $\log_g(h_1 h_2) = \log_g(h_1) + \log_g(h_2)$. Therefore,

\begin{aligned} \log_g(c) &= \log_g(g_1^{m_1} \cdots g_6^{m_6}) \\ \log_g(c) &= \log_g(g_1^{m_1}) + \cdots + \log_g(g_6^{m_6}) \\ \log_g(c) &= m_1\log_g(g_1) + \cdots + m_6 \log_g(g_6) \\ \end{aligned}

Since we're working in $\mathbb{Z}_q$, we can equivalently write

$\log_g(c) \equiv m_1\log_g(g_1) + \cdots + m_6\log_g(g_6) \pmod q$

But since the $m_i$ are relatively small (~62 bits) compared to $q$ (which we will determine later), we can solve for the $m_i$ using lattice techniques. Specifically, consider the lattice generated by the rows of the matrix

$\begin{bmatrix} \log_g(g_1) & 1 \\ \log_g(g_2) & & 1 \\ \vdots & & & \ddots \\ \log_g(g_6) & & & & 1 \\ \log_g(c) & & & & & 1 \\ q \end{bmatrix}$

Notice that $m_1$ times the first row, plus $m_2$ times the second row, etc. take $1$ times the seventh row, and an appropriate multiple of the last row, gives the short vector

$(0, 2^{62}, 2^{62}, 2^{62}, 2^{62}, 2^{62}, 2^{62}, -1)$

which can be found with lattice basis reduction algorithms such as LLL.

## Group Structure Part II

### Group Order

All we need left to solve the challenge is a way to solve the discrete logarithm problem in $\mathcal{H}$. Determining the group order is a good first step as we could use it to try and see if there is an isomorphism between $\mathcal{H}$ and an additive group of integers of the same order where the discrete logarithm is trivially solved by division. However, it turns out that the order of $\mathcal{H}$ is $p+1$ and that $\mathcal{H}$ is isomorphic to the cyclic subgroup of $\mathbb{F}_{p^2}$ of order $p+1$. We follow a similar approach to [1] for the proof.

Claim. Let $\mathcal{H} \subset \mathbb{F}_p \times \mathbb{F}_p$ be the set of solutions to the equation

$x^2 + 2xy - Dy^2 \equiv 1 \pmod p$

Assume that $\left(\dfrac{D+1}{p}\right) = -1$ (this is true for the challenge parameters). Then, $\mathcal{H} \cong S$ where $S \leq \mathbb{F}_{p^2}$ is the cyclic subgroup of $\mathbb{F}_{p^2}$ of order $p+1$.

Proof. First, note that $(x+y)^2 - (D+1)y^2 = x^2 + 2xy - Dy^2$, so it suffices to show that the set of solutions to the equation $(x+y)^2 - (D+1)y^2 \equiv 1 \pmod p$ (equipped with the addition operation as defined above) is isomorphic to $S$. Let $f(W) = W^2 - (D+1)$. Note that $f(W)$ is irreducible as it is of degree 2 and has no roots, since $D+1$ has no square in $\mathbb{F}_p$ by assumption. Therefore $\mathbb{F}_{p^2} \cong \mathbb{F}_p[W]/\langle f(W) \rangle$. Now, if $\alpha \in S$, then $\alpha^{p+1} = 1$ (since $S$ has order $p+1$). But we can write $\alpha = r + sW$ for $r,s \in \mathbb{F}_p$. So,

\begin{aligned} \alpha^{p+1} &= (r + sW)^p (r + sW) \\ &= (r^p + s^pW^p)(r + sW) &&\quad (x+y)^p = x^p + y^p \text{ for all } x, y \in \mathbb{F}_p \\ &= (r + sW^p)(r + sW) &&\quad x^{p-1} = 1 \text{ for all } x \in \mathbb{F}_p \\ &= (r - sW)(r + sW) &&\quad W^p = W(W^2)^{\frac{p-1}{2}} = W(D+1)^{\frac{p-1}{2}} = -W \\ &= r^2 - s^2 W^2 \\ &= r^2 - (D+1)s^2 &&\quad W^2 = D+1 \end{aligned}

So, if we take $r = x+y$ and $s = y$, then we see that $\alpha^{p+1} = 1 = (x+y)^2 - (D+1)y^2$. Therefore, we have the bijection $\varphi : \mathcal{H} \rightarrow S, (x, y) \mapsto (x+y) + yW$. To see that this is a homomorphism, let $(x_1, y_1), (x_2, y_2) \in \mathcal{H}$. Then

\begin{aligned} \varphi((x_1, y_1) \cdot (x_2, y_2)) &= \varphi((x_1 x_2 + D y_1 y_2, x_1 y_2 + x_2 y_1 + 2 y_1 y_2)) \\ &= (x_1 x_2 + x_1 y_2 + x_2 y_1 + (D+2) y_1 y_2) + (x_1 y_2 + x_2 y_1 + 2 y_1 y_2) W \end{aligned}

But

\begin{aligned} \varphi((x_1, y_1)) \varphi((x_2, y_2)) &= ((x_1 + y_1) + y_1 W)((x_2 + y_2) + y_2 W) \\ &= (x_1 + y_1)(x_2 + y_2) + (x_1 + y_1) y_2 W + (x_2 + y_2) y_1 W + y_1 y_2 W^2 \\ &= (x_1 x_2 + x_1 y_2 + x_2 y_1 + y_1 y_2) + (x_1 y_2 + x_2 y_1 + 2 y_1 y_2) W + (D+1) y_1 y_2 \\ &= (x_1 x_2 + x_1 y_2 + x_2 y_1 + (D+2) y_1 y_2) + (x_1 y_2 + x_2 y_1 + 2 y_1 y_2) W \end{aligned}

So we have $\varphi((x_1, y_1) \cdot (x_2, y_2)) = \varphi((x_1, y_1)) \varphi((x_2, y_2))$.

Therefore $\varphi$ is an isomorphism, and $\mathcal{H} \cong S$.

If we take a look at the actual values in the challenge, we'll see that $p+1$ is smooth, with the largest factor being 32 bits! This is a hint that we're on the right path, since we'll be able to solve the discrete logarithm problem in a reasonable time with Pohlig-Hellman.

### Inverse Elements

By this point, we've pretty much already solved the challenge, but as a bonus, we can find the inverse elements of $\mathcal{H}$. Although the solution script doesn't do this, this has the advantage of speed over using the isomorphism found above since the operations are slightly faster (than operations in $\mathbb{F}_{p^2}$). Recall the addition formula

$(x, y) \cdot (x', y') = (xx' + Dyy', xy' + x'y + 2yy')$

The identity element is clearly $(1, 0)$, so to find the inverse element of $(x, y)$, we want to find $(x', y')$ such that $(x, y) \cdot (x', y') = (1, 0)$. Let $(x', y') = (x + 2y, -y)$. Then

\begin{aligned} (x, y) \cdot (x', y') &= (x, y) \cdot (x + 2y, -y) \\ &= (x(x + 2y) - Dy^2, -xy + (x + 2y)y - 2y^2) \\ &= (x^2 + 2xy - Dy^2, -xy + xy + 2y^2 - 2y^2) \\ &= (1, 0) \end{aligned}

So $(x, y)^{-1} = (x + 2y, -y)$.

Another advantage of figuring out inverse elements is that you can solve the challenge without finding the isomorphism to $\mathbb{F}_{p^2}$ by guessing that the order is $p+1$ since it is smooth, and then implementing Pohlig-Hellman and BSGS to use the addition, multiplication and inverse operations.

# power sign

## Challenge Overview

We are given the code running on a server that implements a signature scheme. We can ask the server to sign one message, then we are challenged to forge a signature for a random message.

The signature scheme resembles the Rabin signature algorithm. The public key is an RSA modulus $N = pq$ whose prime factorisation is the private key. A message $M$ is signed by first computing a randomised hash $H(M, u)$ (where $u$ is random) of $M$. If $H(M, u)$ has a square root modulo $N$, a square root $x$ modulo $N$ is computed and the signature $(x, u)$ is outputted. Otherwise, we try again with a different $u$. Note that computing square roots modulo $N$, or even determining whether a number has a square root modulo $N$ is equivalent to factoring $N$. To verify a signature $(x, u)$ for the message $M$, we simply check that the equality $x^2 \equiv H(m, u) \pmod N$ holds.

In the challenge, the primes generated are of the form $p \equiv 3 \pmod 4$. This is done because if $p$ is a prime such that $p \equiv 3 \pmod 4$, then if a square root of $a \in \{ 0, \ldots, p-1 \}$ exists, it can be easily computed as $\pm a^{\frac{p+1}{4}} \mod p$. To compute a square root of $c$ modulo $N$, we compute square roots of $c$ modulo $p$ and modulo $q$, then combine them with the Chinese Remainder Theorem.

## Security of Rabin Signatures

Before we proceed, we'll take a look at an easy attack against a simplified variant of the Rabin signature algorithm. Specifically, we consider a variant that does not use a hash function at all, i.e. to sign a message $M$ the signer simply computes a square root of $M$ modulo $N$. For simplicity, we assume that all the messages to be signed actually do have square roots modulo $N$, though it does not really matter. There is one trivial attack; anyone can "forge" a valid signature for $M$ if $M$ is a perfect square. For example, if $M = 9$, then $x = 3$ is a valid signature since $x^2 \equiv m \pmod N$.

More interestingly however, it turns out that we can recover the private key given access to a signing oracle. We do this by choosing a random $x < N$ and ask the oracle to sign $x^2 \mod N$. It returns a square root $y$ of $x^2$ modulo $N$. Now, if $y \neq \pm x \pmod N$, then we have

\begin{aligned} x^2 &\equiv y^2 \pmod N \\ \implies x^2 - y^2 &\equiv 0 \pmod N \\ \implies x^2 - y^2 &= kN \quad \text{for some k} \\ \implies (x-y)(x+y) &= kN \end{aligned}

so $\gcd(x-y, N)$ reveals a nontrivial factor of $N$.

## The Hash Function

This section and the next are somewhat algebra-heavy and basic results from algebra are used without proof for brevity. It may be worthwhile to read up on finite fields, field extensions and Galois theory (for later) if they are unfamiliar concepts.

We learned the importance of a good, randomised hash function in the previous section. And more importantly, we learned that a hash function which is simply the identity map (i.e. sends any input to itself), is completely insecure. We will now take a look at the hash function in the challenge.

Choose a composite integer $n$ and a proper divisor $m$ (in the challenge we have $n = 15, m = 3$). Let $r$ be the smallest prime number following $N$, where $N$ is the signer's public key. We will work in an extension field $K = \mathbb{F}_{r^n} \cong \mathbb{F}_r[x]/(f)$ of $\mathbb{F}_r$, where $f \in \mathbb{F}_r[x]$ is a public degree $n$ irreducible polynomial. Let $z = x + (f)$. Then $z$ generates $K$ and $\{ 1, z, z^2, \ldots, z^{n-1} \}$ is basis for $K$ when viewed as an $n$-dimensional $\mathbb{F}_r$-vector space. That is, elements in $K$ can be written in the form $a_0 + a_1 z + \cdots + a_{n-1} z^{n-1}$ where $a_i \in \mathbb{F}_r$.

The randomised hash function $H$ takes as input a message $M$ and an integer $u$. Write $M$ in terms of powers of $r$:

$M = M_0 + M_1 r + M_2 r^2 + \cdots + M_k r^k$

where $M_i < r$. Then, $M$ is converted to an element $h$ in $K$ by computing

$h = M_k + M_{k-1} z + M_{k-2} z^2 + \cdots + M_0 z^k$

To obtain the hash, the function computes $(h + uz)^{r^m}$ which we write as

$(h + uz)^{r^m} = a_0 + a_1 z + \cdots + a_{n-1} z^{n-1}$

The output is $a_0$.

## Choosing a Message to be Signed

After the server provides us with its public key, it prompts us to send a message to be signed. Note that (in the sign function) $u$ isn't chosen randomly; it starts at $1$ and increments until $H(M, u)$ is a square. There is also a peculiar restriction on the message; it has to be larger than $N^m$ and smaller than $N^n$.

If we were able to send messages of any size for the server to sign, we can easily find a message that will help us to recover the private key. Specifically, we would choose a random $s < N$ and send the message $(r - 1) + (s^2 \mod N) r$. The hash function will convert this to $h = (s^2 \mod N) + (r - 1)z$ and output the constant term in $(h + uz)^{r^m}$ which happens to just be $s^2 \mod N$ since all elements $a \in \mathbb{F}_r$ satisfy $a^r = a$ by Fermat's Little Theorem. So when the server signs this, we have the exact same situation as the attack described two sections ago. However, the size check prevents this attack.

Recall that the identity map is insecure as a hash function for the reasons given two sections ago. It turns out that our particular hash function is the identity map on a specific subset, or rather, subfield of $K$ other than $\mathbb{F}_r$. We have

$H(M, u) = (h + uz)^{r^m}$

where $h$ is the element in $K$ we obtain by converting $M$. We can write this as a composition $H = f \circ g$ where

$g(M, u) = h + uz \qquad f(x) = x^{r^m}$

The goal will be to find fixed points of $H$, which can be done by finding fixed points of $f$ since we can easily manipulate the result of $g(M, u)$ by carefully choosing $M$. Fixed points of $f$ satisfy

$f(x) = x \implies x^{r^m} = x \implies x^{r^m} - x = 0$

Note that for a finite field $E$ of order $r^m$, the elements of $E$ are given by the roots of $x^{r^m} - x$. This follows from the fact that the multiplicative group $E^\times = E - \{ 0 \}$ is a cyclic group of order $r^m - 1$, so if $\alpha \in E$, then $\alpha^{r^m - 1} = 1$ and so $\alpha^{r^m} = \alpha$. That $E^\times$ is cyclic of order $r^m - 1$ follows from the structure theorem for finite abelian groups which states that any finite abelian group is a direct product of cyclic groups. There are a lot of references online for these results and their proofs.

So, to solve for the roots of $x^{r^m} - x$ we simply need to look at elements in the finite field $E$ of order $r^m$. Since $m$ divides $n$, then this field is actually a subfield of $K$ because if $x$ satisfies $x^{r^m} = x$, then it also satisfies

\begin{aligned} x^{r^n} &= x^{r^{km}} \\ &= x^{(r^m)^k} \\ &= x^{(r^m)(r^m)^{k-1}} \\ &= (x^{r^m})^{(r^m)^{k-1}} \\ &= x^{(r^m)^{k-1}} \\ &\quad \vdots \\ &= x \end{aligned}

(and it can also be checked that $E$ actually is a field). This is good for us as it means we can write the elements in terms of $z$ which is what the server will be expecting.

Let $z_E$ be a generator of $E$. Because $E$ is a subfield of $K$, then $z_E$ can be written as

$z_E = e_0 + e_1 z + \cdots e_{n-1} z^{n-1}$

where $e_i \in \mathbb{F}_r$. Choose a random $s < N$. We will want to find an element in $K$ of the form

$(s^2 \mod N) + a_1 z + \cdots a_n z^{n-1}$

such that when $H$ is applied to this element, the constant term remains as $s^2 \mod N$ which will be the output of the hash function. To do this, we will find an element in $E$ with $s^2 \mod N$ as its constant term, and then subtract $z$ from it to account for the randomising value $u$ which we can predict will be $1$.

The element in $K$ we are interested in is obtained by computing

$(s^2 \mod N) e_0^{-1} z_E - z = (s^2 \mod N) + a_1 z + \cdots + a_n z^{n-1}$

To send this to the server, we encode it as an integer:

$a_n + a_{n-1} r + \cdots + a_1 r^{n-2} + (s^2 \mod N) r^{n-1}$

The server will compute for us a square root $y$ of $s^2 \mod N$ and if we have $y \neq \pm s$, we can easily recover the private key using the technique described two sections ago.

Once we have the private key, we can use the provided functions in the handout code to sign the challenge message and capture the flag.

### Easier Solution

I was made aware of this by S3v3ru5's solve during the CTF, but choosing the message to be signed can actually be quite simple (though fundamentally relies on most of the above theory); the idea is to shift the goal posts a bit and instead of trying to find a fixed message, we find a message whose hash is something we can control. We do this by noting that since $x^{r^n} = x$ for all $x \in K$, then for any of our chosen $x \in K$, if we send $x^{r^{n - m}} - zu$, then after being hashed, the result is exactly $x$. I imagine most teams would have solved this way instead of finding the subfield which is quite a bit more complicated. I obviously lacked the hindsight to spot this solution when writing the challenge, but it's pretty neat :)

## Alternative Approach via Linearity Properties

This solution idea is due to rkm0959 who taught me this during the CTF after he solved it. Instead of looking at the fields involved, we can simply note that the hash function has some nice linearity properties. In particular

$H(M, u) = H(M, 0) + u H(0, 1) \pmod r$

So to forge a signature for any given $M$, we simply let $x = 1$ and solve for $u$:

\begin{aligned} x^2 &= H(M, u) \\ \implies x &= H(M, u) \qquad \text{since x = 1} \\ \implies x &= H(M, 0) + u H(0, 1) \\ \implies u &= H(0, 1)^{-1}(x - H(M, 0)) \end{aligned}

Then, $(x, u)$ is a valid signature for $M$.

This attack doesn't need to use the signing oracle and shows that the signature scheme is completely broken when using this hash function. Pretty cool solution!

## Alternative Approach via Galois Theory

Alternatively, one might recognise the resemblance of the function $f$ with the Frobenius map $\phi : K \rightarrow K, x \mapsto x^r$ which is an $\mathbb{F}_r$-automorphism that generates the Galois group $G$ of $K/\mathbb{F}_r$. Note that $G$ is cyclic and of order $n$ as $\phi^n : K \rightarrow K, x \mapsto x^{r^n}$ is the identity map. The Fundamental Theorem of Galois Theory tells us that there is a one-to-one correspondence between the subgroups of the Galois group of $K$, and the intermediate fields of $K$. Explicitly, for a given subgroup $H$ of $G$, the corresponding intermediate field of $K$ is given by the fixed field $K^H$, the set of all elements in $K$ which are fixed by all of the maps in $H$. Another result, sometimes known as the Fixed Field Theorem, tells us that the order of $H$ is equal to the degree of $K$ as an extension of $K^H$. We use this, along with the fact that the subfields of $K$ are given by $\mathbb{F}_{r^d}$ where $d$ divides $n$, to find the fixed fields.

In the challenge, we have $n = 15$, so the Galois group $G$ is isomorphic to $\mathbb{Z}/15\mathbb{Z}$. The table below lists out the subgroups of $G$ and their corresponding intermediate fields ($\mathrm{id}$ denotes the identity map):

Subgroup of $G$Intermediate Field of $K$
$H_0 = \{ \mathrm{id} \}$$K^{H_0} = K$ (all elements are fixed by $\mathrm{id}$)
$H_1 = \{ \mathrm{id}, \phi^5, \phi^{10} \}$$K^{H_1} = \mathbb{F}_{r^5}$
$H_2 = \{ \mathrm{id}, \phi^3, \phi^6, \phi^9, \phi^{12} \}$$K^{H_2} = \mathbb{F}_{r^3}$
$G$$K^G = \mathbb{F}_r$ (only $\mathbb{F}_r$ is fixed by all automorphisms)

From this, we can see that $\mathbb{F}_{r^3}$ is fixed by $\phi^3$.

# 1337crypt v2

The challenge claims to be a more complex version of 1337crypt from last year's DUCTF. Reading the solution for 1337crypt may help a bit with some initial ideas as both challenges play with the idea of partial knowledge. In both challenges, we are given hints to recover the prime factors, but the hints themselves seem to omit some bits of information. Specifically, the hints are the integer parts of some values that should actually contain a fractional part as well. As we will see (and as you may suspect :)), lattice techniques will help us to recover this missing information and therefore the primes.

## Flag Encryption

We start by looking at how the flag is encrypted so that we have a clear idea of what we need to recover it. The flag is encrypted in a way very similar to RSA. The only difference is that instead of using integers, we use Gaussian integers. If you aren't familiar with Gaussian integers, I highly recommend reading this excellent blog post by CryptoHack for some extra background.

We work in the ring $(\mathbb{Z}/n\mathbb{Z})[i]$ (we will just write this as $\mathbb{Z}_n[i]$) where $n = pq$ is the RSA modulus. The flag is encoded as an element $m$, of $\mathbb{Z}_n[i]$ by choosing a random integer $r < n$ and computing $m = r + \mathrm{flag} \cdot i$.

The ciphertext is obtained by raising $m$ to the $e$th power. In the challenge, $e$ is 0x1337.

To decrypt the ciphertext, we must find the multiplicative inverse of $e$ modulo the order of $\mathbb{Z}_n[i]$. (Note: in this context, when we say "the order of $\mathbb{Z}_n[i]$" we mean the cardinality of the multiplicative group of $\mathbb{Z}_n[i]$) The order of $\mathbb{Z}_n[i]$ is $\varphi(n) = (p-1)(q-1)$, so to compute it, we'll need to recover the primes.

## Hints

The challenge generates 1337-bit primes $p$ and $q$ and gives us two hints, and the ciphertext. Using these hints, we need to recover $p$ and $q$.

### Hint 1

The first hint, which we will call $D$, is $D = p^2 + q^2$. This seems like a pretty big hint, and indeed if we were also given $n = pq$ we could easily recover the primes by finding the roots of a simple univariate polynomial. As far as I am aware, $D$ can't be used to directly recover $p$ and $q$ either; there are techniques such as factoring $p^2 + q^2 = (p + qi)(p - qi)$ over the Gaussian integers, but this requires factoring $D$ over the integers which may take a long time. That said, if you were able to find a solution using this hint alone I'd be interested in seeing it :)

### Hint 2

There is a lot more going on in the second hint. But firstly, some background. Number field is just a fancy name for an (finite) extension of the rational numbers $\mathbb{Q}$. The simplest number field is $\mathbb{Q}$ itself, and the rational complex numbers, $\mathbb{Q}(i)$ is also a number field.

We can see that a number field is constructed from $p$ and $q$. In Sage, the first argument to NumberField is the defining polynomial. The roots of this polynomial specify the elements to be adjoined to $\mathbb{Q}$ to get the number field. We see that the defining polynomial for $K$ is $(x-p)^2 + q^2$, which has roots $p \pm qi$. Therefore, we define

$K = \mathbb{Q}(p + qi)$

(side note: we only need to adjoin one root since the other, it's conjugate, can be obtained from the root with usual field operations). In the handout code, $z = p + qi$ denotes this adjoined element.

Now, for the values we are given. We get two instances of values with the same form. For the $j$th hint ($j = 1, 2$), two 1337-bit numbers $a_j$ and $b_j$ are generated, as well as two $l$-bit numbers $c_j$ and $d_j$. Then, $x_j$ is computed as

$x_j = (a_j + 2^{-l} c_j) + (b_j + 2^{-l} d_j) z$

Lastly,

$|x_j|^2 = x_j \overline{x_j}$

is computed and we are given the three values $(\lfloor |x_j|^2 \rfloor, a_j, b_j)$. Note that $\overline{x_j}$ denotes the complex conjugate of $x_j$, so the value $x_j \overline{x_j}$ gives us the squared complex modulus of $x_j$. For ease of reading, we will write $y$ instead of $\lfloor |x_j|^2 \rfloor$.

Importantly, we note that $x_j$ is an element of $K$ and its components are rational numbers with fractional parts. Similarly, $|x_j|^2$ is a rational number with a fractional part. However, we are only given the integer parts of these values, so it seems like we might be missing some information.

## Solution

Let's analyse the second hint in further detail and see if we can use it to recover the primes directly, or find a relation involving the primes that will help us to do so.

### Analysing Hint 2

There will be some tedious algebra, so to make things a bit more readable we will drop the subscripts. Note that we use hint 1 here since $p^2 + q^2$ appears. We have

\begin{aligned} x &= (a + 2^{-l} c) + (b + 2^{-l} d) z \\ &= (a + 2^{-l} c) + (b + 2^{-l} d) (p + qi) \\ &= ((a + 2^{-l} c) + (b + 2^{-l} d) p) + (b + 2^{-l} d) qi \\ \implies |x|^2 &= ((a + 2^{-l} c) + (b + 2^{-l} d) p)^2 + ((b + 2^{-l} d)q)^2 \\ &= (a + 2^{-l} c)^2 + 2(a + 2^{-l} c)(b + 2^{-l} d) p \\ &\quad + (b + 2^{-l} d)^2 p^2 + (b + 2^{-l} d)^2 q^2 \\ &= (a + 2^{-l} c)^2 + 2(a + 2^{-l} c)(b + 2^{-l} d) p + (b + 2^{-l}d)^2 D \\ &= a^2 + 2 \cdot 2^{-l} ac + 2^{-2l} c^2 + 2 a b p + 2 \cdot 2^{-l} adp + 2 \cdot 2^{-l} bcp + 2 \cdot 2^{-2l} c d p \\ &\quad + (b^2 + 2 \cdot 2^{-l} bd + 2^{-2l} d^2) D \\ &= a^2 + 2 \cdot 2^{-l} ac + 2^{-2l} c^2 + 2 a b p + 2 \cdot 2^{-l} adp + 2 \cdot 2^{-l} bcp + 2 \cdot 2^{-2l} c d p \\ &\quad + b^2 D + 2 \cdot 2^{-l} bd D + 2^{-2l} d^2 D \end{aligned}

Now, this looks like a mess and it kinda is, but fortunately we can clean it up a bit. Recall that $a$ and $b$ are 1337-bit numbers, while $c$ and $d$ are 338-bit numbers ($l = 338$). The following table gives the approximate size of each term (when considered as an integer), which we will use to reason with soon. Note that $D = p^2 + q^2$ is on the order of $2 \times 1337$ bits.

TermSize (in bits)
$a^2$$2 \times 1337$
$2 \cdot 2^{-l} ac$$1337$
$2^{-2l}c^2$$0$
$2abp$$3 \times 1337$
$2 \cdot 2^{-l} adp$$2 \times 1337$
$2 \cdot 2^{-l} bcp$$2 \times 1337$
$2 \cdot 2^{-2l} cdp$$1337$
$b^2 D$$4 \times 1337$
$2 \cdot 2^{-l} bd D$$3 \times 1337$
$2^{-2l} d^2 D$$2 \times 1337$

Now, what we will do is divide (integer division) the entire expression by $2ab$. Because $2ab$ is around $2 \times 1337$ bits in size, we can more or less throw away any term whose size is $2 \times 1337$ or less. This leaves us with

\begin{aligned} \frac{|x|^2}{2ab} = \frac{y}{2ab} &\approx p + \frac{(b^2 + 2 \cdot 2^{-l} bd) D}{2ab} \\ &\approx p + \frac{b^2D}{2ab} + \frac{2^{-l}dD}{a} \end{aligned}

This approximation itself is very accurate, and only differs by a few bits due to the $2 \times 1337$ terms. However, we don't know $d$, so this equation isn't as useful for us. Instead, we write

\begin{aligned} \frac{y}{2ab} &\approx p + \frac{b^2 D}{2ab} + \left \lfloor \frac{2^{-l} D}{a} \right \rfloor d \\ &= p + \frac{b^2 D}{2ab} + \left \lfloor \frac{2^{-l} D}{a} \right \rfloor d + k \end{aligned}

where $|k| < 2^l$. Note that the approximation is off by a term of size approximately $l$ bits because of integer division rounding. We will omit the $\lfloor \rfloor$, but it should be understood that we are performing integer division.

### Using Hint 2

Now let's bring back the subscripts, noting that we have two instances:

\begin{aligned} \begin{cases} \frac{y_1}{2a_1 b_1} &= p + \frac{b_1^2 D}{2 a_1 b_1} + \frac{2^{-l} D}{a_1} d_1 + k_1 \\ \frac{y_2}{2a_2 b_2} &= p + \frac{b_2^2 D}{2 a_2 b_2} + \frac{2^{-l} D}{a_2} d_2 + k_2 \\ \end{cases} \end{aligned}

Finally, let's combine these two equations to eliminate $p$.

$\frac{y_1 - b_1^2 D}{2 a_1 b_1} - \frac{2^{-l} D}{a_1} d_1 - k_1 = \frac{y_2 - b_2^2 D}{2 a_2 b_2}- \frac{2^{-l} D}{a_2} d_2 - k_2$

Just so it's easier to read, let $t_j = \frac{y_j - b_j^2 D}{2 a_j b_j}$ and $s_j = \frac{2^{-l} D}{a_j}$. Then, rewriting the above equation, we have:

$t_1 - s_1 d_1 - k_1 = t_2 - s_2 d_2 - k_2$

or to put it in a more exciting way:

$f(d_1, d_2, k) = t_1 - s_1 d_1 - t_2 + s_2 d_2 - k = 0$

Now, $f$ has "small" integer roots $d_1, d_2$ and $k$. An algorithm for finding small roots of multivariate polyomials over the integers is described in this paper. Following a similar idea, defund's coppersmith implementation can also be used by working over an arbitrary ring $\mathbb{Z}/N\mathbb{Z}$ for some large $N$.

#### Lattice Approach

However, we can also recover $d_1$ and $d_2$ with a very simple lattice. Consider the lattice generated by the rows of the following matrix:

$\begin{bmatrix} s_1 & 1 & 0 \\ s_2 & 0 & 1 \\ t_1 - t_2 & 0 & 0 \end{bmatrix}$