Thank you to everyone who played DownUnderCTF 2021. I hope you enjoyed the challenges :)
Here are writeups for the hard crypto challenges I wrote.
All of the challenge files and solve scripts can be found here.
|Substitution Cipher III||1|
This challenge follows a similar structure to a commonly seen style of crypto CTF challenges that involve solving the discrete logarithm problem in some group other than the usual . The general approach is to figure out some information about what the group elements are like, and then try to find an isomorphism from the group to a group where we can easily solve the discrete logarithm problem.
Group Structure Part I
We'll see that the group we are dealing with in the challenge is the set of points on some hyperbola over . Although the operations defined in the handout code are called "addition" and "multiplication", it'll be helpful to use multiplication and exponentiation notation. When we say "elements", we usually mean points on the hyperbola, which are represented as tuples with addition operation defined as:
rand_element function gives us even more information about elements in this group. In fact, we can use it to get an explicit relation that all elements satisfy. The function first generates a random , then computes as
Rearranging, we get
It can be checked that the set of solutions to this equation, equipped with addition operation above indeed does form a group. We'll call this group .
Diophantine equations of the form are known as Pell equations and are well documented in the literature. We'll see later that the equation we found from the
rand_element can be written as a Pell equation, and use that for analysing the group even more.
For now we'll take a look at how the flag is hidden and what we'll need to recover it. The 48 byte flag is broken up into 6 blocks of 8 bytes each. These blocks are represented as the integers . Then, for some randomly generated elements , the value is computed.
Given and the , the goal is to recover the . This is known as the t-multiple discrete logarithm problem, and in this case, we can solve it by solving the discrete logarithm problem in , combined with lattice techniques.
Let be the order of and let be a generator of (we will see later that is cyclic). Define the discrete logarithm to the base as , that, for any , gives us a value such that . It can be shown that this map is a homomorphism, so . Therefore,
Since we're working in , we can equivalently write
But since the are relatively small (~62 bits) compared to (which we will determine later), we can solve for the using lattice techniques. Specifically, consider the lattice generated by the rows of the matrix
Notice that times the first row, plus times the second row, etc. take times the seventh row, and an appropriate multiple of the last row, gives the short vector
which can be found with lattice basis reduction algorithms such as LLL.
Group Structure Part II
All we need left to solve the challenge is a way to solve the discrete logarithm problem in . Determining the group order is a good first step as we could use it to try and see if there is an isomorphism between and an additive group of integers of the same order where the discrete logarithm is trivially solved by division. However, it turns out that the order of is and that is isomorphic to the cyclic subgroup of of order . We follow a similar approach to  for the proof.
Claim. Let be the set of solutions to the equation
Assume that (this is true for the challenge parameters). Then, where is the cyclic subgroup of of order .
Proof. First, note that , so it suffices to show that the set of solutions to the equation (equipped with the addition operation as defined above) is isomorphic to . Let . Note that is irreducible as it is of degree 2 and has no roots, since has no square in by assumption. Therefore . Now, if , then (since has order ). But we can write for . So,
So, if we take and , then we see that . Therefore, we have the bijection . To see that this is a homomorphism, let . Then
So we have .
Therefore is an isomorphism, and .
If we take a look at the actual values in the challenge, we'll see that is smooth, with the largest factor being 32 bits! This is a hint that we're on the right path, since we'll be able to solve the discrete logarithm problem in a reasonable time with Pohlig-Hellman.
By this point, we've pretty much already solved the challenge, but as a bonus, we can find the inverse elements of . Although the solution script doesn't do this, this has the advantage of speed over using the isomorphism found above since the operations are slightly faster (than operations in ). Recall the addition formula
The identity element is clearly , so to find the inverse element of , we want to find such that . Let . Then
Another advantage of figuring out inverse elements is that you can solve the challenge without finding the isomorphism to by guessing that the order is since it is smooth, and then implementing Pohlig-Hellman and BSGS to use the addition, multiplication and inverse operations.
We are given the code running on a server that implements a signature scheme. We can ask the server to sign one message, then we are challenged to forge a signature for a random message.
The signature scheme resembles the Rabin signature algorithm. The public key is an RSA modulus whose prime factorisation is the private key. A message is signed by first computing a randomised hash (where is random) of . If has a square root modulo , a square root modulo is computed and the signature is outputted. Otherwise, we try again with a different . Note that computing square roots modulo , or even determining whether a number has a square root modulo is equivalent to factoring . To verify a signature for the message , we simply check that the equality holds.
In the challenge, the primes generated are of the form . This is done because if is a prime such that , then if a square root of exists, it can be easily computed as . To compute a square root of modulo , we compute square roots of modulo and modulo , then combine them with the Chinese Remainder Theorem.
Security of Rabin Signatures
Before we proceed, we'll take a look at an easy attack against a simplified variant of the Rabin signature algorithm. Specifically, we consider a variant that does not use a hash function at all, i.e. to sign a message the signer simply computes a square root of modulo . For simplicity, we assume that all the messages to be signed actually do have square roots modulo , though it does not really matter. There is one trivial attack; anyone can "forge" a valid signature for if is a perfect square. For example, if , then is a valid signature since .
More interestingly however, it turns out that we can recover the private key given access to a signing oracle. We do this by choosing a random and ask the oracle to sign . It returns a square root of modulo . Now, if , then we have
so reveals a nontrivial factor of .
The Hash Function
This section and the next are somewhat algebra-heavy and basic results from algebra are used without proof for brevity. It may be worthwhile to read up on finite fields, field extensions and Galois theory (for later) if they are unfamiliar concepts.
We learned the importance of a good, randomised hash function in the previous section. And more importantly, we learned that a hash function which is simply the identity map (i.e. sends any input to itself), is completely insecure. We will now take a look at the hash function in the challenge.
Choose a composite integer and a proper divisor (in the challenge we have ). Let be the smallest prime number following , where is the signer's public key. We will work in an extension field of , where is a public degree irreducible polynomial. Let . Then generates and is basis for when viewed as an -dimensional -vector space. That is, elements in can be written in the form where .
The randomised hash function takes as input a message and an integer . Write in terms of powers of :
where . Then, is converted to an element in by computing
To obtain the hash, the function computes which we write as
The output is .
Choosing a Message to be Signed
After the server provides us with its public key, it prompts us to send a message to be signed. Note that (in the
sign function) isn't chosen randomly; it starts at and increments until is a square. There is also a peculiar restriction on the message; it has to be larger than and smaller than .
If we were able to send messages of any size for the server to sign, we can easily find a message that will help us to recover the private key. Specifically, we would choose a random and send the message . The hash function will convert this to and output the constant term in which happens to just be since all elements satisfy by Fermat's Little Theorem. So when the server signs this, we have the exact same situation as the attack described two sections ago. However, the size check prevents this attack.
Recall that the identity map is insecure as a hash function for the reasons given two sections ago. It turns out that our particular hash function is the identity map on a specific subset, or rather, subfield of other than . We have
where is the element in we obtain by converting . We can write this as a composition where
The goal will be to find fixed points of , which can be done by finding fixed points of since we can easily manipulate the result of by carefully choosing . Fixed points of satisfy
Note that for a finite field of order , the elements of are given by the roots of . This follows from the fact that the multiplicative group is a cyclic group of order , so if , then and so . That is cyclic of order follows from the structure theorem for finite abelian groups which states that any finite abelian group is a direct product of cyclic groups. There are a lot of references online for these results and their proofs.
So, to solve for the roots of we simply need to look at elements in the finite field of order . Since divides , then this field is actually a subfield of because if satisfies , then it also satisfies
(and it can also be checked that actually is a field). This is good for us as it means we can write the elements in terms of which is what the server will be expecting.
Let be a generator of . Because is a subfield of , then can be written as
where . Choose a random . We will want to find an element in of the form
such that when is applied to this element, the constant term remains as which will be the output of the hash function. To do this, we will find an element in with as its constant term, and then subtract from it to account for the randomising value which we can predict will be .
The element in we are interested in is obtained by computing
To send this to the server, we encode it as an integer:
The server will compute for us a square root of and if we have , we can easily recover the private key using the technique described two sections ago.
Once we have the private key, we can use the provided functions in the handout code to sign the challenge message and capture the flag.
I was made aware of this by S3v3ru5's solve during the CTF, but choosing the message to be signed can actually be quite simple (though fundamentally relies on most of the above theory); the idea is to shift the goal posts a bit and instead of trying to find a fixed message, we find a message whose hash is something we can control. We do this by noting that since for all , then for any of our chosen , if we send , then after being hashed, the result is exactly . I imagine most teams would have solved this way instead of finding the subfield which is quite a bit more complicated. I obviously lacked the hindsight to spot this solution when writing the challenge, but it's pretty neat :)
Alternative Approach via Linearity Properties
This solution idea is due to rkm0959 who taught me this during the CTF after he solved it. Instead of looking at the fields involved, we can simply note that the hash function has some nice linearity properties. In particular
So to forge a signature for any given , we simply let and solve for :
Then, is a valid signature for .
This attack doesn't need to use the signing oracle and shows that the signature scheme is completely broken when using this hash function. Pretty cool solution!
Alternative Approach via Galois Theory
Alternatively, one might recognise the resemblance of the function with the Frobenius map which is an -automorphism that generates the Galois group of . Note that is cyclic and of order as is the identity map. The Fundamental Theorem of Galois Theory tells us that there is a one-to-one correspondence between the subgroups of the Galois group of , and the intermediate fields of . Explicitly, for a given subgroup of , the corresponding intermediate field of is given by the fixed field , the set of all elements in which are fixed by all of the maps in . Another result, sometimes known as the Fixed Field Theorem, tells us that the order of is equal to the degree of as an extension of . We use this, along with the fact that the subfields of are given by where divides , to find the fixed fields.
In the challenge, we have , so the Galois group is isomorphic to . The table below lists out the subgroups of and their corresponding intermediate fields ( denotes the identity map):
|Subgroup of||Intermediate Field of|
|(all elements are fixed by )|
|(only is fixed by all automorphisms)|
From this, we can see that is fixed by .
The challenge claims to be a more complex version of 1337crypt from last year's DUCTF. Reading the solution for 1337crypt may help a bit with some initial ideas as both challenges play with the idea of partial knowledge. In both challenges, we are given hints to recover the prime factors, but the hints themselves seem to omit some bits of information. Specifically, the hints are the integer parts of some values that should actually contain a fractional part as well. As we will see (and as you may suspect :)), lattice techniques will help us to recover this missing information and therefore the primes.
We start by looking at how the flag is encrypted so that we have a clear idea of what we need to recover it. The flag is encrypted in a way very similar to RSA. The only difference is that instead of using integers, we use Gaussian integers. If you aren't familiar with Gaussian integers, I highly recommend reading this excellent blog post by CryptoHack for some extra background.
We work in the ring (we will just write this as ) where is the RSA modulus. The flag is encoded as an element , of by choosing a random integer and computing .
The ciphertext is obtained by raising to the th power. In the challenge, is 0x1337.
To decrypt the ciphertext, we must find the multiplicative inverse of modulo the order of . (Note: in this context, when we say "the order of " we mean the cardinality of the multiplicative group of ) The order of is , so to compute it, we'll need to recover the primes.
The challenge generates 1337-bit primes and and gives us two hints, and the ciphertext. Using these hints, we need to recover and .
The first hint, which we will call , is . This seems like a pretty big hint, and indeed if we were also given we could easily recover the primes by finding the roots of a simple univariate polynomial. As far as I am aware, can't be used to directly recover and either; there are techniques such as factoring over the Gaussian integers, but this requires factoring over the integers which may take a long time. That said, if you were able to find a solution using this hint alone I'd be interested in seeing it :)
There is a lot more going on in the second hint. But firstly, some background. Number field is just a fancy name for an (finite) extension of the rational numbers . The simplest number field is itself, and the rational complex numbers, is also a number field.
We can see that a number field is constructed from and . In Sage, the first argument to NumberField is the defining polynomial. The roots of this polynomial specify the elements to be adjoined to to get the number field. We see that the defining polynomial for is , which has roots . Therefore, we define
(side note: we only need to adjoin one root since the other, it's conjugate, can be obtained from the root with usual field operations). In the handout code, denotes this adjoined element.
Now, for the values we are given. We get two instances of values with the same form. For the th hint (), two 1337-bit numbers and are generated, as well as two -bit numbers and . Then, is computed as
is computed and we are given the three values . Note that denotes the complex conjugate of , so the value gives us the squared complex modulus of . For ease of reading, we will write instead of .
Importantly, we note that is an element of and its components are rational numbers with fractional parts. Similarly, is a rational number with a fractional part. However, we are only given the integer parts of these values, so it seems like we might be missing some information.
Let's analyse the second hint in further detail and see if we can use it to recover the primes directly, or find a relation involving the primes that will help us to do so.
Analysing Hint 2
There will be some tedious algebra, so to make things a bit more readable we will drop the subscripts. Note that we use hint 1 here since appears. We have
Now, this looks like a mess and it kinda is, but fortunately we can clean it up a bit. Recall that and are 1337-bit numbers, while and are 338-bit numbers (). The following table gives the approximate size of each term (when considered as an integer), which we will use to reason with soon. Note that is on the order of bits.
|Term||Size (in bits)|
Now, what we will do is divide (integer division) the entire expression by . Because is around bits in size, we can more or less throw away any term whose size is or less. This leaves us with
This approximation itself is very accurate, and only differs by a few bits due to the terms. However, we don't know , so this equation isn't as useful for us. Instead, we write
where . Note that the approximation is off by a term of size approximately bits because of integer division rounding. We will omit the , but it should be understood that we are performing integer division.
Using Hint 2
Now let's bring back the subscripts, noting that we have two instances:
Finally, let's combine these two equations to eliminate .
Just so it's easier to read, let and . Then, rewriting the above equation, we have:
or to put it in a more exciting way:
Now, has "small" integer roots and . An algorithm for finding small roots of multivariate polyomials over the integers is described in this paper. Following a similar idea, defund's coppersmith implementation can also be used by working over an arbitrary ring for some large .
However, we can also recover and with a very simple lattice. Consider the lattice generated by the rows of the following matrix:
The short vector is an element of this matrix, given by the linear combination of times the first row, times the second row, and times the third row. LLL finds this vector.
Recovering the Primes
Now that we have , it is straightforward to recover . We use the approximation we found earlier:
This approximation is accurate to all but a few bits, which we can easily exhaust over, checking if is a square to know which candidate for is correct. We then compute