On December 12, 2009, we factored the 768-bit, 232-digit number RSA-768 by the number

field sieve. The number RSA-768 was taken from the now obsolete RSA Challenge list as a representative 768-bit RSA modulus. This result is a record for factoring general integers. Factoring a 1024-bit RSA modulus would be about a thousand times harder, and a 768-bit RSA modulus is several thousands times harder to factor than a 512-bit one. Because the first factorization of a 512-bit RSA modulus was reported only a decade ago it is not unreasonable to expect that 1024-bit RSA moduli can be factored well within the next decade by an academic effort such as ours. Thus, it would be prudent to phase out usage of 1024-bit RSA within the next three to four years.

The following effort was involved. We spent half a year on 80 processors on polynomial selection. This was about 3% of the main task, the sieving, which was done on many hundreds of machines and took almost two years. On a single core 2.2 GHz AMD Opteron processor with 2 GB RAM per core, sieving would have taken about fifteen hundred years. This included a generous amount of oversieving, to make the most cumbersome step, the matrix step, more manageable. Preparing the sieving data for the matrix step took a couple of weeks on a few processors, the final step after the matrix step took less than half a day of computing, but

took about four days of intensive labor because a few bugs had to be fixed.

It turned out that we had done about twice the sieving strictly necessary to obtain a usable matrix, and that the extra data allowed generation of a matrix that was quite a bit easier than anticipated at the outset of the project. Although we spent more computer time on the sieving than required, sieving is a rather laid back process that, once running, does not require much care beyond occasionally restarting a machine. The matrix step, on the other hand, is a more subtle affair where a slight disturbance easily causes major trouble, in particular if the problem is by its sheer size stretching the available resources. Thus, our approach to overspend on an easygoing part of the computation led to a matrix that could be handled relatively smoothly, thereby saving us considerable headaches. More importantly, and another reason behind the oversieving, the extra sieving data allow us to conduct various experiments aimed at getting a better understanding about the relation between sieving and matrix efforts and the effect on NFS feasibility and overall performance. This is ongoing research, the results of which will be reported elsewhere. All in all, the extra sieving cycles were well spent.

These figures imply that much larger matrices are already within reach, leaving preciously little doubt about the feasibility by the year 2020 of a matrix required for a 1024-bit NFS factorization. As part of the experiments mentioned above we also intend to study if a single large cluster would be able to handle such matrices using the block Lanczos algorithm (cf. [8]). Compared to block Wiedemann this has advantages (a shorter, single sequence of iterations and no tedious and memory-hungry central Berlekamp-Massey step [40]) but disadvantages as well (it cannot be run on separate clusters and each iteration consists of a multiplication by both a matrix and its transpose).

At this point factoring a 1024-bit RSA modulus looks more than five times

easier than a 768-bit RSA modulus looked back in 1999, when we achieved the first public factorization of a 512-bit RSA modulus. Nevertheless, a 1024-bit RSA modulus is still about one thousand times harder to factor than a 768-bit one. If we are optimistic, it may be possible to factor a 1024-bit RSA modulus within the next decade by means of an academic effort on the same limited scale as the effort presented here.

Another conclusion from our work is that we can quite confidently say that if we restrict ourselves to an open community, academic effort as ours and unless something dramatic happens in factoring, we will not be able to factor a 1024-bit RSA modulus within the next five years (cf. [29]). After that, all bets are off.

The ratio between sieving and matrix time was almost 10. This is probably not optimal if one wants to minimize the overall runtime.

Minimization of runtime may not be the most important criterion. Sieving is easy, and doing more sieving may be a good investment if it leads to a less painful matrix step.

Our computation required more than 1020 operations. With the equivalent of almost 2000 years of computing on a single core 2.2GHz AMD Opteron, on the order of 267 instructions were carried out.