M(3326400) = 23326400-1
We are seldom interested in the full decimal expansion of 23,326,400-1. This Mersenne number is highly composite. The exponent is a highly composite 11-smooth number, which results in many Algebraic Factors and Aurifeuillian Factors; we are usually interested in the Primitive Part of one of these factors. When necessary, the full decimal expansion of 23,326,400-1 is 1,001,347 digits.
Prime factors of Mersenne numbers are occasionally useful to mathematicians. For example, NFSNET factored M(713)=2713-1 to help in Richard Brent and Paul Zimmerman's research into primitive trinomials, which have applications in cryptography, coding theory, and random number generation. Factors of Mersenne numbers are also central to the study of covering sets for Sierpinski, Riesel, and Brier numbers. The Cunningham Project has been collecting Mersenne factors (and other factors) since 1925. There is enough interest that Will Edgington updates the file of known factors every few weeks on his web site.
If a number has a 15 digit factor, then each run of the Elliptical Curve Method with B1=2000 has one chance in 25 of finding that factor. Smaller factors will be found with higher probability, larger factors will be found with smaller probabilility. For 20 digit factors we set B1=11000; that gives one chance in ninety of finding a 20 digit factor. We run 25 curves with B1=2000, then 90 curves with B1=11000, then 300 curves with B1=50000, etc. The ECMserver program keeps track of the curves and raises the B1 value when enough curves have been run.
In reality, the chance of finding a number depends on the B2 value, too. The previous paragraph is correct for the traditional use of B2 = 100 * B1. The newest version of the ECM uses higher values of B2, so it has higher probability of finding a factor. But there isn't yet a new version of the ECMserver.
Very few of the people that use ECM understand how it works. The best that most of us can do is to understand Jim Howell's explanation. If you already understand elliptic curve arithmetic, you might understand Paul Zimmerman's description of ECM in The Elliptic Curve Method.
If a divides n, then we know from algebra that (xa-1) divides (xn-1) and sometimes (xa+1) divides (xn+1). Hence, for every divisor "a" of the exponent "n", we know that (2a-1) is a factor of (2n-1) and that sometimes (2a+1) is a factor of (2n+1). The factors (2a-1) and sometimes (2a+1) are called Algebraic Factors.
For the "plus" case, "sometimes" happens only when "a" divides "n" odd number of times. That is, if (n/a) is an odd number, then (xa+1) divides (xn+1) and (2a+1) divides (2n+1) and (2a+1) is called an Algebraic Factor of (2n+1).
Because 33 and 77 (and other numbers) divide 231, we know that (233-1) and (277-1) (and other numbers) are algebraic factors of (2231-1). You can remove the contributions of these algebraic factors from (2231-1). When you have removed all contributions of all algebraic factors, the part that is left is called the primitive part.
Note that to calculate the primitive part you cannot just divide by the algebraic factors. In the example, (233-1) and (277-1) both have factors of (211-1). If you divide (2231-1) by both of these numbers, you will divide by the factor (211-1) twice, but (2231-1) has this factor only once. Calculation of the primitive primitive part requires compensating for these multiple contributions.
When we find factors, we report them to Will Edgington according to the primitive part they belong to. Will's files show only the factors of primitive parts directly. To get complete factorizations you must also look up the factors of the algebraic factors.
As described above, we need to account for the multiple contributions that happen because small algebraic factors are themselves factors of multiple large algebraic factors. To organize this, first list all the factors in ranks, where the factors in each rank have the same number of prime factors. For the exponenet 231 (=3 x 7 x 11), there are four ranks:
|1. Three Primes:||231|
|2. Two Primes:||21, 33, 77|
|3. One Prime:||3, 7, 11|
|4. Zero Primes:||1|
Consider the case of (2125-1). When we divide this by the algebraic factor (225-1), we have also divided by the algebraic factor (25-1), but we only divided by this factor once. There is no need to also multiply by (25-1) because there has been no multiple division to compensate for.
First find the product of the distinct odd primes taken only once - this is the "basic" exponent. Form the expression for the primitive part of the number with the basic exponent. Then find the ratio of the exponent to the basic exponent. Increase every exponenent in the expression by this ratio.
For example, to calculate the primitve part of (21386+1), we first find that the distinct odd primes are 3, 7, and 11, and the basic exponent is 231. The expression for the primitive part of (2231+1) is show above. The multiplicity is 1386 / 231 = 6. Hence the primitive part of (21386+1) is
Aurifeuillian factors are "kind-of" algebraic factors.
They are derived from the polynomial factorization
If "n" is an exponent that has Aurifeuillian factors, and "p" is a prime that is equal to 1 or 7 mod 8, then the "minus factor" for n, usually called Ln, divides the larger minus factor, Lpn. Likewise for the plus factors, called Mn, divides Mpn. For the other odd primes, where p is equal 3 or 5 mod 8, there is a "crossover," so Ln divides Mpn and Mn divides Lpn.
Continuing the examples, for p=7 and n=7*22=154, we have the Aurifeuillian factors:
Consider p=3 and n=3*22=66. This has the Aurifeuillian factors:
Except for the crossover issue, the divisibility and multiplicity issues for Aurifeuilian factors are identical to those issues for algebraic factors. Hence you can proceed in the same manner, determining the basic exponent, listing the factors in ranks, multiplying and dividing factors based on their rank, and apply the ratio. The small complication is that you must determine for each factor whether to use the L-Aurifeuillian (the "minus" one) or the M-Aurifeuillian (the "plus" one) to preserve the divisibility according to the divisibility rule.
Aurifeuillian factors split algebraic factors into two nearly equal components. Aurifeuillian Primitive Parts split Algebraic Primitive Parts into two nearly equal components.
The standard approximation for situations where there are many independent "opportunities" with a low probability of "success" in each opportunity is the Poisson probability distribution. This says the probability of having exactly "k" successes, if the expected (or average) number of successes is λ (lamda), is
This approximation becomes exact in the limit, as the number of "opportunities" becomes larger and the probability of success becomes smaller in such a way that the expected number of successes stays constant at λ (lambda). One use of this approximation is estimating the probability that a number has "k" divisors in a range, often for zero divisors in the range. We will also use this approximation below to estimate the probability of zero successes in "N" elliptical curve factoring attempts.
If the size range isn't too near the square root of our number, and we don't know of any other factors, then a reasonable heuristic is to observe that the probability a number "x" is prime is, according to the Prime Number Theorem, about 1/ln(x), and if x is prime, the probability x divides our number is 1/x. Adding this up, the number of factors expected in a range is:
The ECM algorithm is usually described as working on integers of 15 or 20 or 25 etc digits, depending on the value of B1. Twenty-five digits numbers range from 1024 to 1025, so we can guess that the ECM algorithm will find, roughly, numbers from 1022 to 1027 when working on twenty-five digit numbers. Ln(27/22) = 0.2048, so on average we would expect there to be about one prime factor in this range for every five composites considered. For other ranges the heuristic is:
|15 digits||2K||0.3483||35 digits||1M||0.1452||55 digits||110M||0.0918|
|20 digits||11K||0.2578||40 digits||3M||0.1268||60 digits||260M||0.0841|
|25 digits||50K||0.2048||45 digits||11M||0.1125||65 digits||850M||0.0776|
|30 digits||250K||0.1699||50 digits||43M||0.1011||70 digits||2.9G||0.0720|
These numbers apply directly for most of the primitives in the ElevenSmooth project. The major exceptions are those numbers which have Aurifeuillian factors. For those numbers we know a factorization of the primitive, and these estimates apply to each of those factors, resulting in twice as many expected factors of a given size.
The size, in digits, of the number "z" is ln(z)/ln(10).
The average size is the sum of the sizes divided by the number of factors.
Over the range 10a to 10b, this is:
First of all, there must really be a factor of the appropriate size; see the earlier question for information about this. Even if the factor exists, the probability of finding it on a particular curve depends on the B2 level as well as the B1 level. A common choice is to set B2 as 100 times B1; Previous versions of GMP-ECM used this convention, the Alpertron Java applet uses this, and ElevenSmooth uses this with Prime95 in the Special Project. Version 5 of GMP-ECM, the program ElevenSmooth uses with the ECM server, has improvements for Stage 2. To take advantage of these improvement, the default B2 value is higher, resulting in a higher probability of finding a factor on each curve. The following table shows the probabilities in the form "1 out of n".
|15 digits||2K||25||30||35 digits||1M||1800||1100||55 digits||110M||49000||22000|
|20 digits||11K||90||90||40 digits||3M||5100||2900||60 digits||260M||124000||52000|
|25 digits||50K||300||240||45 digits||11M||10600||5500||65 digits||850M||210000||83000|
|30 digits||250K||700||500||50 digits||43M||19300||9000||70 digits||2.9G||340000||120000|
We can use the Poisson Approximation (see above) to answer this question. First we find λ (lambda), the expected (or average) number of times we will find this factor in N trials. This is "N" divided by the appropriate number from the table. We are most interested in the probability of finding the factor zero times (k=0), because subtracting that from one gives the probability of finding the factor one or more times.
When you perform the "full set" of curves, the value of λ (lamda) is "1", the probabilty of not finding the factor is (1/e), and the probability of finding the factor at least once is 1-(1/e).
There are two easy ways to estimate this. Fortunately, they give the same answer. Earlier we showed that the expected number of factors from 10a to 10b is ln(b/a). We also described how to use the Poisson approximation to caluculate the probability of zero events. Put together,
When Sander found the 24 digit factor of M(26400), the 1886 digit residual cofactor turned out to be prime, too.
The likelihood can be approximated by asking,
"Among those 1886 digit numbers with no factors under 24 digits, how many of them are prime?"
We can estimate the fraction of numbers with no factors under 24 digits from
and the fraction of primes from the
Prime Number Theorem.
If the factor bound is 10a and the number is 10n, this is
We can pull together several answers in the FAQ to estimate this. For a range 10a to 10b, we know the expected number of factors. From the Poisson Approximation, we know the probability of exactly "k" factors. We know the average size of the factors in the range. We can calculate the average size of the residual cofactor after removing all "k" factors. The residual cofactor will have no factors through 10b; we can calculate the probability it is prime. Putting it all together, if we have a number near 10n that has previously had prime factors through 10a removed, and we remove the prime factors through 10b, the probability the residual is also prime and hence the factorization is complete is:
These assumptions break down if there is a nontrivial probability the factors themselves complete the factorization (i.e. the number is 10b smooth or equivalently the residual is 1). In that range a diferent approximation would be necessary. Some results are tabulated:
|300 digits||1000 digits||3000 digits||10000 digits|
|15 to 30|
|15 to 40|
|15 to 50|
|20 to 30|
|20 to 40|
|20 to 50|
|30 to 40|
|40 to 50|
As of February 16, 2004, there are 62 composites being distributed by the ECM Server, ranging from 149 to 6936 digits. They are listed in this composites file. The names at the beginning of each line are explained in the FAQ about Received Work. The ECM Progress can be tracked on the Progress Page.
Sam Wagstaff, the present keeper of the Cunningham Project, says it ”seeks to factor the numbers bn ± 1 for b = 2, 3, 5, 6, 7, 10, 11, 12, up to high powers n.” The history of the Cunningham Tables has been to extend the base 2 tables in every edition. Because 23,326,400-1 has many algebraic factors, ElevenSmooth finds factors for many exponents, including small exponents such as 21485-1 and 21575-1. You can see the ElevenSmooth factors in Mersenne order by clicking on the Mersenne column header on the Factors Page. So Elevensmooth is, at least philosophically, part of the Cunningham Project and the Cunningham Tables will grow into our factors.
The major activity of the Great Internet Mersenne Prime Search (GIMPS) is searching for prime numbers. But there is also a part of GIMPS that is factoring 2n-1 and 2n+1. ElevenSmooth is, at least philosophically, part of this aspect of GIMPS.
Not unless there is a theoretical breakthrough. Theory provides methods to find factors up to 50 digits (ECM, P-1, P+1), and theory provides methods to factor numbers up to 180 digits (250 digits if they have a special form). Faster hardware and larger networks of computers might raise these limits by 10 or 20 digits, but only the invention of a new factoring technique could take us much beyond that.
When it isn't fun anymore. Our smallest factors will become candidates for the NFSNET Project after they have been tested for factors up to 50 digits. So even if we stop finding factors, we will be making progress towards the eventual factoring of these numbers.
I was originally interested in partial coverings for the Seventeen or Bust project. A covering can be reused if the size of the new covering is a multiple of the old size. For example, a covering of size 36 can be reused within a covering of size 72. I kept multiplying by one more small number until I reached 3,326,400. The possible numbers in a covering of size "n" are exactly the prime factors of 2n-1, so I became interested in the factors of 23,326,400-1. I learned that many factors were unknown but within reach of factoring technology; thus was born the ElevenSmooth Project. None of these additional factors have been useful for Seventeen or Bust, but I've become hooked on finding them.