Skip to main content

NT Part 2: Generating Primes, Prime Test, Prime Factorization

 

Generating primes fast is very important in some problems.
Let's cut to the chase and introduce Eratosthenes's Sieve.

The main idea is the following. Suppose we want to find all primes between 2 and 50.
Iterate from 2 to 50. We start with 2. Since it is not checked, it is a prime number. Now check all numbers that are multiple of $2$ except 2. Now we move on, to number 3. It's not checked, so it is a prime number.
Now check all numbers that are multiple of $3$, except 3. Now move on to 4. We see that this is checked - this is a multiple of 2! So 4 is not a prime. We continue doing this.

Here's the implementation.

#include <stdio.h>
int primechk[21000];
 
void preprocess(void)
{
    int i, j;
    for(i=2 ; i<=20000 ; i++)
    {
        primechk[i]=1;
    }
    for(i=2 ; i<=20000 ; i++)
    {
        if(primechk[i]==1)
        {
            for(j=2 ; i*j<=20000 ; j++)
            {
                primechk[i*j]=0;
            }
        }
    }
}
 
int main(void)
{
    preprocess();
    int i, cnt=0;
    for(i=2 ; i<=20000 ; i++)
    {
        if(primechk[i]==1)
        {
            cnt++;
            printf("%d is the %dth prime!\n",i,cnt);
        }
    }
}


Okay, so what is the time complexity? To get all primes in the interval $[1,n]$, the TC of this algorithm is $O(n \log \log n)$
Very very fast. To prove this, notice that the number of iterations are something like $O(n)+\sum_{p \le n} \frac{n}{p}$ where $p$ is a prime.

Well, Merten's Second Theorem states that $\sum_{p \le n} \frac{1}{p} = \log \log n + O(1)$ (natural logarithm, by the way) so this prove the TC.

Now we know how to generate prime numbers fast. How about primality testing?

Naive solutions first. Given an integer $n$, we can check numbers up to $\sqrt{n}$ to find if a number divides $n$. If there is such a number, $n$ is composite. If not, $n$ is a prime. This gives the solution in $O(\sqrt{n})$.

Here's a "solution" in $O(c\ln n)$ using the fast exponentiation we will talk about in the next section. It's called Miller-Rabin Primality Test.
I'll introduce the deterministic version. We probably (hopefully) won't see stuff like $n> 4 \cdot 10^9$ in contests.

Here's the sketch of the algorithm. Choose some set of $a$. We will run the algorithm with different $a$s, and the more $a$s we run this algorithm with, the more accurate this primality test is going to be.

Decompose $n-1$ as $2^s \cdot d$. Then check if the following holds.
$$a^d \not\equiv 1 \pmod{n} \text{   and   }a^{2^rd} \not\equiv -1 \pmod{n} \text{    for all    } r \in [0,s-1]$$If there is an $a$ that satisfies this, $n$ is composite. If not, $n$ is a prime.

For $n<4.7 \cdot 10^9$, we can just check for $a=2,7,61$ and be sure about it.
For $n<2^{64}$, we can check for $a=2,3,5,7,11,13,17,19,23,29,31,37$ and be confident.

Now let us look at prime factorization of numbers.

Assume that we generated prime numbers using the Eratosthenes's Sieve.
If $n$ is in the "prime-generated" range, we can actually do prime factorization in $O(\log n)$.
Make another array. While we are doing the Sieve, for composite numbers, put "the first prime that verified that this number is composite" and for prime numbers, put itself. This is easy to implement.

Then we can start with $n$, and continue to divide the prime numbers in the array.

#include <stdio.h>
int primechk[21000];
int fprime[21000]; 
void preprocess(void)
{
    int i, j;
    for(i=2 ; i<=20000 ; i++)
    {
        primechk[i]=1;
    }
    for(i=2 ; i<=20000 ; i++)
    {
        if(primechk[i]==1)
        {
            fprime[i]=i;
            for(j=2 ; i*j<=20000 ; j++)
            {
                primechk[i*j]=0;
                if(fprime[i*j]==0)
                {
                    fprime[i*j]=i;
                }
            }
        }
    }
}
 
int main(void)
{
    preprocess();
    int n;
    scanf("%d",&n);
    while(n!=1)
    {
        printf("%d\n",fprime[n]);
        n=n/fprime[n];
    }
}


If not, we can just check through all primes less than $\sqrt{n}$ and divide by those primes until we can't.
If all the primes multiply up to $n$, we are done. If not, there is exactly one prime more than $\sqrt{n}$ that divides $n$.

Another algorithm involving prime factorization is Pollard's rho algorithm - since the pseudocode is simple, I'll leave you the wikipedia link. https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm#Algorithm

Comments

Popular posts from this blog

NT Part 6: Sieve of Eratosthenes(Details)

  About 2300 years ago from today, famous Greek mathematician Euclid proved that there are an infinite number of prime numbers. Since then people have been searching for these prime numbers. In 1849, one of the greatest mathematician of all time, Carl Fredrick Gauss, had identified almost all of the prime numbers within the first 3 hundred thousand whole numbers. In the age of computers, we can find large prime numbers in the blink of an eye. But to do that, we need to know a bit of programming and a 2000 year old algorithm. By the end of this tutorial, you will be able to figure out a solution on your own to Gauss’s problem. What is a Prime Number? A prime number is an integer number that is divisible by 1 and the number itself only. For example, 7 is divisible by 1 and 7 only. But 6 is not a prime number because 6 is be divisible by 2 and 3 as well. It is worth mentioning that 1 itself is not a prime number. Now if you are asked to determine if a number is a prime number, you can...

NT Part 1:GCD, LCM, Euclidean Algorithm

  The definitions of GCD and LCM are well-known, (and taught in middle school I think) I will skip the definitions. Also, since  ,  calculating GCD is equivalent to calculating LCM. Now, how do we calculate the GCD of two numbers? A naive solution would be iterating over all positive integers no more than  . This will get the GCD in  ,  very very slow. We can calculate the GCD of   in   using Euclidean Algorithm. This algorithm uses the easy-to-prove fact  ,  where   is the remainder when   is divided by  ,  or just a%b. We can now use the following code. #include <iostream> using namespace std ; int gcd ( int u, int v ) { return u % v == 0 ? v : gcd ( v,u % v ) ; } int main ( void ) { int x, y ; cin >> x >> y ; cout << gcd ( x,y ) ; } How do we prove that this algorithm is  ?  Well, let's suppose that we started with  . Then...