As the name implies, the GCD of two integers is the largest integer that will evenly divide into both of them. Given two integers represented in prime factor form, determining the GCD is trivial - you simply identify all of the factors that they have in common and take the product of them.
So, on the surface, it would seem that, given two arbitrary integers, the most logical thing to do to find the GCD would be to break each integer into its prime factor representation. Unfortunately, doing that for all but relatively small integers is an exceeding hard thing to do - so hard that it has its own name, the Prime Factorization Problem. In fact, it is so hard that the security of some modern cryptographic ciphers, such as RSA, rely on it entirely. It is one of the classic "hard problems" in mathematics that untold mathematicians (and countless others) have devoted endless time, effort, and money trying to find a way to factor integers that is significantly more efficient that brute force. Surprisingly, while no one has ever developed such a method, neither has anyone ever been able to prove that such a method doesn't exist.
Does this mean that finding the GCD of two integers is equally as hard as finding their prime factorization? It might seem so at first glance, but the problems are really completely different. Given a single integer and the task of finding its prime factorization, we have nothing to start from other then trying to find numbers that will evenly divide into it. But, given two integers and the task of finding a factor that they both have in common, we can leverage the fact that we have that second, known number and that we know that if such a factor exists it is shared by both. In essence, we can use interactions between the two numbers in order to reveal the factor that they share.
That is what the Euclidean Algorithm does, but instead of simply presenting the algorithm in a cookbook fashion, we will instead develop it as we attempt to find a solution to this problem.
In this section, we will show that
gcd(x, y) = gcd(y mod x, x)
which, in the next section, will be used to develop the Euclidean Algorithm for finding gcd(x,y) in a very efficient manner.
We are given the task of finding the G such that it is the GCD of the two positive integers x and y. The mathematical expression for this is:
G = gcd(x, y)
If x and y are the same, then the answer is trivial, G = x = y. If they aren't the same, then obviously one is larger than the other. We'll agree to call the smaller of the two x. If y is an integer multiple of x, then this is also easy to determine by simply dividing y by x and looking for there to be no remainder. In other words, we would discover that y mod x is zero.
But what if there is a remainder? Certainly it tells us that x is the not the GCD we are looking for, but is that all it tells us? Let's examine this issue a bit more closely.
The remainder is the non-negative number r such that r is less than x and
y = Kx + r
for some integer value of K.
In mathematical terms, the value r is the residue of y reduced modulo x, or
r = y mod x
The question we are trying to answer right now is whether r tells us anything about GCD(x,y). So let's see what affect G, which we have defined as being the GCD of x and y (even if we don't happen to know what it is yet), has on the value of r.
If G is the GCD of x and y, then both numbers can be written as integer multiples of it.
x = KxG
y = KyG
where Kx and Ky represent the remaining factors of x and y, respectively, and which we know do not have any factors in common - otherwise they would have been factored out and made a part of G.
Combining the equations above yields
KyG = K(KxG) + r
which, when solved for r, becomes
r = KyG - K(KxG) = G(Ky - KKx)
This tells us that any factor shared by x and y is also shared by r.
But before we go rushing ahead, there is a fine point that we have overlooked. It is one thing to prove that r contains GCD(x, y) as a factor. However, doing so does not prove that r cannot contain additional factors in common with either x or y. It is not enough just to know that GCD(r, x) is a multiple of GCD(x, y), we must know that they are, in fact, equal.
Well, what happens if they aren't? What does that mean? It means that r and x share a factor that is not shared by y. Let's call that factor B.
x = KxGB
y = KyG
r = KrBG
Where, as before, the subscripted constants represent the factors of their respective values that are not already accounted for. As a consequence, no two of them share any factors - they are known as pair-wise relatively prime.
We already know that
y = Kx + r
meaning that
r = y - Kx = KyG - K(KxBG) = G(Ky - KKxB) = KrBG
This, in turn, means
G(Ky - KKxB) = KrBG
KrB = (Ky - KKxB)
Kr = (Ky/B) - (KKx)
Since Kr is an integer, the right hand side must evaluate to an integer. This in turn means that B must evenly divide Ky. But, Ky is a factor of y and B was specifically defined as not having any factors in common with y, thus we have a contradiction and our original premise, that there can exist a B (greater than 1) that is a factor of both r and x without also being a factor of y, is fale.
At this point we can state the following conclusion, which is the basis of the Euclidean Algorithm.
gcd(x, y) = gcd(y mod x, x)
In the previous section we established that
gcd(x, y) = gcd(r, x)
where
r = y mod x
Given two pairs of numbers, (x, y) and (r, x), both of which have the same GCD, does it matter which pair we use to find it? Of course not. And since it doesn't, why not use the pair having the smaller numbers? Since r is smaller than x, and x is smaller than y, that pair will be (r, x).
So, given the task of finding gcd(x, y), we can first find a third number, r, that shares the same GCD but is smaller than either and then set out to find GCD(r, x). There is nothing to prevent us from repeating this procedure again, finding yet another number that is smaller than both r and x and that also has the same GCD as r and x and, hence, the same GCD as x and y.
We can repeat this process as many times as necessary. At each step we get a smaller value of r. But since r must always contain G as a factor, the smallest value that r can ever become is G. How will we know when that happens? Simple, the next step will involve dividing a number that is a multiple of G by G leaving us with a remainder of zero.
But what if the original two numbers don't share any factors? In point of fact, they have to - they will always both be multiples of 1. In fact, one way of stating the definition of relatively prime is that two positive integers, x and y, are relatively prime if and only if gcd(x, y) is 1.That fact will be born out just as though it were any other common factor.
At this point, the actual algorithm should be somewhat anticlimatic:
Notice how the number of steps required to determine the GCD of two numbers is not strongly influenced by how large the numbers are, as demonstrated by the last two examples. This raises the question of why finding the GCD of 377 and 610 required so many steps? Some insight into this might be gained by considering why finding the CGD of 59479 and 118965 required so few?
Another angle to come at the question from is to recognize that the numbers in the list decrease monotonically as the algorithm proceeds. If we want the list to be long, then we want the amount by which the numbers decrease to be as slow as possible. What relationship between any three consecutive numbers in the list is required to minimize the decrease?
The remainder of this problem is, as the saying goes, left as an exercise for the reader.