Numbers are magical: magical in their properties, their existence, and, most importantly, their huge impact on our modern day-to-day lives. Everything can be reduced and narrowed down to numbers. In this second part of our series on large numbers, we'll delve more deeply into this magic.
If you take a walk in the park, a poet might say that you are escaping the hectic schedule of your daily life in a dramatic environment. A physicist or engineer, however, would say that at 15:34 you started moving from coordinates (X0, Y0) toward (X, Y) at a rate of 1 meter per second, burning 1000 kJ along the way, while mostly being slowed down by a headwind of 8 m/s.
You see, everything that happens around us can be reduced to numbers, and numbers are the topic of this article: mainly very large or very precise ones.
For anyone who has just found this article, let me say that this is the second part of a three-part saga. All of the parts are published here on the Dev Shed Network, so if you missed the first one, don't hesitate to search for it. We're going to pick up from where we left off; I won't cover the ground already crossed in the previous article.
As you may recall, in the first part we built the skeleton of the class in which we implement large numbers, and we also made sure that the time required for reading the input files and for the algorithms themselves (addition and subtraction) was as close to optimal as possible.
After completing the basics, it's time to get into a more complex operation, specifically multiplication. We shall break the ice with the elementary school method, detailing its procedure and its major disadvantages. After that I'll introduce you to the solution to our dilemma, namely Karatsuba's algorithm, and then explain the reasons why we should use it.
And since we don't like wasting time, at the end we will implement what we've learned in C++ using STL classes. So again, I advise you to arm yourself with some knowledge in this domain to fully understand the code snippets that are going to be presented here. Here we go now... get ready!
Multiplication is different from the functions we've completed so far. There must be a reason why, of everything children learn in primary school, the one thing most of them remember is the multiplication table. So the obvious question is: how much more complex is it?
Even the Romans, with their highly evolved society and their aqueducts, had serious problems with multiplication. The Romans' problem was that they did not use a radix (or base) number system. Multiplication can, of course, be perceived as and reduced to repeated addition, but before you hop off to implement it that way, think about the efficiency of multiplying 9999 by 556445: the running time grows with the value of the multiplier rather than with its digit count, so it would take far too long, which is unacceptable by any standard.
However, if we approach the problem with the standard algorithm taught to us in elementary school, the situation already looks a lot better. All we have to do is start from the last digit of one of the numbers and multiply the other number by each digit in turn. This is not too problematic. Then, shifting each partial product by the position its digit occupies from the back and adding all of the results together, we end up with the final answer.
An example, 999 * 123:

    999 * 3 =  2997
    999 * 2 =  1998  (shifted one place:  19980)
    999 * 1 =   999  (shifted two places: 99900)
              ------
              122877

which can be perceived as 2997 + 19980 + 99900 = 122877.
This results in a complexity of O(n^2). That is not too bad; however, we can still improve on it. Don't throw this method away, though, as we shall come back to it. Nevertheless, the computation can take an awfully long time even this way. Remember that addition, with its O(n) complexity, already ran for 8.5 seconds on two 10 MB files. The difference may seem small, but imagine it for very, very large numbers.
For example, what about two 1 MB numbers? Each holds 1024 * 1024 digits. Square that, as the schoolbook method requires, and you get 1024 * 1024 * 1024 * 1024 digit operations: a very significant difference, to say the least. Fortunately, some highly intelligent people examined this method and found an even faster one. Shall we start learning it? Let's begin!
This algorithm was discovered by Anatolii Alexeevich Karatsuba in 1960 and published in a joint paper with Yu. Ofman in 1962. The idea behind it is to trade some of the expensive multiplications for shifts and additions, operations whose running time is only linear.
Karatsuba's algorithm can be considered one of the first binary splitting algorithms, or what is more commonly known today as divide and conquer. "Divide et impera," said Julius Caesar, and that is the way Karatsuba's algorithm works too.
For a start, every number can be written in the following form:

x = x1 * B^m + x2

where B is the base (in our case 10, as we are working in the decimal system) and m is a number smaller than the digit count of x. Now just do the same for the other number. Let it be y, and we get the following:

y = y1 * B^m + y2
Executing the multiplication in this form yields the equation below:

x*y = (x1*y1) * B^(2m) + (x1*y2 + x2*y1) * B^m + x2*y2
In this form we have four multiplications, but each operates on numbers with fewer digits than the originals. There are also additions, but as I suggested earlier, addition is so cheap (hence executed so much faster) that we can almost ignore its execution time. However, this alone isn't enough. Karatsuba observed that the expression above can be reduced to only three multiplications; check it out below.
X = x1*y1
Y = x2*y2
Z = x1*y2 + x2*y1
  = (x1*y2 + x2*y1 + x1*y1 + x2*y2) - x1*y1 - x2*y2
  = [x1*(y1 + y2) + x2*(y1 + y2)] - x1*y1 - x2*y2
  = (x1 + x2) * (y1 + y2) - X - Y
Now we have three multiplications of numbers smaller than the originals, plus two additions and two subtractions. Because addition and subtraction take only linear time, O(n), multiplication becomes much faster this way. Furthermore, we can call Karatsuba's method recursively for each of the three smaller multiplications.
As for m, it is most efficient to make the two created pieces equal in size, so let m be the number of digits divided by two. By implementing the algorithm in this way, its complexity is reduced to O(n^(log 3 / log 2)) = O(n^1.585). That is quite a difference.
For example, multiply 324 * 1010.

Step 1: split both numbers with B = 10 and m = 2:

    324  =  3 * 10^2 + 24
    1010 = 10 * 10^2 + 10

Step 2: compute the three products:

    X = 3 * 10 = 30
    Y = 24 * 10 = 240
    Z = (3 + 24) * (10 + 10) - X - Y = 540 - 270 = 270

And the result is: X * 10^(2*2) + Z * 10^2 + Y = 30*10^4 + 270*10^2 + 240 = 327240
If you still don't get the idea, there are applets online that show the process in great detail. You will also observe that this algorithm works best when X and Y are composed of the same number of digits. And if you run some tests, you will see that the classic method is actually faster for smaller numbers, roughly those below 2^320.
To overcome this, most large number libraries, like the one we are writing now, switch back to the classic method below this border. In my implementation you will find a SWITCH_TO_CLASSIC value set to 121; it turned out to be the best value after some tests. If you want to try it out yourself, just change this value and run the program on two numbers, say, 10K long (1024 * 10 digits).
#define SWITCH_TO_CLASSIC 121
Now that you have a good grasp of the algorithm, you could also take a look at more sophisticated (but harder to implement) approaches, like Toom-Cook multiplication or the even faster Schönhage-Strassen algorithm, or simply implement the one we've covered. If your choice is the latter, just turn the page.
In the previous pages we learned everything we need in order to produce the desired result. Because we handle real numbers, not only integers, we resolve the decimal point by calling Karatsuba's algorithm on the two digit sequences and calculating the position of the decimal point in the result after the operation finishes: the product has as many fractional digits as the two operands combined.
And here let me present Karatsuba’s method:
Large_Numbers
Large_Numbers::Karatsuba (Large_Numbers& ls, Large_Numbers& rs)
{
    Large_Numbers result; // here we'll store the result
    if (ls.all < SWITCH_TO_CLASSIC && rs.all < SWITCH_TO_CLASSIC)
    {
        // small numbers: the schoolbook method is faster
        ClassicMultiply (result, ls, rs);
    }
    else
    {
        // m = half the digit count of the longer number
        unsigned long int size = std::max (ls.all, rs.all) / 2;
        string::iterator ls_begin = ls.number.begin ();
        string::iterator ls_end   = ls.number.end ();
        string::iterator rs_begin = rs.number.begin ();
        string::iterator rs_end   = rs.number.end ();
        // length of the high (leftmost) part of each number
        long int leftLeftLength = ls.all - size;
        if (leftLeftLength < 0) leftLeftLength = 0;
        long int rightLeftLength = rs.all - size;
        if (rightLeftLength < 0) rightLeftLength = 0;
        string::iterator ls_middle = ls_begin + leftLeftLength;
        string::iterator rs_middle = rs_begin + rightLeftLength;
        // split ls into x1 (high part) and x2 (low part)
        Large_Numbers x1;
        if (leftLeftLength)
            x1.Set (string (ls_begin, ls_middle),
                    std::distance (ls_begin, ls_middle));
        Large_Numbers x2;
        x2.Set (string (ls_middle, ls_end),
                std::distance (ls_middle, ls_end));
        // split rs into y1 (high part) and y2 (low part)
        Large_Numbers y1;
        if (rightLeftLength)
            y1.Set (string (rs_begin, rs_middle),
                    std::distance (rs_begin, rs_middle));
        Large_Numbers y2;
        y2.Set (string (rs_middle, rs_end),
                std::distance (rs_middle, rs_end));
        // the three recursive multiplications
        Large_Numbers sumX = x1 + x2;
        Large_Numbers sumY = y1 + y2;
        Large_Numbers X = Karatsuba (x1, y1);
        Large_Numbers Y = Karatsuba (x2, y2);
        Large_Numbers Z = Karatsuba (sumX, sumY) - X - Y;
        // recombine: X*B^(2m) + Z*B^m + Y
        X.ShiftIt (2 * size);
        Z.ShiftIt (size);
        result = X + Y + Z;
    }
    return result;
}
A brief explanation: if we're within the threshold, simply calculate the result with the classic method. Otherwise, find the middle, split the numbers, compute the sums of the pieces, call Karatsuba's algorithm recursively for the three multiplications, shift the results, and add them together. Return the result.
Now all we need to do is check out the results! For obvious reasons I won't venture to multiply two 10 MB numbers, because it would take an awfully long time on my modest system (AMD Sempron 3000+ at default clocks, 1 GB of DDR400 memory, and a Western Digital 200 GB ATA133 hard disk). Instead, look first at two 10K numbers, and after that at two 100K numbers.
And a little longer for the 100K numbers:
As you can see, multiplication is considerably more complex than addition and takes much more time to complete. Also, here, as promised, is the source code with the class implementation, so you can skim through it freely and eventually step through it with a debugger. The included project was created with Visual Studio 2005, so earlier versions can't open it, but if you create a new solution the code should still work.