(Last Mod: 27 November 2010 21:38:41 )
Up to this point, we have only been able to print individual characters to the screen. Those characters could, of course, have been control characters that moved the active position on the display in a certain way. What if we want to do more than that?
Well, it doesn't get much better than this and we need to accept the fact that this is pretty much a fundamental limitation for us.
"What?" you might ask. What if we want to print an entire phrase, such as "Hello World!", to the screen with a single command? What if we want to print the decimal value of an ASCII code, instead of the character that it maps to, to the screen? What if we want to print the states of all the bits in a byte? What if we want to print the result of dividing 3 into 50 to three decimal places? Can't we do those things?
Not really. Not directly. In the display mode we are using, namely one of the "text modes", we can print characters to the screen. That's it.
If we want to print a phrase to the screen, we will end up printing it out character by character just as we did in our first program.
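As a minimal sketch of what "character by character" means in practice, the following hypothetical helper (the name put_phrase is not part of the course code) walks a string and hands each character to putc() in turn. It returns the number of characters sent only so that its behavior is easy to check.

```c
#include <stdio.h>

/* Hypothetical helper: print a NUL-terminated phrase one character
 * at a time and return how many characters were sent to the display. */
int put_phrase(const char *s)
{
    int count = 0;

    while (*s != '\0')
    {
        putc(*s, stdout);   /* one character per call, nothing more */
        s++;
        count++;
    }
    return count;
}
```

Calling put_phrase("Hello World!") results in twelve separate putc() calls.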
If we want to print out the decimal value of an ASCII code, say code 65, then we have to determine that we want to print two characters, the first of which is a '6' and the second of which is a '5'. Then we will call the same putc() function as before to actually send those two characters to the display.
If we want to print out the state of individual bits in an integer, we will need to determine, bit-by-bit, what the state of each bit is and then print the character '0' or the character '1' to the screen using the putc() function.
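The bit-by-bit idea can be sketched as follows for a single byte. The helper names bit_char and put_bits are assumptions made for this illustration, and the choice to print the most significant bit first is likewise just one reasonable convention.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stdio.h>    /* putc()   */

/* Hypothetical helper: return '1' if the given bit of byte is set, else '0'.
 * Bit 0 is the least significant bit. */
char bit_char(unsigned char byte, int bit)
{
    return ((byte >> bit) & 1u) ? '1' : '0';
}

/* Hypothetical helper: print every bit of byte, most significant first,
 * and return the number of characters printed. */
int put_bits(unsigned char byte)
{
    int bit;
    int count = 0;

    for (bit = CHAR_BIT - 1; bit >= 0; bit--)
    {
        putc(bit_char(byte, bit), stdout);
        count++;
    }
    return count;
}
```

Each character that reaches the screen is still just a '0' or a '1' passed to putc(); all of the work lies in deciding which one to send.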
If we want to print the result of 50.0/3.0 to three decimal places, we must figure out how to determine that the first character we want to print (using putc()) is a '1', that the next one is a '6', that then we want to print a period, and then after that we want to print the characters '6', '6', and '7'.
So, you might be thinking, what if we used a different display mode?
All of the display modes available on the overwhelming majority of computers fall into one of two categories - text and graphics.
We could, if we were to use the Borland Graphics Interface (BGI) or some other non-ANSI compliant set of functions, work in a graphics display mode and thereby set the colors of individual pixels on the display. But, for our purposes here, that would be taking a huge step backwards. Now we would not only have to figure out that we want to print something that looks like a '6' on the screen, but we would have to figure out exactly which pixels to turn on in order to make that happen. In text mode, when we print the character '6' to the screen, the graphics display driver - which is usually now integrated into the graphics card - takes the code we sent it and looks up, in what is known as a "font table", the pattern of pixels that needs to be set to the present drawing color in order for us to see what we perceive as the character '6' at the active position on the screen.
Do we really want to take responsibility for doing that? In some situations the answer would be yes - because in accepting the responsibility we also gain the authority to exercise a great deal more control over what we see on the screen. In our case, we don't want to take on that responsibility so we choose to live within the constraints and limitations imposed on us by the people that did accept that responsibility and wrote the lower level primitive functions that get invoked when we call functions like putc(). As long as we can do what we need to do within the limitations of those functions, it is a very justifiable decision to do precisely that. In general, use the highest level functions that you can that will perform the desired task - whether they be ones that are provided to you or ones that you wrote yourself. You should only work at a significantly lower level if the higher level functions won't do something you really need or you are trying to learn how lower level functions work.
At this point in this course, you are learning how to implement higher level functionality using lower level functions - so you have motivation to work at a lower level than is normally necessary. So we will develop ways to accomplish the higher level tasks described above using only the putc() function and then, once we understand how this is done, we will start using the higher level functions that are provided for these purposes. In approaching things this way, we will gain an appreciation for how these functions work and, in the process, be able to better debug our programs when we use them incorrectly. We will also be able to apply the techniques we learn here to many other programming tasks - tasks for which we have no choice but to write the higher level functions ourselves.
We will start with a very basic and rather brute force way of printing a positive integer value to the screen. Let's say that we have the value 45,833 stored in a variable called n which is of type unsigned int. We know that this value is somewhere between 0 and the maximum value that an unsigned int can store. On Borland TurboC/C++ v4.5 this happens to be 65,535, but it could easily be different. For instance, on v5.0 of this same compiler it happens to be 4,294,967,295. We want our function to work regardless of what the range of values is.
The first thing we need to do for our example (45,833) is figure out that we need to print a '4'. How can we do this? There are a number of ways. The method that we will develop here is neither the best nor the worst - but it is easy to understand and quite acceptable for our needs.
Since we are displaying our number, n, in base ten, we will start by answering the following questions until the answer changes:
Now we know that we have five digits and the most significant digit represents groups of 10,000. However, there is actually a subtle trap with the above algorithm. It works fine on paper, but if the largest value we can represent is 65,535, how can we ask if the present number is bigger than 100,000 - a number which we can't represent?
A big part of algorithm development is figuring out how to perform the same logical steps but in such a way that we don't violate the limits of our system. It frequently comes down to just asking the same question but in a slightly different fashion. Consider the following instead:
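One way to ask the same question without ever forming a value larger than n is to divide n by the current power of ten rather than compare n against the next one. The sketch below wraps that idea in a hypothetical helper, leading_place_value, which is not part of the course code.

```c
/* Hypothetical helper: return the place value of the most significant
 * decimal digit of n (1, 10, 100, ...).
 * The test n/m >= 10 is true exactly when m*10 <= n, so m never grows
 * beyond n and the multiplication cannot overflow. */
unsigned int leading_place_value(unsigned int n)
{
    unsigned int m = 1;

    while (n / m >= 10)
    {
        m = 10 * m;
    }
    return m;
}
```

For n equal to 45,833 this stops at 10,000, and it behaves the same way whether the largest unsigned int is 65,535 or something much bigger.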
Now we will figure out how many such groups we have by asking the following questions:
When we finish the above code, n will have been reduced by 10,000 four times making n equal to 5833 and i equal to 4. Now, the value stored in the variable i is four and what we need to pass to the putc() function is the ASCII code for the character '4'. But we know that the characters for the digits are all in sequential order - this is required by the standard even if ASCII is not the character set used - so the code for the character '4' is simply whatever the code is for the character '0' plus 4. In the more general case, we can use '0'+i as our argument for the putc() function.
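The counting-by-subtraction step and the conversion from a count to a character can be sketched together. The helper name leading_digit and its pointer parameter are assumptions for illustration only. The sketch also assumes the caller passes a place value m for which the count stays below ten.

```c
/* Hypothetical helper: count how many times m can be subtracted from *n,
 * leave the remainder in *n, and return the character for that count.
 * Adding the count to '0' works because the digit characters are
 * guaranteed to be consecutive. */
char leading_digit(unsigned int *n, unsigned int m)
{
    int i = 0;

    while (*n >= m)
    {
        *n = *n - m;
        i = i + 1;
    }
    return (char)('0' + i);
}
```

Starting with n equal to 45,833 and m equal to 10,000, the call returns '4' and leaves 5,833 in n.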
Now we would want to decrease the number we are subtracting off by a factor of ten from 10,000 to 1,000 and repeat this process. We would continue repeating this process as long as the number we are subtracting off is greater than or equal to one.
The pseudocode for the entire algorithm would therefore look like:
By writing our pseudocode in this manner, it translates very quickly to C source code, even though we have not used any language-specific elements in the pseudocode.
/* Determine how many digits there are */
m = 1;
while ( n/m >= 10 )
{
    m = 10*m;
}
/* Print out the digits one-by-one */
while ( m >= 1 )
{
    i = 0;
    while ( n >= m )
    {
        n = n - m;
        i = i + 1;
    }
    PutC('0' + i);
    m = m / 10;
}
The above code uses three while() loops. This is one of the three iteration constructs available to us in C. The other two, do/while() loops and for() loops, could have been used instead. In fact, any loop written with any of the three constructs can be written with either of the other two. The only reason we have three is because different logical looping goals may fit one construct better than the others. Whenever we have a simple initialization statement, it is best to consider using a for() loop as this ties the initialization directly to the loop construct itself. This links it more directly to the looping logic and makes it so that if we move or copy the loop to a different place we will get the initialization statement as well.
Notice that the first and third while() loops have such initialization statements and they also have simple housekeeping statements that can get absorbed into the for() loop construct as well. Doing this, we have:
/* Determine how many digits there are */
for (m = 1; n/m >= 10; m*=10 )
    /* EMPTY LOOP */;
/* Print out the digits one-by-one */
while ( m >= 1 )
{
    for(i = 0; n >= m; i++ )
        n = n - m;
    PutC('0' + i);
    m = m / 10;
}
We could easily make the remaining while() loop a for() loop as well, but for() loops that do not contain all three elements are a bit harder for most people to read and so are generally avoided.
Notice that, after loading up the control expression for the first for() loop, there is no code left to go in the body. This is perfectly legal, but can cause problems for humans reading the code. Some people never end a looping construct with a semicolon on the same line, as putting one there that doesn't belong is a common mistake. So they get in the habit of looking for and removing any such semicolons. The two alternatives are to put an empty set of curly braces after the for() statement - which is perfectly legal - or put the semicolon on the next line. The latter is what we have chosen to do here, and we have also chosen to add a comment explicitly telling whoever reads the code, including us a week, month, or decade later, that we know that the loop is empty and that it is not a mistake.
The big difference between the while() loop and the do/while() loop is that the former has the potential of not executing the code within the body even a single time while the latter has to execute the code in the body once just to get to the test expression for the first time. This is the only difference - unless the test expression has side effects. If it doesn't have side effects, then if a while loop would execute a total of ten times a do/while() loop will also execute ten times. A lot of people mistakenly believe that a do/while() loop will always execute one more time than an otherwise identical while() loop.
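A small experiment makes the point. The two hypothetical functions below (their names are made up for this sketch) run otherwise identical loops and report how many times the body executed. The test expression has no side effects.

```c
/* Hypothetical helper: count body executions of a while() loop. */
int count_while(int start, int limit)
{
    int i = start;
    int count = 0;

    while (i < limit)
    {
        count++;
        i++;
    }
    return count;
}

/* Hypothetical helper: the same loop written as a do/while() loop. */
int count_do_while(int start, int limit)
{
    int i = start;
    int count = 0;

    do
    {
        count++;
        i++;
    }
    while (i < limit);
    return count;
}
```

With start at 0 and limit at 10, both functions return 10. They differ only when the test is false on entry: with start and limit both at 10, count_while() returns 0 while count_do_while() returns 1.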
In deciding whether to use a while() loop or a do/while() loop, the biggest consideration should be whether the logical task you are trying to implement should always be performed at least once or whether it is okay not to perform it at all under at least some circumstances. In our case, we are printing out an integer and we know that we want to always print at least one digit, even if it is a single '0'.
By examining the code above, it is easy to see that this behavior is guaranteed. The variable m is initialized to 1 and can only get bigger from there in the first loop. The second loop, which contains the actual PutC() call, will execute as long as the value of m is not less than 1. But, instead of having to check that so closely, we could simply make that second loop a do/while() loop because that matches the logic of the algorithm better - we want to guarantee that we print something out and a do/while() loop makes that guarantee glaringly obvious.
/* Determine how many digits there are */
for (m = 1; n/m >= 10; m*=10 )
    /* EMPTY LOOP */;
/* Print out the digits one-by-one */
do
{
    for(i = 0; n >= m; i++ )
        n = n - m;
    PutC('0' + i);
    m = m / 10;
}
while ( m >= 1 );
All that is left is to wrap this into a function and we are done:
void Put_u(unsigned int n)
{
    unsigned int m;
    int i;
    /* Determine how many digits there are */
    for (m = 1; n/m >= 10; m*=10 )
        /* EMPTY LOOP */;
    /* Print out the digits one-by-one */
    do
    {
        for(i = 0; n >= m; i++ )
            n = n - m;
        PutC((char)('0' + i));
        m = m / 10;
    }
    while ( m >= 1 );
}
And we are done.
If we compile this without the (char) cast in the PutC() call, we get a warning about the conversion in the PutC() call possibly losing data. If we look at the putc() function, we find that its first argument is of type int. Since i is of type int, the value passed to PutC() will be of type int, so there shouldn't be a problem. In looking at the code in <stdio.h> we discover that putc() is actually a macro, not a function (which the online help makes quite clear as well) and if we look at the function it calls we see that that function takes an argument of type char. Since the function call in the standard library is going to force an implicit conversion anyway, we could perform an explicit conversion and suppress the warnings by placing a cast operator in our PutC() macro as follows:
#define PutC(c) (putc((char)(c),stdout))
While this is not unreasonable, we might choose to leave our macro alone and instead perform the cast in the PutC() call itself, as shown in the function above. The drawback is that we will have to do this for every PutC() call - or putc() call, for that matter - that we ever make in which we pass it a value of type int. But this is not necessarily a bad thing. While a bit inconvenient, it forces us to evaluate each instance in which we do so in order to make sure that we really do not expect to pass it values that are outside the range of valid ASCII codes. The added inconvenience is a small price to pay if the compiler can now catch just a single instance where we make such a mistake. That is why we have chosen the latter route.
Our Put_u() function is extremely useful, but what if we wanted to print out in number bases besides base ten? How hard would it be to modify our Put_u() function to do this? As long as our number base is ten or less, it is not a problem at all. Notice that everyplace that we do something special because we want the output in base ten we have the value 10 hard-coded in the function. By simply replacing this with a variable that is passed in, we have our new, more powerful function:
void Put_ubase(unsigned int n, int base)
{
    /* NOTE: 2 <= base <= 10 */
    unsigned int m;
    int i;
    /* Determine how many digits there are */
    for (m = 1; n/m >= base; m*=base )
        /* EMPTY LOOP */;
    /* Print out the digits one-by-one */
    do
    {
        for(i = 0; n >= m; i++ )
            n = n - m;
        PutC((char)('0' + i));
        m = m / base;
    }
    while ( m >= 1 );
}
A slight change to the function header and changing "10" to "base" in three places and we are done. At this point we could replace our original Put_u() with a simple function-like macro as follows:
#define DEFAULT_BASE (10) /* Must be 2 through 10 */
#define Put_u(n) (Put_ubase((n), DEFAULT_BASE))
Not keeping both functions has a powerful advantage - if some test case reveals an error in our logic - perhaps we used a ">" operator where we should have used a ">=" operator - we will track it down to the one place that, by fixing it, fixes the problem in both functions. For just this reason, one guiding rule of programming is to avoid performing the same task with multiple pieces of code if it can be done by calling a single piece of code from multiple places.
Notice that we documented the limitations on the value of base both in the function itself and where we declared the default display base. Making such notes prominent is a very valuable practice to get into.
What if we want to create a function called PutH_u() that displays values in hexadecimal? It would be nice to be able to simply do the following:
#define PutH_u(n) (Put_ubase((n), 16))
The only problem is that it doesn't work - at least not right. It produces the correct results as long as none of the alphabetic characters are needed. When that is the case, it prints the wrong characters. The problem is easy to identify and fix - if the value of i that we use in the argument list for PutC() is less than 10, everything is fine. But if it is 10 or more, what we want to do is start printing the alphabetic characters starting with 'A' and so forth. If we can implement this simple rule, our range of number bases immediately jumps from 10 to 36.
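To see the failure concretely, consider what '0'+i produces once i reaches 10. The sketch below uses a hypothetical helper named naive_digit. The specific wrong characters it produces assume an ASCII character set, where the codes following '9' belong to punctuation rather than letters.

```c
/* Hypothetical helper: the digit-to-character rule used so far.
 * It is only correct for 0 <= d <= 9. */
char naive_digit(int d)
{
    return (char)('0' + d);
}
```

Under ASCII, naive_digit(10) yields ':' rather than 'A', and naive_digit(15) yields '?' rather than 'F', so the value 255 would be displayed in base 16 as "??" instead of "FF".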
Instead of adding this capability into our Put_ubase() function directly, let's take a step back and ask if this is perhaps not a capability that we would like to have available at a lower level so that we can use it as a building block in other places. The answer is pretty obviously yes. So let's write a function or macro that performs the following logic:
This is pretty simple logic so we will use a function-like macro to do it:
#define PutD(d) (PutC( (char)(((d)<10)?('0'+(d)):('A'+(d)-10)) ))
We have also embedded the type cast to char in this macro because we know we want to cast values of type int to char, just as we did in the earlier function body, and we know that if the value we pass is bigger than 35, regardless of whether the type is char or int or anything else, we will have problems.
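The 'A'+(d)-10 arithmetic relies on the letters having consecutive codes. That holds for ASCII, but the C standard only guarantees it for the digit characters. As an alternative sketch (not the approach used in this course's code), a lookup table avoids that assumption and also gives a natural place to handle out-of-range values. The name digit_char and the choice of '?' as an error marker are assumptions for this example.

```c
/* Hypothetical helper: map a digit value 0..35 to its display character
 * using a table, so no assumption is made about letter codes being
 * consecutive. Values outside 0..35 map to '?'. */
char digit_char(int d)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    if (d < 0 || d > 35)
    {
        return '?';
    }
    return digits[d];
}
```

A table costs a few dozen bytes of storage, but the mapping can then be changed, for instance to lowercase letters, without touching any logic.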
Collecting everything we need for these functions and macros to work in one place, we have:
#include <stdio.h> /* putc() */
#define PutC(c) (putc((char)(c),stdout))
#define DEFAULT_BASE (10) /* Must be 2 through 36 */
#define PutD(d) (PutC( (char)(((d)<10)?('0'+(d)):('A'+(d)-10)) ))
#define Put_u(n) (Put_ubase((n), DEFAULT_BASE))
#define PutH_u(n) (Put_ubase((n), 16))
void Put_ubase(unsigned int n, int base)
{
    /* NOTE: 2 <= base <= 36 */
    unsigned int m;
    int i;
    /* Determine how many digits there are */
    for (m = 1; n/m >= base; m*=base )
        /* EMPTY LOOP */;
    /* Print out the digits one-by-one */
    do
    {
        for(i = 0; n >= m; i++ )
            n = n - m;
        PutD(i);
        m = m / base;
    }
    while ( m >= 1 );
}
This code, along with the driver program used to test it, can be found in the file int_out.c
We now have a set of very useful and powerful functions and macros that allow us to print out positive integer values very conveniently even though we have arguably only written seven lines of actual executable code. Even counting the blank lines and comments we are still under thirty lines of source code. Furthermore, we are only using a single function from the standard library, have only called it in one place, and all it does is print a single character to the display.
The following is a simple driver program that runs our functions and macros through a wide enough range of values to convince us that they appear to be working fine:
int main(void)
{
    int i;
    for(i = 0; i < 256; i++)
    {
        Put_u(i);
        PutC(' '); PutC(':'); PutC(' '); Put_ubase(i,2);
        PutC(' '); PutC(':'); PutC(' '); Put_ubase(i,8);
        PutC(' '); PutC(':'); PutC(' '); Put_ubase(i,16);
        PutC(' '); PutC(':'); PutC(' '); PutH_u(i);
        PutC(' '); PutC(':'); PutC(' '); Put_ubase(i,36);
        PutC('\n');
    }
    return 0;
}
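Reading 256 lines of output by eye is one way to gain confidence. Another, sketched below under some assumptions, is to let the standard library act as a referee. This is not the course's int_out.c. The hypothetical format_ubase() applies the same algorithm but stores the characters in a buffer instead of printing them, and the hypothetical matches_printf() compares the result with what sprintf() produces for bases 10, 8, and 16.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stdio.h>    /* sprintf() */
#include <string.h>   /* strcmp() */

/* Enough room for every bit of an unsigned int printed in base 2. */
#define MAX_DIGITS (sizeof(unsigned int) * CHAR_BIT)

/* Hypothetical variant of Put_ubase(): same algorithm, but the digits go
 * into buf (which must hold MAX_DIGITS + 1 chars). Requires 2 <= base <= 36.
 * Returns the number of digits written. */
size_t format_ubase(unsigned int n, unsigned int base, char *buf)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    unsigned int m;
    unsigned int i;
    size_t len = 0;

    /* Determine how many digits there are */
    for (m = 1; n / m >= base; m *= base)
        /* EMPTY LOOP */;
    /* Store the digits one-by-one */
    do
    {
        for (i = 0; n >= m; i++)
            n = n - m;
        buf[len++] = digits[i];
        m = m / base;
    }
    while (m >= 1);
    buf[len] = '\0';
    return len;
}

/* Hypothetical check: returns 1 if our output agrees with sprintf()
 * for decimal, octal, and uppercase hexadecimal; otherwise returns 0. */
int matches_printf(unsigned int n)
{
    char ours[MAX_DIGITS + 1];
    char theirs[MAX_DIGITS + 1];

    format_ubase(n, 10u, ours);
    sprintf(theirs, "%u", n);
    if (strcmp(ours, theirs) != 0) return 0;

    format_ubase(n, 8u, ours);
    sprintf(theirs, "%o", n);
    if (strcmp(ours, theirs) != 0) return 0;

    format_ubase(n, 16u, ours);
    sprintf(theirs, "%X", n);
    return strcmp(ours, theirs) == 0;
}
```

A driver could call matches_printf() for every value it cares about and report only the failures, which scales to far more test values than visual inspection does.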