Chapter 3: Data and C
A Sample Program
As before, you'll find some unfamiliar wrinkles that we'll soon iron out for you.
#include <stdio.h>
int main(void) {
    float weight;    //user weight
    float value;     //platinum equivalent
    printf("Are you worth your weight in platinum?\n");
    printf("Let's check it out.\n");
    printf("Please enter your weight in pounds: ");
    scanf("%f", &weight);    //get input from the user
    //assume platinum is $1700 per ounce
    //14.5833 converts pounds avd. to ounces troy
    value = weight * 1700.0 * 14.5833;
    printf("Your weight in platinum is worth $%.2f.\n", value);
    printf("You are easily worth that! If platinum prices drop,\n");
    printf("eat more to maintain your value.\n");
    return 0;
}
Tip Errors and Warnings
If you type this program incorrectly and, say, omit a semicolon, the compiler gives you a syntax error message. Even if you type it correctly, however, the compiler may give you a warning similar to “Warning: conversion from ‘double’ to ‘float’, possible loss of data.” An error message means you did something wrong and prevents the program from being compiled. A warning, however, means you’ve done something that is valid code but possibly is not what you meant to do. A warning does not stop compilation. This particular warning pertains to how C handles values such as 1700.0. It’s not a problem for this example, and the chapter explains the warning later.
When you type this program, you might want to change the 1700.0 to the current price of the precious metal platinum. Don’t, however, fiddle with the 14.5833, which represents the number of ounces in a pound. (That’s ounces troy, used for precious metals, and pounds avoirdupois, used for people, precious and otherwise.)
Note that “entering” your weight means to type your weight and then press the Enter or Return key. (Don’t just type your weight and wait.) Pressing Enter informs the computer that you have finished typing your response. The program expects you to enter a number, such as 156, not words, such as too much. Entering letters rather than digits causes problems that require an if statement (Chapter 7, “C Control Statements: Branching and Jumps”) to defeat, so please be polite and enter a number. Here is some sample output:
Are you worth your weight in platinum?
Let's check it out.
Please enter your weight in pounds: 156
Your weight in platinum is worth $3867491.25.
You are easily worth that! If platinum prices drop,
eat more to maintain your value.
Program Adjustments
Did the output for this program briefly flash onscreen and then disappear even though you added the following line to the program, as described in Chapter 2, “Introducing C”?
getchar();
For this example, you need to use that function call twice:
getchar(); getchar();
The getchar() function reads the next input character, so the program has to wait for input. In this case, we provided input by typing 156 and then pressing the Enter (or Return) key, which transmits a newline character. So scanf() reads the number, the first getchar() reads the newline character, and the second getchar() causes the program to pause, awaiting further input.
What’s New in This Program?
■ Perhaps the most outstanding new feature is that this program is interactive. The computer asks you for information and then uses the number you enter. An interactive program is more interesting to use than the noninteractive types. More important, the interactive approach makes programs more flexible. For example, the sample program can be used for any reasonable weight, not just for 156 pounds. You don’t have to rewrite the program every time you want to try it on a new person. The scanf() and printf() functions make this interactivity possible. The scanf() function reads data from the keyboard and delivers that data to the program, and printf() reads data from a program and delivers that data to your screen. Together, these two functions enable you to establish a two-way communication with your computer (see Figure 3.1), and that makes using a computer much more fun.
Data Variables and Constants
A computer, under the guidance of a program, can do many things. It can add numbers, sort names, command the obedience of a speaker or video screen, calculate cometary orbits, prepare a mailing list, dial phone numbers, draw stick figures, draw conclusions, or do anything else your imagination can create. To do these tasks, the program needs to work with data, the numbers and characters that bear the information you use. Some types of data are preset before a program is used and keep their values unchanged throughout the life of the program. These are constants. Other types of data may change or be assigned values as the program runs; these are variables. In the sample program, weight is a variable and 14.5833 is a constant. What about 1700.0? True, the price of platinum isn’t a constant in real life, but this program treats it as a constant. The difference between a variable and a constant is that a variable can have its value assigned or changed while the program is running, and a constant can’t.
Data: Data-Type Keywords
The int keyword provides the basic class of integers used in C. The next three keywords (long, short, and unsigned) and the C90 addition signed are used to provide variations of the basic type, for example, unsigned short int and long long int. Next, the char keyword designates the type used for letters of the alphabet and for other characters, such as #, $, %, and *. The char type also can be used to represent small integers. Next, float, double, and the combination long double are used to represent numbers with decimal points. The _Bool type is for Boolean values (true and false), and _Complex and _Imaginary represent complex and imaginary numbers, respectively.
The types created with these keywords can be divided into two families on the basis of how they are stored in the computer: integer types and floating-point types.
Bits, Bytes, and Words
The terms bit , byte , and word can be used to describe units of computer data or to describe units of computer memory. We’ll concentrate on the second usage here.
The smallest unit of memory is called a bit . It can hold one of two values: 0 or 1 . (Or you can say that the bit is set to “off” or “on.”) You can’t store much information in one bit, but a computer has a tremendous stock of them. The bit is the basic building block of computer memory.
The byte is the usual unit of computer memory. For nearly all machines, a byte is 8 bits, and that is the standard definition, at least when used to measure storage. (The C language, however, has a different definition, as discussed in the “Using Characters: Type char” section later in this chapter.) Because each bit can be either 0 or 1, there are 256 (that’s 2 raised to the 8th power) possible bit patterns of 0s and 1s that can fit in an 8-bit byte. These patterns can be used, for example, to represent the integers from 0 to 255 or to represent a set of characters. Representation can be accomplished with binary code, which uses (conveniently enough) just 0s and 1s to represent numbers. (Chapter 15, “Bit Fiddling,” discusses binary code, but you can read through the introductory material of that chapter now if you like.)
A word is the natural unit of memory for a given computer design. For 8-bit microcomputers, such as the original Apples, a word is just 8 bits. Since then, personal computers moved up to 16-bit words, 32-bit words, and, at the present, 64-bit words. Larger word sizes enable faster transfer of data and allow more memory to be accessed.
Integer Versus Floating-Point Types
For a human, the difference between integers and floating-point numbers is reflected in the way they can be written. For a computer, the difference is reflected in the way they are stored.
The Integer
An integer is a number with no fractional part. In C, an integer is never written with a decimal point. Examples are 2, –23, and 2456. Numbers such as 3.14, 0.22, and 2.000 are not integers. Integers are stored as binary numbers. The integer 7, for example, is written 111 in binary. Therefore, to store this number in an 8-bit byte, just set the first 5 bits to 0 and the last 3 bits to 1.
The Floating-Point Number
A floating-point number more or less corresponds to what mathematicians call a real number . Real numbers include the numbers between the integers. Some floating-point numbers are 2.75, 3.16E7, 7.00, and 2e–8. Notice that adding a decimal point makes a value a floating-point value. So 7 is an integer type but 7.00 is a floating-point type. Obviously, there is more than one way to write a floating-point number. We will discuss the e-notation more fully later, but, in brief, the notation 3.16E7 means to multiply 3.16 by 10 to the 7th power; that is, by 1 followed by 7 zeros. The 7 would be termed the exponent of 10.
The key point here is that the scheme used to store a floating-point number is different from the one used to store an integer. Floating-point representation involves breaking up a number into a fractional part and an exponent part and storing the parts separately. Therefore, the 7.00 in this list would not be stored in the same manner as the integer 7, even though both have the same value. The decimal analogy would be to write 7.0 as 0.7E1. Here, 0.7 is the fractional part, and the 1 is the exponent part. Figure 3.3 shows another example of floating-point storage. A computer, of course, would use binary numbers and powers of two instead of powers of 10 for internal storage. You’ll find more on this topic in Chapter 15. Now, let’s concentrate on the practical differences:
■ An integer has no fractional part; a floating-point number can have a fractional part.
■ Floating-point numbers can represent a much larger range of values than integers can. See Table 3.3 near the end of this chapter.
■ For some arithmetic operations, such as subtracting one large number from another, floating-point numbers are subject to greater loss of precision.
■ Because there is an infinite number of real numbers in any range (for example, in the range between 1.0 and 2.0), computer floating-point numbers can’t represent all the values in the range. Instead, floating-point values are often approximations of a true value. For example, 7.0 might be stored as a 6.99999 float value. More about precision later.
■ Floating-point operations were once much slower than integer operations. However, today many CPUs incorporate floating-point processors that close the gap.
Basic C Data Types
The int Type
The int type is a signed integer. That means it must be an integer and it can be positive, negative, or zero. The range in possible values depends on the computer system. Typically, an int uses one machine word for storage. Therefore, older IBM PC compatibles, which have a 16-bit word, use 16 bits to store an int . This allows a range in values from –32768 to 32767 . Current personal computers typically have 32-bit integers and fit an int to that size. Now the personal computer industry is moving toward 64-bit processors that naturally will use even larger integers. ISO C specifies that the minimum range for type int should be from –32767 to 32767 . Typically, systems represent signed integers by using the value of a particular bit to indicate the sign.
Declaring an int Variable
The keyword int is used to declare the basic integer variable. First comes int , and then the chosen name of the variable, and then a semicolon. To declare more than one variable, you can declare each variable separately, or you can follow the int with a list of names in which each name is separated from the next by a comma. The following are valid declarations:
int erns;
int hogs, cows, goats;
You could have used a separate declaration for each variable, or you could have declared all four variables in the same statement. The effect is the same: Associate names and arrange storage space for four int-sized variables.
These declarations create variables but don’t supply values for them. How do variables get values? You’ve seen two ways that they can pick up values in the program. First, there is assignment:
cows = 112;
Second, a variable can pick up a value from a function—from scanf() , for example. Now let’s look at a third way.
Initializing a Variable
To initialize a variable means to assign it a starting, or initial , value. In C, this can be done as part of the declaration. Just follow the variable name with the assignment operator ( = ) and the value you want the variable to have. Here are some examples:
int hogs = 21;
int cows = 32, goats = 14;
int dogs, cats = 94; /* valid, but poor, form */
In the last line, only cats is initialized. A quick reading might lead you to think that dogs is also initialized to 94 , so it is best to avoid putting initialized and noninitialized variables in the same declaration statement.
In short, these declarations create and label the storage for the variables and assign starting values to each.
Printing int Values
You can use the printf() function to print int types. The %d notation is used to indicate just where in a line the integer is to be printed. The %d is called a format specifier because it indicates the form that printf() uses to display a value. Each %d in the format string must be matched by a corresponding int value in the list of items to be printed. That value can be an int variable, an int constant, or any other expression having an int value. It’s your job to make sure the number of format specifiers matches the number of values; the compiler won’t catch mistakes of that kind.
#include <stdio.h>
int main(void) {
    int ten = 10;
    int two = 2;
    printf("Doing it right: ");
    printf("%d minus %d is %d.\n", ten, 2, ten - two);
    printf("Doing it wrong: ");
    printf("%d minus %d is %d.\n", ten); //forgot 2 arguments
    return 0;
}
Compiling and running the program produced this output on one system:
Doing it right: 10 minus 2 is 8.
Doing it wrong: 10 minus 16 is 1650287143.
For the first line of output, the first %d represents the int variable ten , the second %d represents the int constant 2 , and the third %d represents the value of the int expression ten - two . The second time, however, the program used ten to provide a value for the first %d and used whatever values happened to be lying around in memory for the next two! (The numbers you get could very well be different from those shown here. Not only might the memory contents be different, but different compilers will manage memory locations differently.)
You might be annoyed that the compiler doesn’t catch such an obvious error. Blame the unusual design of printf(). Most functions take a specific number of arguments, and the compiler can check to see whether you’ve used the correct number. However, printf() can have one, two, three, or more arguments, and that keeps the compiler from using its usual methods for error checking. Some compilers, however, will use unusual methods of checking and warn you that you might be doing something wrong. Still, it’s best to remember to always check to see that the number of format specifiers you give to printf() matches the number of values to be displayed.
Octal and Hexadecimal
Normally, C assumes that integer constants are decimal, or base 10, numbers. However, octal (base 8) and hexadecimal (base 16) numbers are popular with many programmers. Because 8 and 16 are powers of 2, and 10 is not, these number systems occasionally offer a more convenient way for expressing computer-related values. For example, the number 65536, which often pops up in 16-bit machines, is just 10000 in hexadecimal. Also, each digit in a hexadecimal number corresponds to exactly 4 bits. For example, the hexadecimal digit 3 is 0011 and the hexadecimal digit 5 is 0101. So the hexadecimal value 35 is the bit pattern 0011 0101, and the hexadecimal value 53 is 0101 0011. This correspondence makes it easy to go back and forth between hexadecimal and binary (base 2) notation. But how can the computer tell whether 10000 is meant to be a decimal, hexadecimal, or octal value? In C, special prefixes indicate which number base you are using. A prefix of 0x or 0X (zero-ex) means that you are specifying a hexadecimal value, so 16 is written as 0x10 , or 0X10 , in hexadecimal. Similarly, a 0 (zero) prefix means that you are writing in octal. For example, the decimal value 16 is written as 020 in octal.
Displaying Octal and Hexadecimal
Just as C enables you to write a number in any one of three number systems, it also enables you to display a number in any of these three systems. To display an integer in octal notation instead of decimal, use %o instead of %d. To display an integer in hexadecimal, use %x. If you want to display the C prefixes, you can use specifiers %#o, %#x, and %#X to generate the 0, 0x, and 0X prefixes respectively. Listing 3.3 shows a short example. (Recall that you may have to insert a getchar(); statement in the code for some IDEs to keep the program execution window from closing immediately.)
#include <stdio.h>
int main(void) {
    int x = 100;
    printf("dec = %d; octal = %o; hex = %x\n", x, x, x);
    printf("dec = %d; octal = %#o; hex = %#x\n", x, x, x);
    return 0;
}
Compiling and running the program produces this output:
dec = 100; octal = 144; hex = 64
dec = 100; octal = 0144; hex = 0x64
You see the same value displayed in three different number systems. The printf() function makes the conversions. Note that the 0 and the 0x prefixes are not displayed in the output unless you include the # as part of the specifier.
Other Integer Types
C offers three adjective keywords to modify the basic integer type: short , long , and unsigned . Here are some points to keep in mind:
■ The type short int or, more briefly, short may use less storage than int, thus saving space when only small numbers are needed. Like int , short is a signed type.
■ The type long int, or long, may use more storage than int, thus enabling you to express larger integer values. Like int , long is a signed type.
■ The type long long int, or long long (introduced in the C99 standard), may use more storage than long. It must use at least 64 bits. Like int, long long is a signed type.
■ The type unsigned int , or unsigned , is used for variables that have only nonnegative values. This type shifts the range of numbers that can be stored. For example, a 16-bit unsigned int allows a range from 0 to 65535 in value instead of from –32768 to 32767 . The bit used to indicate the sign of signed numbers now becomes another binary digit, allowing the larger number.
■ The types unsigned long int , or unsigned long , and unsigned short int , or unsigned short , are recognized as valid by the C90 standard. To this list, C99 adds unsigned long long int , or unsigned long long .
■ The keyword signed can be used with any of the signed types to make your intent explicit. For example, short , short int , signed short , and signed short int are all names for the same type.
Declaring Other Integer Types
Other integer types are declared in the same manner as the int type. The following list shows several examples. Not all older C compilers recognize the last three, and the final example is new with the C99 standard.
long int estine;
long johns;
short int erns;
short ribs;
unsigned int s_count;
unsigned players;
unsigned long headcount;
unsigned short yesvotes;
long long ago;
Why Multiple Integer Types?
Why do we say that long and short types “may” use more or less storage than int ? Because C guarantees only that short is no longer than int and that long is no shorter than int. The idea is to fit the types to the machine. For example, in the days of Windows 3, an int and a short were both 16 bits, and a long was 32 bits. Later, Windows and Apple systems moved to using 16 bits for short and 32 bits for int and long . Using 32 bits allows integers in excess of 2 billion. Now that 64-bit processors are common, there’s a need for 64-bit integers, and that’s the motivation for the long long type.
The most common practice today on personal computers is to set up long long as 64 bits, long as 32 bits, short as 16 bits, and int as either 16 bits or 32 bits, depending on the machine’s natural word size. In principle, these four types could represent four distinct sizes, but in practice at least some of the types normally overlap.
The C standard provides guidelines specifying the minimum allowable size for each basic data type. The minimum range for both short and int is –32,767 to 32,767, corresponding to a 16-bit unit, and the minimum range for long is –2,147,483,647 to 2,147,483,647, corresponding to a 32-bit unit. (Note: For legibility, we’ve used commas, but C code doesn’t allow that option.) For unsigned short and unsigned int , the minimum range is 0 to 65,535, and for unsigned long , the minimum range is 0 to 4,294,967,295. The long long type is intended to support 64-bit needs. Its minimum range is a substantial –9,223,372,036,854,775,807 to 9,223,372,036,854,775,807, and the minimum range for unsigned long long is 0 to 18,446,744,073,709,551,615. For those of you writing checks, that’s eighteen quintillion, four hundred and forty-six quadrillion, seven hundred forty-four trillion, seventy-three billion, seven hundred nine million, five hundred fifty-one thousand, six hundred fifteen using U.S. nomenclature (the short scale or échelle courte system), but who’s counting?
When do you use the various int types? First, consider unsigned types. It is natural to use them for counting because you don’t need negative numbers, and the unsigned types enable you to reach higher positive numbers than the signed types.
Use the long type if you need to use numbers that long can handle and that int cannot. However, on systems for which long is bigger than int , using long can slow down calculations, so don’t use long if it is not essential. One further point: If you are writing code on a machine for which int and long are the same size, and you do need 32-bit integers, you should use long instead of int so that the program will function correctly if transferred to a 16-bit machine. Similarly, use long long if you need 64-bit integer values.
Use short to save storage space if, say, you need a 16-bit value on a system where int is 32-bit. Usually, saving storage space is important only if your program uses arrays of integers that are large in relation to a system’s available memory. Another reason to use short is that it may correspond in size to hardware registers used by particular components in a computer.
Integer Overflow
What happens if an integer tries to get too big for its type? Let’s set an integer to its largest possible value, add to it, and see what happens. Try both signed and unsigned types. (The printf() function uses the %u specifier to display unsigned int values.)
#include <stdio.h>
int main(void) {
    int i = 2147483647;
    unsigned int j = 4294967295;
    printf("%d %d %d\n", i, i + 1, i + 2);
    printf("%u %u %u\n", j, j + 1, j + 2);
    return 0;
}
Here is the result for our system:
2147483647 -2147483648 -2147483647
4294967295 0 1
The unsigned integer j is acting like a car’s odometer. When it reaches its maximum value, it starts over at the beginning. The integer i acts similarly. The main difference is that the unsigned int variable j, like an odometer, begins at 0, but the int variable i begins at -2147483648. Notice that you are not informed that i has exceeded (overflowed) its maximum value. You would have to include your own programming to keep tabs on that.
The behavior described here is mandated by the rules of C for unsigned types. The standard doesn’t define how signed types should behave. The behavior shown here is typical, but you could encounter something different.
long Constants and long long Constants
Normally, when you use a number such as 2345 in your program code, it is stored as an int type. What if you use a number such as 1000000 on a system in which int will not hold such a large number? Then the compiler treats it as a long int , assuming that type is large enough. If the number is larger than the long maximum, C treats it as unsigned long . If that is still insufficient, C treats the value as long long or unsigned long long , if those types are available.
Octal and hexadecimal constants are treated as type int unless the value is too large. Then the compiler tries unsigned int . If that doesn’t work, it tries, in order, long , unsigned long , long long , and unsigned long long .
Sometimes you might want the compiler to store a small number as a long integer. Programming that involves explicit use of memory addresses on an IBM PC, for instance, can create such a need. Also, some standard C functions require type long values. To cause a small constant to be treated as type long , you can append an l (lowercase L ) or L as a suffix. The second form is better because it looks less like the digit 1. Therefore, a system with a 16-bit int and a 32-bit long treats the integer 7 as 16 bits and the integer 7L as 32 bits. The l and L suffixes can also be used with octal and hex integers, as in 020L and 0x10L .
Similarly, on those systems supporting the long long type, you can use an ll or LL suffix to indicate a long long value, as in 3LL . Add a u or U to the suffix for unsigned long long , as in 5ull or 10LLU or 6LLU or 9Ull .
Printing short, long, long long, and unsigned Types
To print an unsigned int number, use the %u notation. To print a long value, use the %ld format specifier. If int and long are the same size on your system, just %d will suffice, but your program will not work properly when transferred to a system on which the two types are different, so use the %ld specifier for long . You can use the l prefix for x and o , too. So you would use %lx to print a long integer in hexadecimal format and %lo to print in octal format. Note that although C allows both uppercase and lowercase letters for constant suffixes, these format specifiers use just lowercase.
C has several additional printf() formats. First, you can use an h prefix for short types. Therefore, %hd displays a short integer in decimal form, and %ho displays a short integer in octal form. Both the h and l prefixes can be used with u for unsigned types. For instance, you would use the %lu notation for printing unsigned long types. Listing 3.4 provides an example. Systems supporting the long long types use %lld and %llu for the signed and unsigned versions.
#include <stdio.h>
int main(void) {
    unsigned int un = 3000000000; //system with 32-bit int and 16-bit short
    short end = 200;
    long big = 65537;
    long long verybig = 12345678908642;
    printf("un = %u and not %d\n", un, un);
    printf("end = %hd and %d\n", end, end);
    printf("big = %ld and not %hd\n", big, big);
    printf("verybig = %lld and not %ld\n", verybig, verybig);
    return 0;
}
Here is the output on one system (results can vary):
un = 3000000000 and not -1294967296
end = 200 and 200
big = 65537 and not 1
verybig = 12345678908642 and not 1942899938
This example points out that using the wrong specification can produce unexpected results. First, note that using the %d specifier for the unsigned variable un produces a negative number! The reason for this is that the unsigned value 3000000000 and the signed value -1294967296 have exactly the same internal representation in memory on our system. So if you tell printf() that the number is unsigned, it prints one value, and if you tell it that the same number is signed, it prints the other value. This behavior shows up with values larger than the maximum signed value. Smaller positive values, such as 96, are stored and displayed the same for both signed and unsigned types.
Next, note that the short variable end is displayed the same whether you tell printf() that end is a short (the %hd specifier) or an int (the %d specifier). That’s because C automatically expands a type short value to a type int value when it’s passed as an argument to a function. This may raise two questions in your mind: Why does this conversion take place, and what’s the use of the h modifier? The answer to the first question is that the int type is intended to be the integer size that the computer handles most efficiently. So, on a computer for which short and int are different sizes, it may be faster to pass the value as an int . The answer to the second question is that you can use the h modifier to show how a longer integer would look if truncated to the size of short . The third line of output illustrates this point. The value 65537 expressed in binary format as a 32-bit number is 00000000000000010000000000000001. Using the %hd specifier persuaded printf() to look at just the last 16 bits; therefore, it displayed the value as 1. Similarly, the final output line shows the full value of verybig and then the value stored in the last 32 bits, as viewed through the %ld specifier.
Earlier you saw that it is your responsibility to make sure the number of specifiers matches the number of values to be displayed. Here you see that it is also your responsibility to use the correct specifier for the type of value to be displayed.
Match the Type printf() Specifiers
Remember to check to see that you have one format specifier for each value being displayed in a printf() statement. And also check that the type of each format specifier matches the type of the corresponding display value.
Using Characters: Type char
The char type is used for storing characters such as letters and punctuation marks, but technically it is an integer type. Why? Because the char type actually stores integers, not characters. To handle characters, the computer uses a numerical code in which certain integers represent certain characters. The most commonly used code in the U.S. is the ASCII code given in the table on the inside front cover. It is the code this book assumes. In it, for example, the integer value 65 represents an uppercase A . So to store the letter A , you actually need to store the integer 65 . (Many IBM mainframes use a different code, called EBCDIC, but the principle is the same. Computer systems outside the U.S. may use entirely different codes.)
The standard ASCII code runs numerically from 0 to 127. This range is small enough that 7 bits can hold it. The char type is typically defined as an 8-bit unit of memory, so it is more than large enough to encompass the standard ASCII code. Many systems, such as the IBM PC and the Apple Macs, offer extended ASCII codes (different for the two systems) that still stay within an 8-bit limit. More generally, C guarantees that the char type is large enough to store the basic character set for the system on which C is implemented.
Many character sets have many more than 127 or even 255 values. For example, there is the Japanese kanji character set. The commercial Unicode initiative has created a system to represent a variety of character sets worldwide and currently has over 110,000 characters. The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) have developed a standard called ISO/IEC 10646 for character sets. Fortunately, the Unicode standard has been kept compatible with the more extensive ISO/IEC 10646 standard.
The C language defines a byte to be the number of bits used by type char , so one can have a system with a 16-bit or 32-bit byte and char type.
Declaring Type char Variables
As you might expect, char variables are declared in the same manner as other variables. Here are some examples:
char response;
char itable, latan;
This code would create three char variables: response , itable , and latan .
Character Constants and Initialization
Suppose you want to initialize a character variable to the letter A. Computer languages are supposed to make things easy, so you shouldn’t have to memorize the ASCII code, and you don’t. You can assign the character A to grade with the following initialization:
char grade = 'A';
A single character contained between single quotes is a C character constant . When the compiler sees 'A' , it converts the 'A' to the proper code value. The single quotes are essential. Here’s another example:
char broiled; //declare a variable
broiled = 'T'; //OK
broiled = T; //NO! Thinks T is a variable
broiled = "T"; //NO! Thinks "T" is a string
If you omit the quotes, the compiler thinks that T is the name of a variable. If you use double quotes, it thinks you are using a string.
Because characters are really stored as numeric values, you can also use the numerical code to assign values:
char grade = 65; /* ok for ASCII, but poor style */
In this example, 65 is type int , but, because the value is smaller than the maximum char size, it can be assigned to grade without any problems. Because 65 is the ASCII code for the letter A , this example assigns the value A to grade. Note, however, that this example assumes that the system is using ASCII code. Using 'A' instead of 65 produces code that works on any system. Therefore, it’s much better to use character constants than numeric code values.
Somewhat oddly, C treats character constants as type int rather than type char . For example, on an ASCII system with a 32-bit int and an 8-bit char , the code
char grade = 'B';
represents 'B' as the numerical value 66 stored in a 32-bit unit, but grade winds up with 66 stored in an 8-bit unit. This characteristic of character constants makes it possible to define a character constant such as 'FATE' , with four separate 8-bit ASCII codes stored in a 32-bit unit. However, attempting to assign such a character constant to a char variable results in only the last 8 bits being used, so the variable gets the value 'E' .
Nonprinting Characters
The single-quote technique is fine for characters, digits, and punctuation marks, but if you look through the table on the inside front cover of this book, you’ll see that some of the ASCII characters are nonprinting. For example, some represent actions such as backspacing or going to the next line or making the terminal bell ring (or speaker beep). How can these be represented? C offers three ways.
The first way we have already mentioned—just use the ASCII code. For example, the ASCII value for the beep character is 7, so you can do this:
char beep = 7;
The second way to represent certain awkward characters in C is to use special symbol sequences. These are called escape sequences .
Escape sequences must be enclosed in single quotes when assigned to a character variable. For example, you could make the statement
char nerf = '\n';
and then print the variable nerf to advance the printer or screen one line.
Now take a closer look at what each escape sequence does. The alert character ( \a ), added by C90, produces an audible or visible alert. The nature of the alert depends on the hardware, with the beep being the most common. (With some systems, the alert character has no effect.) The C standard states that the alert character shall not change the active position. By active position , the standard means the location on the display device (screen, teletype, printer, and so on) at which the next character would otherwise appear. In short, the active position is a generalization of the screen cursor to which you are probably accustomed. Using the alert character in a program displayed on a screen should produce a beep without moving the screen cursor.
Next, the \b , \f , \n , \r , \t , and \v escape sequences are common output device control characters. They are best described in terms of how they affect the active position. A backspace ( \b ) moves the active position back one space on the current line. A form feed character ( \f ) advances the active position to the start of the next page. A newline character ( \n ) sets the active position to the beginning of the next line. A carriage return ( \r ) moves the active position to the beginning of the current line. A horizontal tab character ( \t ) moves the active position to the next horizontal tab stop (typically, these are found at character positions 1, 9, 17, 25, and so on). A vertical tab ( \v ) moves the active position to the next vertical tab position.
These escape sequence characters do not necessarily work with all display devices. For example, the form feed and vertical tab characters produce odd symbols on a PC screen instead of any cursor movement, but they work as described if sent to a printer instead of to the screen.
The next three escape sequences ( \\ , \' , and \" ) enable you to use \ , ' , and " as character constants. (Because these symbols are used to define character constants as part of a printf() command, the situation could get confusing if you use them literally.) Suppose you want to print the following line:
Gramps sez, "a \ is a backslash."
Then use this code:
printf("Gramps sez, \"a \\ is a backslash.\"\n");
The final two forms ( \0oo and \xhh ) are special representations of the ASCII code. To represent a character by its octal ASCII code, precede it with a backslash ( \ ) and enclose the whole thing in single quotes. For example, if your compiler doesn’t recognize the alert character ( \a ), you could use the ASCII code instead:
beep = '\007';
You can omit the leading zeros, so '\07' or even '\7' will do. This notation causes numbers to be interpreted as octal, even if there is no initial 0 .
Beginning with C90, C provides a third option: using a hexadecimal form for character constants. In this case, the backslash is followed by a lowercase x and one or more hexadecimal digits. For example, the Ctrl+P character has an ASCII hex code of 10 (16, in decimal), so it can be expressed as '\x10' or '\x010' . Figure 3.5 shows some representative integer types.
When you use ASCII code, note the difference between numbers and number characters. For example, the character 4 is represented by ASCII code value 52. The notation '4' represents the symbol 4, not the numerical value 4.
At this point, you may have three questions:
- Why aren’t the escape sequences enclosed in single quotes in the last example ( printf("Gramps sez, \"a \\ is a backslash.\"\n"); )? When a character, be it an escape sequence or not, is part of a string of characters enclosed in double quotes, don’t enclose it in single quotes. Notice that none of the other characters in this example ( G , r , a , m , p , s , and so on) are marked off by single quotes. A string of characters enclosed in double quotes is called a character string . (Chapter 4 explores strings.) Similarly, printf("Hello!\007\n"); will print Hello! and beep, but printf("Hello!7\n"); will print Hello!7 . Digits that are not part of an escape sequence are treated as ordinary characters to be printed.
- When should I use the ASCII code, and when should I use the escape sequences? If you have a choice between using one of the special escape sequences, say '\f' , or an equivalent ASCII code, say '\014' , use the '\f' . First, the representation is more mnemonic. Second, it is more portable. If you have a system that doesn’t use ASCII code, the '\f' will still work.
- If I need to use numeric code, why use, say, '\032' instead of 032 ? First, using '\032' instead of 032 makes it clear to someone reading the code that you intend to represent a character code. Second, an escape sequence such as \032 can be embedded in part of a C string, the way \007 was in the first point.
Printing Characters
The printf() function uses %c to indicate that a character should be printed. Recall that a character variable is stored as a 1-byte integer value. Therefore, if you print the value of a char variable with the usual %d specifier, you get an integer. The %c format specifier tells printf() to display the character that has that integer as its code value.
#include <stdio.h>
int main(void) {
    char ch;
    printf("Please enter a character.\n");
    scanf("%c", &ch);
    printf("The code for %c is %d.\n", ch, ch);
    return 0;
}
Here is a sample run:
Please enter a character.
C
The code for C is 67.
When you use the program, remember to press the Enter or Return key after typing the character. The scanf() function then fetches the character you typed, and the ampersand ( & ) causes the character to be assigned to the variable ch . The printf() function then prints the value of ch twice, first as a character (prompted by the %c code) and then as a decimal integer (prompted by the %d code). Note that the printf() specifiers determine how data is displayed, not how it is stored.
Signed or Unsigned?
Some C implementations make char a signed type. This means a char can hold values typically in the range –128 through 127. Other implementations make char an unsigned type, which provides a range of 0 through 255. Your compiler manual should tell you which type your char is, or you can check the limits.h header file, discussed in the next chapter.
As of C90, C enabled you to use the keywords signed and unsigned with char . Then, regardless of what your default char is, signed char would be signed, and unsigned char would be unsigned. These versions of char are useful if you’re using the type to handle small integers. For character use, just use the standard char type without modifiers.
The _Bool Type
The _Bool type is a C99 addition that’s used to represent Boolean values—that is, the logical values true and false . Because C uses the value 1 for true and 0 for false , the _Bool type really is just an integer type, but one that, in principle, only requires 1 bit of memory, because that is enough to cover the full range from 0 to 1.
Programs use Boolean values to choose which code to execute next.
Portable Types: stdint.h and inttypes.h
By now you’ve probably noticed that C offers a wide variety of integer types, which is a good thing. And you probably also have noticed that the same type name doesn’t necessarily mean the same thing on different systems, which is not such a good thing. It would be nice if C had types that had the same meaning regardless of the system. And, as of C99, it does—sort of.
What C has done is create more names for the existing types. The trick is to define these new names in a header file called stdint.h . For example, int32_t represents the type for a 32-bit signed integer. The header file on a system that uses a 32-bit int could define int32_t as an alias for int . A different system, one with a 16-bit int and a 32-bit long , could define the same name, int32_t , as an alias for long . Then, when you write a program using int32_t as a type and include the stdint.h header file, the compiler will substitute int or long for the type in a manner appropriate for your particular system.
The alternative names we just discussed are examples of exact-width integer types ; int32_t is exactly 32 bits, no less or no more. It’s possible the underlying system might not support these choices, so the exact-width integer types are optional.
What if a system can’t support exact-width types? C99 and C11 provide a second category of alternative names that are required. This set of names promises the type is at least big enough to meet the specification and that no other type that can do the job is smaller. These types are called minimum width types . For example, int_least8_t will be an alias for the smallest available type that can hold an 8-bit signed integer value. If the smallest type on a particular system were 16 bits, the int8_t type would not be defined. However, the int_least8_t type would be available, perhaps implemented as a 16-bit integer.
Of course, some programmers are more concerned with speed than with space. For them, C99 and C11 define a set of types that will allow the fastest computations. These are called the fastest minimum width types. For example, the int_fast8_t will be defined as an alternative name for the integer type on your system that allows the fastest calculations for 8-bit signed values.
Finally, for some programmers, only the biggest possible integer type on a system will do; intmax_t stands for that type, a type that can hold any valid signed integer value. Similarly, uintmax_t stands for the largest available unsigned type. Incidentally, these types could be bigger than long long and unsigned long because C implementations are permitted to define types beyond the required ones. Some compilers, for example, introduced the long long type before it became part of the standard.
C99 and C11 not only provide these new, portable type names, they also provide assistance with input and output. For example, printf() requires specific specifiers for particular types. So what do you do to display an int32_t value when it might require a %d specifier for one definition and a %ld for another? The current standard provides some string macros (a mechanism introduced in Chapter 4 ) to be used to display the portable types. For example, the inttypes.h header file will define PRId32 as a string representing the appropriate specifier ( d or ld , for instance) for a 32-bit signed value. Listing 3.6 shows a brief example illustrating how to use a portable type and its associated specifier. The inttypes.h header file includes stdint.h , so the program only needs to include inttypes.h.
#include <stdio.h>
#include <inttypes.h>
int main(void) {
    int32_t me32; //me32 a 32-bit signed variable
    me32 = 45933945;
    printf("First, assume int32_t is int: ");
    printf("me32 = %d\n", me32);
    printf("Next, let's not make any assumptions.\n");
    printf("Instead, use a \"macro\" from inttypes.h: ");
    printf("me32 = %" PRId32 "\n", me32);
    return 0;
}
In the final printf() call, the PRId32 is replaced by its inttypes.h definition of "d" , making the line this:
printf("me32 = %" "d" "\n", me32);
But C combines consecutive quoted strings into a single quoted string, making the line this:
printf("me32 = %d\n", me32);
Here’s the output; note that the example also uses the \" escape sequence to display double quotation marks:
First, assume int32_t is int: me32 = 45933945
Next, let's not make any assumptions.
Instead, use a "macro" from inttypes.h: me32 = 45933945
It’s not the purpose of this section to teach you all about expanded integer types. Rather, its main intent is to reassure you that this level of control over types is available if you need it. Reference Section VI, “Extended Integer Types,” in Appendix B provides a complete rundown of the inttypes.h and stdint.h header files.
Types float , double , and long double
The various integer types serve well for most software development projects. However, financial and mathematically oriented programs often make use of floating-point numbers. In C, such numbers are called type float , double , or long double . They correspond to the real types of FORTRAN and Pascal. The floating-point approach, as already mentioned, enables you to represent a much greater range of numbers, including decimal fractions. Floating-point number representation is similar to scientific notation, a system used by scientists to express very large and very small numbers. Let’s take a look.
In scientific notation, numbers are represented as decimal numbers times powers of 10. Here are some examples.
The first column shows the usual notation, the second column scientific notation, and the third column exponential notation, or e-notation, which is the way scientific notation is usually written for and by computers, with the e followed by the power of 10. Figure 3.7 shows more floating-point representations.
The C standard provides that a float has to be able to represent at least six significant figures and allow a range of at least 10^-37 to 10^+37 . The first requirement means, for example, that a float has to represent accurately at least the first six digits in a number such as 33.333333. The second requirement is handy if you like to use numbers such as the mass of the sun (2.0e30 kilograms), the charge of a proton (1.6e–19 coulombs), or the national debt. Often, systems use 32 bits to store a floating-point number. Eight bits are used to give the exponent its value and sign, and 24 bits are used to represent the nonexponent part, called the mantissa or significand , and its sign.