Types and Number Representation

Note: The contents of this page were edited from the original for use in CS 31.

C Standards

Although it is a slight digression, it is important to understand a bit of the history of the C language.

C is the lingua franca of the systems programming world. Almost every operating system in common use, along with its associated system libraries, is written largely in C, and nearly every system provides a C compiler. To stop the language diverging across these systems, each of which would otherwise be sure to make numerous incompatible changes, a strict standard has been written for the language.

Officially this standard is known as ISO/IEC 9899:1999(E), but it is more commonly referred to by its shortened name, C99. The standard is maintained by the International Organization for Standardization (ISO) and the full standard is available for purchase online. Older versions of the standard, such as C89 (the predecessor to C99, released in 1989 and also known as ANSI C), are no longer in common usage and are largely encompassed by the later standard. The standard documentation is very technical, and details almost every part of the language.

It's often more interesting to note what the C standard does not define. Most importantly, the standard needs to be appropriate for every architecture, both present and future. Consequently it takes care not to define areas that are architecture dependent. The "glue" between the C standard and the underlying architecture is the Application Binary Interface (or ABI), which we discuss below. In several places the standard will mention that a particular operation or construct has an unspecified or implementation dependent result. Obviously the programmer cannot depend on these outcomes if they are to write portable code.

Types

As programmers, we are familiar with using variables to represent an area of memory that holds a value. In a typed language, such as C, every variable must be declared with a type. The type tells the compiler what we expect to store in a variable; the compiler can then both allocate sufficient space for this usage and check that the programmer does not violate the rules of the type. Below we see the space allocated for some common types of variables.

Figure 2.2. Types
The processor sees memory only as a row of bytes. Adding types to variables helps the compiler ensure that code is acting correctly. The figure illustrates some common types and how they map to memory.

The C99 standard purposely only mentions the smallest possible size of each of the types defined for C. This is because across different processor architectures and operating systems the best size for types can be wildly different.

To be completely safe, programmers should never assume the size of any of their variables. However, a functioning system obviously needs agreement on what sizes the types are going to be. Each architecture and operating system conforms to an Application Binary Interface or ABI. The ABI for a system fills in the details between the C standard and the requirements of the underlying hardware and operating system. An ABI is written for a specific processor and operating system combination.

Table 2.13. Standard Integer Types and Sizes
Type        C99 minimum size (bits)     Common size (32 bit architecture)
char        8                           8
short       16                          16
int         16                          32
long        32                          32
long long   64                          64
Pointers    Implementation dependent    32

Above we can see that the only divergence from the standard minimums is that int is commonly a 32 bit quantity, which is twice the strict minimum 16 bit size that C99 requires.

Pointers are really just addresses (i.e. their value is an address and thus "points" somewhere else in memory). A pointer therefore needs to be large enough to address any memory in the system.

Type qualifiers

The C standard also talks about some qualifiers for variable types. For example, const means that a variable will never be modified from its original value, and volatile tells the compiler that the value might change outside the normal program execution flow, so the compiler must be careful not to re-order or optimise away accesses to it.

signed and unsigned are probably the two most important qualifiers; they say whether a variable can take on a negative value or not. We examine this in more detail below.

Qualifiers are all intended to pass extra information to the compiler about how the variable will be used. This means two things: the compiler can check whether you are violating your own rules (e.g. writing to a const value), and it can make optimisations based upon the extra knowledge (examined in later chapters).

Standard Types

C99 realises that all these rules, sizes and portability concerns can become very confusing very quickly. To help, it provides a series of special types which specify the exact properties of a variable. These are defined in <stdint.h> and have the form qtypes_t, where q is a qualifier, type is the base type, s is the width in bits, and _t is a suffix so you know you are using the C99 defined types.

So, for example, uint8_t is an unsigned integer exactly 8 bits wide. Many other types are defined; the complete list is detailed in C99 section 7.18 or (more cryptically) in the header file. [3]

It is up to the system implementing the C99 standard to provide these types for you by mapping them to appropriate sized types on the target system; on Linux these headers are provided by the system libraries.

Types in action

Below we see an example of how types place restrictions on what operations are valid for a variable, and how the compiler can use this information to warn when variables are used incorrectly. In this code, we first assign an integer value to a char variable. Since the char variable is smaller, we lose the correct value of the integer. Further down, we attempt to assign a pointer to a char to memory we designated as an integer. This operation can be done, but it is not safe. The first example is run on a 32-bit Pentium machine, and the correct value is returned. However, as shown in the second example, on a 64-bit Itanium machine a pointer is 64 bits (8 bytes) long, but an integer is only 4 bytes long. Clearly, 8 bytes cannot fit into 4! We can attempt to "fool" the compiler by casting the value before assigning it. Note that in this case we have shot ourselves in the foot by doing this cast and ignoring the compiler warning: the smaller variable cannot hold all the information from the pointer, and we end up with an invalid address.

Example 2.2. Example of warnings when types are not matched
    $ cat types.c
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            char a;
            char *p = "hello";

            int i;

            // moving a larger variable into a smaller one
            i = 0x12341234;
            a = i;
            i = a;
            printf("i is %d\n", i);

            // moving a pointer into an integer
            printf("p is %p\n", p);
            i = p;
            // "fooling" with casts
            i = (int)p;
            p = (void*)i;
            printf("p is %p\n", p);

            return 0;
    }

    $ uname -m
    i686

    $ gcc -Wall -o types types.c
    types.c: In function 'main':
    types.c:19: warning: assignment makes integer from pointer without a cast

    $ ./types
    i is 52
    p is 0x80484e8
    p is 0x80484e8

    $ uname -m
    ia64

    $ gcc -Wall  -o types types.c
    types.c: In function 'main':
    types.c:19: warning: assignment makes integer from pointer without a cast
    types.c:21: warning: cast from pointer to integer of different size
    types.c:22: warning: cast to pointer from integer of different size

    $ ./types
    i is 52
    p is 0x40000000000009e0
    p is 0x9e0
    
                

Number Representation

Negative Values

With our modern base 10 numeral system we indicate a negative number by placing a minus (-) sign in front of it. When using binary we need to use a different system to indicate negative numbers.

There is only one scheme in common use on modern hardware, but C99 defines three acceptable methods for negative value representation.

Sign Bit

The most straightforward method is to simply say that one bit of the number indicates either a negative or positive value depending on whether it is set or not.

This is analogous to the mathematical approach of having a + and -. This is fairly logical, and some of the original computers did represent negative numbers in this way. But using binary numbers opens up some other possibilities which make the life of hardware designers easier.

Notice, however, that the value 0 now has two equivalent representations: one with the sign bit set and one without. Sometimes these are referred to as -0 and +0 respectively.

Ones' Complement

Ones' complement simply applies the not operation to the positive number to represent the negative number. So, for example, the value -90 (-0x5A) is represented by ~01011010 = 10100101. [4]

With this scheme the biggest advantage is that to add a negative number to a positive number no special logic is required, except that any additional carry left over must be added back to the final value. Consider

Table 2.15. Ones' Complement Addition
Decimal   Binary         Op
-90         10100101     +
100         01100100
---       ----------
 10       1 00001001     9
            00001010     10

If you add the bits one by one, you find you end up with a carry bit out of the top position, giving the nine bit sum 1 00001001. By adding this carry back into the low eight bits (00001001 + 1) we end up with the correct value, 00001010, or 10.

We still have the problem of zero having two representations. No modern computer uses ones' complement, mostly because there is a better scheme.

Two's Complement

Two's complement is just like ones' complement, except the negative representation has one added to it and we discard any leftover carry bit. So, to continue with the example from before, -90 would be ~01011010 + 1 = 10100101 + 1 = 10100110.

This means there is a slightly odd asymmetry in the numbers that can be represented. For example, with an 8 bit integer we have 2^8 = 256 possible values; with our sign bit representation we could represent -127 through 127, but with two's complement we can represent -128 through 127. This is because we have removed the problem of having two zeros; consider that "negative zero" is (~00000000 + 1) = (11111111 + 1) = 00000000 (note the discarded carry bit).

Table 2.16. Two's Complement Addition
Decimal   Binary       Op
-90       10100110     +
100       01100100
---       --------
 10       00001010

You can see that by implementing two's complement hardware designers need only provide logic for addition circuits; subtraction can be done by two's complement negating the value to be subtracted and then adding the new value.

Similarly you could implement multiplication with repeated addition and division with repeated subtraction. Consequently two's complement can reduce all simple mathematical operations down to addition!

All modern computers use two's complement representation.

Sign-extension

Because of the two's complement format, when increasing the size of a signed value it is important that the additional bits be sign-extended; that is, copied from the top bit of the existing value.

For example, the value of a 32-bit int -10 would be represented in two's complement binary as 11111111111111111111111111110110. If one were to cast this to a 64-bit long long int, we would need to ensure that the additional 32 bits were set to 1 to maintain the same sign and value as the original.

Thanks to two's complement, it is sufficient to take the top bit of the existing value and set all the added bits to this value. This process is referred to as sign-extension and is usually handled by the compiler in situations defined by the language standard, with the processor generally providing special instructions to take a value and sign-extend it to some larger value.

Floating Point

So far we have only discussed integers, or whole numbers; the class of numbers that can represent values with a fractional part is called floating point.

To create a decimal number, we require some way to represent the concept of the decimal place in binary. The most common scheme for this is known as the IEEE-754 floating point standard, so named because the standard is published by the Institute of Electrical and Electronics Engineers. The scheme is conceptually quite simple and is somewhat analogous to "scientific notation".

In scientific notation the value 123.45 might commonly be represented as 1.2345x10^2. We call 1.2345 the mantissa or significand, 10 is the radix and 2 is the exponent.

In the IEEE floating point model, we break up the available bits to represent the sign, mantissa and exponent of a decimal number. A decimal number is represented by sign × significand × 2^exponent.

The sign bit selects a multiplier of either 1 or -1. Since we are working in binary, we always have the implied radix of 2.

There are differing widths for a floating point value; below we examine only a 32 bit value. More bits allow greater precision.

Table 2.17. IEEE Floating Point
Sign   Exponent   Significand/Mantissa
S      EEEEEEEE   MMMMMMMMMMMMMMMMMMMMMMM

The other important factor is the bias of the exponent. The exponent needs to be able to represent both positive and negative values, so an implied value of 127 is subtracted from the stored exponent field. For example, an exponent of 0 is stored as an exponent field of 127; a field of 128 represents 1 and a field of 126 represents -1.

Each bit of the significand adds a little more precision to the values we can represent. Consider the scientific notation representation of the value 198765. We could write this as 1.98765x10^5, which corresponds to the representation below.

Table 2.18. Scientific Notation for 1.98765x10^5
10^0   .   10^-1   10^-2   10^-3   10^-4   10^-5
1      .   9       8       7       6       5

Each additional digit allows a greater range of decimal values we can represent. In base 10, each digit after the decimal place increases the precision of our number by 10 times. For example, we can represent 0.0 through 0.9 (10 values) with one digit after the decimal place, 0.00 through 0.99 (100 values) with two digits, and so on. In binary, rather than each additional digit giving us 10 times the precision, we only get two times the precision, as illustrated in the table below. This means that our binary representation does not always map in a straightforward manner to a decimal representation.

Table 2.19. Significands in binary
2^0   .   2^-1   2^-2   2^-3    2^-4     2^-5
1     .   1/2    1/4    1/8     1/16     1/32
1     .   0.5    0.25   0.125   0.0625   0.03125

With only one bit of precision, our fractional precision is not very fine; we can only say that the fraction is either 0 or 0.5. If we add another bit of precision, we can now say that the decimal value is one of 0, 0.25, 0.5 or 0.75. With another bit of precision we can now represent the values 0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75 and 0.875.

Increasing the number of bits therefore allows us greater and greater precision. However, since the range of possible numbers is infinite, we will never have enough bits to represent every possible value.

For example, if we only have two bits of precision and need to represent the value 0.3, we can only say that it is closest to 0.25; obviously this is insufficient for almost any application. With the 23 bits of significand in a 32 bit float we have a much finer resolution, but it is still not enough for many applications. A double value increases the number of significand bits to 52 (it also increases the range of exponent values). Some hardware has an 80-bit extended format, with a full 64 bits of significand. 64 bits allows tremendous precision and should be suitable for all but the most demanding of applications.

Example 2.3. Floats versus Doubles
    $ cat float.c
    #include <stdio.h>

    int main(void)
    {
            float a = 0.45;
            float b = 8.0;

            double ad = 0.45;
            double bd = 8.0;

            printf("float+float, 6dp    : %f\n", a+b);
            printf("double+double, 6dp  : %f\n", ad+bd);
            printf("float+float, 20dp   : %10.20f\n", a+b);
            printf("double+double, 20dp : %10.20f\n", ad+bd);

            return 0;
    }

    $ gcc -o float float.c

    $ ./float
    float+float, 6dp    : 8.450000
    double+double, 6dp  : 8.450000
    float+float, 20dp   : 8.44999998807907104492
    double+double, 20dp : 8.44999999999999928946

    $ python
    Python 2.4.4 (#2, Oct 20 2006, 00:23:25)
    [GCC 4.1.2 20061015 (prerelease) (Debian 4.1.1-16.1)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 8.0 + 0.45
    8.4499999999999993

A practical example is illustrated above. Notice that for the default 6 decimal places of precision given by printf both answers are the same, since both round to the same value. However, when asked to give the results to a larger precision, in this case 20 decimal places, we can see the results start to diverge. The code using doubles has a more accurate result, but it is still not exactly correct. We can also see from the Python session that programmers who never explicitly declare a float value still run into the same precision problems!



[3] Note that C99 also has portability helpers for printf. The PRI macros in <inttypes.h> can be used as specifiers for types of specified sizes. Again see the standard or pull apart the headers for full information.

[4] The ~ operator is the C language operator to apply NOT to the value. It is also occasionally called the ones' complement operator, for obvious reasons now!