The C++ Programming Language – Part 6

This post will be mostly for personal reference as I go through Bjarne Stroustrup’s “The C++ Programming Language” 4th edition textbook. Some of the notes will appear random.

Chapter 6 – Types and Declarations

C++ is defined by an ISO standard. What that means is that the language has a well-defined specification; the programmer knows what can and can’t be relied on. It is still possible to write terrible code with C++, even though the language itself is standardized. Lots of important nuances are “implementation-defined” by the standard, meaning that each implementation must provide a well-defined behavior for the construct and must document it. Others are “unspecified”, meaning there is a range of acceptable behaviors and the implementation does not have to say which one it picks. An example of unspecified behavior is the exact value returned by “new”: the standard doesn’t say which heap address you’ll get. Lastly there is “undefined” behavior. Once undefined behavior is triggered, the standard places no requirements on what happens.

This chapter gives a basic overview of the C++ types, most of which I will not be documenting here. One notable exception is the wchar_t type, which is designed to hold wider character set encodings like Unicode. The size of wchar_t is implementation-defined and large enough to hold the largest character set supported by the implementation’s locale. The _t notation is usually used to indicate an alias, but in this case wchar_t is a distinct type.

Whether plain char behaves like unsigned char or signed char is also implementation-defined. In the code sample below, relying on that choice makes the result differ between implementations (it is implementation-defined rather than truly undefined):

#include <iostream>

using namespace std;

int main()
{
    char c = 255;   // 0xFF; does not fit if char is signed
    int i = c;      // implementation-defined: 255 if char is unsigned, typically -1 if signed
    cout << c << endl;
    cout << i << endl;
    return 0;
}

Output:

user@ubuntu:~/cpp/part_2/chapter_6$ ./example1
�
-1

If you run this on a machine that defines char as unsigned char, you’ll see 255. On the other hand, on a machine where char is signed char, as in the output above, you’ll see -1.

“Attempts to ensure that some values are positive by declaring variables as unsigned will typically be defeated by the implicit conversion rules.” Unlike plain chars, plain ints are always signed. Using void* as a return type in a function declaration says that the function returns a pointer to an object of unknown type. The reason for providing many different types (long int, long long int, unsigned long long int, etc) is to allow the programmer to take advantage of hardware features.

sizeof is an operator rather than a function; it returns the size as a multiple of the size of a char, typically an 8-bit byte. size_t is provided by cstddef; it’s an implementation-defined unsigned integer type that can hold the size in bytes of every object. “The type’s size is chosen so that it can store the maximum size of a theoretically possible array of any type.” Another notable type in cstddef is ptrdiff_t, a signed type used to hold the result of subtracting two pointers to get a number of elements. For example:

#include <iostream>
#include <cstddef>

using namespace std;

int main()
{
    // used for array size
    const size_t N = 10;

    // allocate a 10 int array 
    int* a = new int[N];

    // get pointer to end of array
    int* end = a + N;

    // count i down from N; end - i walks forward from a[0] to a[N - 1]
    for (ptrdiff_t i = N; i > 0; --i) {

        // store i in the element that end - i points to, then print it
        cout << (*(end - i) = i) << ' ';

    }
    
    delete[] a;

    return 0;
}

Output:

user@ubuntu:~/cpp/part_2/chapter_6$ ./ptrdiff 
10 9 8 7 6 5 4 3 2 1

An object doesn’t just need enough space to hold its data. On many architectures the bytes that hold it must also be properly aligned, which can mean extra padding bytes. The reason for alignment is efficiency of access; lots of hardware is designed to access data in chunks. Typically you won’t come across issues here, but it’s very important to keep in mind when trying to write high-performance code. You can view the alignment of built-in and user-defined types with alignof():

#include <iostream>

using namespace std;

struct T {
    int a;
    double b;
    char c;
};

int main()
{
    auto ac = alignof(char);
    auto ai = alignof(int);
    auto at = alignof(T);

    cout << "char alignment: " << ac << endl;
    cout << "int alignment: " << ai << endl;
    cout << "T alignment: " << at << endl;

    return 0;
}

Output:

user@ubuntu:~/cpp/part_2/chapter_6$ ./align 
char alignment: 1
int alignment: 4
T alignment: 8

Declarations typically consist of: <optional prefix specifier> <base type> <declarator with optional name> <optional suffix> <optional initializer>. A few examples:

const char* cars[] = { "MX-5", "Civic", "Range" };
int x = 8;
volatile int* p;

Postfix declarator operators bind more tightly than prefix ones:

char *kings[];  // array of pointers to char
char (*kings)[]; // pointer to an array of char

Old C/C++ used to assume int was the type when no type was given. This was widely regarded as a bad move and was removed.

Scope is very important in C++, as it is in all languages. The currently existing scopes are local scope, class scope, namespace scope, global scope, statement scope, and function scope.

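Local scope: A name declared in a function or lambda. It is visible from its declaration to the end of the enclosing {} aka a block. Function and lambda parameters count as local names.
Class scope: A member name, defined in a class outside any member function. Exists between the {} of the class declaration.
Namespace scope: A name defined in a namespace outside any function, lambda, or class. For example, after using namespace std, an unqualified cout finds std::cout.
Global scope: Defined outside of any function, class, or namespace. Technically, in the global namespace.
Statement scope: A name declared within the () part of a for, while, if, or switch statement. It is visible until the end of that statement.
Function scope: Applies to labels, which are visible throughout the function in which they are declared.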

There are multiple ways to initialize variables in C++. The most error-resistant way, introduced in C++11, is list initialization with {}. Using this syntax prevents narrowing and helps developers avoid insidious bugs. With {} initialization, an integer cannot be converted to another integer type that can’t hold its value, a floating-point value cannot be converted to a floating-point type that can’t hold its value, a floating-point value cannot be converted to an integer type, and an integer cannot be converted to a floating-point type.

Don’t forget to always initialize your variables. Global, namespace, local static, and static member variables are initialized by default; for example, int a; in the global namespace is interpreted as int a{}; and set to zero. Stack and heap variables of built-in types are not initialized by default, though. Some user-defined types have default initialization, but you should not rely on that unless you deeply understand the type you’re using or wrote it yourself.

X a1 {v};   // use this
X a2 = {v}; // use this if first one is unavailable
X a3 = v;   // risky
X a4(v);    // risky