The C++ Programming Language – Part 7

This post will be mostly for personal reference as I go through Bjarne Stroustrup’s “The C++ Programming Language” 4th edition textbook. Some of the notes will appear random.

Chapter 6 – Types and Declarations (continued)

Objects can be classified based on their lifetimes.

Automatic: Created when its definition is encountered and destroyed when its name goes out of scope. Typically, these are function-local objects located on the stack. Automatic is one of the storage classes.

Static: Objects declared in global or namespace scope, plus statics declared in functions or classes. These are created and initialized once and exist until the program terminates. Static is also a storage class.

Free store: Variable lifetime, depending on the programmer's needs. These objects are created with new and live until they are destroyed with delete. Pointers to them can be passed from function to function, and the objects are manually destroyed when no longer needed. This type of object is a common source of bugs.

Temporary: Intermediate results that are needed during a computation. Lifetime is determined by use: unless it is bound to a reference, a temporary lives until the end of the full expression it is part of. Temporaries are typically automatic objects.

Thread-local: Created when their thread starts, destroyed when their thread ends. These are declared with the thread_local specifier.

Type aliases are useful when we need a new name for a type. There are a few main reasons for this new name:

Original name is too long and would clutter the code.
A programming technique requires different types to have the same name in a context (?).
A specific type is mentioned in one place only to simplify maintenance.

Type specifiers such as unsigned cannot be applied to an alias; for example, given using Char = char;, writing unsigned Char is an error. A few examples of type aliases and their typedef equivalents:

#include <iostream>

using namespace std;

int main()
{
    /* pointer to character */
    using ptr_to_char = char*;

    /* pointer to function taking a double and returning an int */
    using ptr_to_func = int(*)(double);

    typedef int int32_t;
    /* same as */ 
    using int32_t = int;
    
    typedef short int16_t;
    /* same as */ 
    using int16_t = short;
    
    return 0;
}

Rounding out this chapter are some solid “Advice” excerpts:

Avoid unspecified and undefined behavior.

Avoid assumptions like "the size of an int is 4 bytes".

Avoid unnecessary assumptions about the range and precision of floating point types.

Prefer plain char over signed char and unsigned char.

Beware of conversion between signed and unsigned types.

Name an object to reflect its meaning rather than its type.

Use small functions.

Avoid uninitialized variables.

Do code review.

Chapter 7 – Pointers, Arrays, and References

This chapter deals with the basic language mechanisms for referring to memory. A * in a declaration denotes a pointer, and a pointer holds the address of an object. At the lowest level, object members are accessed via offsets from this pointer. In assembly language, you’ll often see things like [rax+0x8], meaning “take the address in rax, add 0x8, and dereference the result,” which could be the ‘name’ field in a user-defined object, for example. There are many possibilities with the ‘*’ aka ‘pointer to’ declarator operator. You can have pointers to pointers, arrays of pointers, pointers to functions, and more. You can use cdecl.org to decode this syntax as things get confusing. For example, inputting “int (*fp)(char*);” to cdecl.org results in “declare fp as pointer to function (pointer to char) returning int.” In the code below, you’ll see two arrays of int pointers. Notice how one is not initialized, so its elements hold whatever was left on the stack (possibly from previous function calls in a real-world codebase). Reading those indeterminate values is undefined behavior, so if one of those “uninitialized” values is used, good luck.

#include <iostream>

using namespace std;

int main()
{
    /* pointer to int */
    int *a;

    /* pointer to a pointer to a char */
    /* b -> ? -> char */
    char **b;

    /* array of 10 int pointers, uninitialized */
    int *dirty_c[10];

    /* array of 10 int pointers, initialized to null with braces */
    int *clean_c[10]{};

    /* a pointer to an int on the heap, and a plain int on the stack */
    int *sample = new int(5);
    int sample2{10};

    /* place int pointers in improperly initialized pointer array */
    dirty_c[1] = sample;
    dirty_c[8] = &sample2;

    /* place int pointers in properly initialized pointer array */
    clean_c[1] = sample;
    clean_c[8] = &sample2;

    cout << "\ndirty_c: sample is at heap memory address " << dirty_c[1] << endl;
    cout << "dirty_c: sample dereferenced is " << *dirty_c[1] << endl;
    cout << "dirty_c: sample2 is at stack memory address " << dirty_c[8] << endl;
    cout << "dirty_c: sample2 dereferenced is " << *dirty_c[8] << endl;
    cout << "dirty_c: pointer array dirty_c contains: " << endl;
    for (int i = 0; i < 10; i++) {
        cout << dirty_c[i] << endl;
    }

    cout << "\nclean_c: sample is at heap memory address " << clean_c[1] << endl;
    cout << "clean_c: sample dereferenced is " << *clean_c[1] << endl;
    cout << "clean_c: sample2 is at stack memory address " << clean_c[8] << endl;
    cout << "clean_c: sample2 dereferenced is " << *clean_c[8] << endl;
    cout << "clean_c: pointer array clean_c contains: " << endl;
    for (int i = 0; i < 10; i++) {
        cout << clean_c[i] << endl;
    }

    /* pointer to an array of 10 ints */
    int (*d)[10];

    /* pointer to a function taking a char pointer and returning an int */
    int (*fp)(char*);

    /* function returning an integer pointer taking a char pointer as argument */
    int *f(char*);

    /* release the heap allocation made above */
    delete sample;

    return 0;
}

Output:

user@ubuntu:~/cpp/part_2/chapter_7$ g++ -std=c++11 -o ptr_sample2 ptr_sample2.cpp 
user@ubuntu:~/cpp/part_2/chapter_7$ ./ptr_sample2 

dirty_c: sample is at heap memory address 0x1573c20
dirty_c: sample dereferenced is 5
dirty_c: sample2 is at stack memory address 0x7ffc6645167c
dirty_c: sample2 dereferenced is 10
dirty_c: pointer array dirty_c contains: 
0x602078
0x1573c20
0x400820
0x7fc3b2e7c9a0
0x7fc3b2e71660
0x400820
0x602078
0x7fc3b275d299
0x7ffc6645167c
0x7ffc66451700

clean_c: sample is at heap memory address 0x1573c20
clean_c: sample dereferenced is 5
clean_c: sample2 is at stack memory address 0x7ffc6645167c
clean_c: sample2 dereferenced is 10
clean_c: pointer array clean_c contains: 
0
0x1573c20
0
0
0
0
0
0
0x7ffc6645167c
0

At this point, it is also interesting to note the usage of **argv vs *argv[] when defining main(). The two styles mean the same thing. In both cases, argv points to the first element of a null-terminated array of C strings passed via the command line. Operator precedence (see cppreference) explains how to read the second form: the rank 2 subscript operator [] binds more tightly than the rank 3 indirection operator *, so char *argv[] declares an array of pointers to char rather than a pointer to an array of char. The two forms are equivalent because an array cannot be passed by value: a function parameter declared as an array of T is adjusted to a pointer to T, so char *argv[] as a parameter is exactly char **argv. The same idea shows up in expressions, where the name of an array converts to a pointer to its first element. Either way, argv points to argv[0], which is the first command line argument (the program name), and argv[argc] is a null pointer.

A void pointer is used to pass or store the address of a memory location without knowing what is stored there. In Stroustrup’s words, it is a “pointer to an object of an unknown type.” void* is primarily used for passing pointers to functions that are not allowed to make assumptions about the type of the object and for returning untyped objects from functions. To use such an object, we must use explicit type conversion. Uses of void* should be highly scrutinized in code review, as they are most likely an indication of bad design decisions. nullptr is a literal that represents the null pointer. It is a keyword rather than a library name; its type, std::nullptr_t, is declared in <cstddef>. The predecessor to nullptr was simply to use zero. Avoid NULL, since its definition is implementation-defined and may be a plain integer.

The C++ Programming Language – Part 6

This post will be mostly for personal reference as I go through Bjarne Stroustrup’s “The C++ Programming Language” 4th edition textbook. Some of the notes will appear random.

Chapter 6 – Types and Declarations

C++ adheres to a standard. What that means is that the language has a well-defined specification; the programmer knows what he can and can’t rely on. It is still possible to write terrible code with C++, even though the implementation is compliant with the ISO standard. Lots of important nuances are “implementation-defined” by the standard, meaning that each implementation must provide a well-defined behavior for the construct and that the behavior must be documented. Others are “unspecified,” meaning there is a range of acceptable behaviors and the implementation need not document which one it picks. An example of unspecified behavior is the exact value returned by “new” … the standard does not say which heap address you’ll get. Lastly there is “undefined” behavior. Once undefined behavior is triggered, the standard places no requirements on what will happen. This chapter gives a basic overview of the C++ types, most of which I will not be documenting here. One notable exception is the wchar_t type, which is designed to hold wider character set encodings like Unicode. The size of wchar_t is implementation-defined and large enough to hold the largest character set supported by the implementation’s locale. The _t suffix is usually used to indicate an alias, but in this case wchar_t is a distinct type. Whether plain char behaves like unsigned char or signed char is also implementation-defined. In the code sample below, relying on that implementation-defined choice gives a result that is not portable:

#include <iostream>

using namespace std;

int main()
{
    char c = 255;   // 0xFF
    int i = c;      // value depends on whether char is signed: 255 or -1
    cout << c << endl;
    cout << i << endl;
    return 0;
}

Output:

user@ubuntu:~/cpp/part_2/chapter_6$ ./example1
�
-1

If you run this on a machine that defines char as unsigned, you’ll see 255. On a machine where char is signed, you’ll typically see -1. “Attempts to ensure that some values are positive by declaring variables as unsigned will typically be defeated by the implicit conversion rules.” Unlike plain chars, plain ints are always signed. Using void* as a return type in a function declaration says that the return type is a pointer to an object of unknown type. The reason for providing many different types (long int, long long int, unsigned long long int, etc) is to allow the programmer to take advantage of hardware features. The sizeof operator returns the size as a multiple of the size of a char, typically an 8-bit byte. size_t is provided by <cstddef>; it’s an implementation-defined unsigned integer type that can hold the size in bytes of every object. “The type’s size is chosen so that it can store the maximum size of a theoretically possible array of any type.” Another notable type in <cstddef> is ptrdiff_t, a signed integer type used to hold the result of subtracting two pointers to get a number of elements. For example:

#include <iostream>
#include <cstddef>

using namespace std;

int main()
{
    // used for array size
    const size_t N = 10;

    // allocate a 10 int array 
    int* a = new int[N];

    // get pointer to end of array
    int* end = a + N;

    // count i down from N; (end - i) walks forward from a[0] to a[N-1]
    for (ptrdiff_t i = N; i > 0; --i) {

        // for each pointer, set the value to i
        cout << (* (end - i) = i) << ' ';

    }
    
    delete[] a;

    return 0;
}

Output:

user@ubuntu:~/cpp/part_2/chapter_6$ ./ptrdiff 
10 9 8 7 6 5 4 3 2 1

An object doesn’t just need enough space to hold its data. On many architectures the bytes that hold it must also be properly aligned, which is why the compiler may insert padding between the members of a struct. The reason for alignment is efficiency of access; lots of hardware is designed to access data in chunks. Typically you won’t come across issues here, but it’s very important to keep in mind when trying to write high performance code. You can view the byte alignment of built-in and user-defined types with alignof():

#include <iostream>

using namespace std;

struct T {
    int a;
    double b;
    char c;
};

int main()
{
    auto ac = alignof(char);
    auto ai = alignof(int);
    auto at = alignof(T);

    cout << "char alignment: " << ac << endl;
    cout << "int alignment: " << ai << endl;
    cout << "T alignment: " << at << endl;

    return 0;
}

Output:

user@ubuntu:~/cpp/part_2/chapter_6$ ./align 
char alignment: 1
int alignment: 4
T alignment: 8

Declarations typically consist of: <optional prefix specifier> <base type> <declarator with optional name> <optional suffix> <optional initializer>. A few examples:

const char* cars[] = { "MX-5", "Civic", "Range" };
int x = 8;
volatile int* p;

Postfix declarator operators bind more tightly than prefix ones:

char *kings[];  // array of pointers to char
char (*kings)[]; // pointer to an array of char

Old C/C++ used to assume int was the type when no type was given. This was widely regarded as a bad move and was removed.

Scope is very important in C++, as it is in all languages. The currently existing scopes are local scope, class scope, namespace scope, global scope, statement scope, and function scope.

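Local scope: A name declared in a function or lambda. Its scope extends from the point of declaration to the end of the enclosing {} block.
Class scope: A member name defined in a class, outside any function or nested class. Its scope extends from the opening { of the class declaration to its closing }.
Namespace scope: A name defined in a namespace, outside any function, lambda, or class. For example, cout lives in namespace std and is written std::cout unless a using-directive such as using namespace std; is in effect.
Global scope: Defined outside of any function, class, or namespace. Technically, in the global namespace.
Statement scope: A name declared within the () part of a for, while, if, or switch statement. Its scope extends to the end of that statement.
Function scope: Applies to labels (the targets of goto), which are in scope throughout the function they appear in.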

There are multiple ways to initialize variables in C++. The most error-resistant way, introduced in C++11, is list initialization with {}. Using this syntax prevents narrowing and helps developers avoid insidious bugs. With {} initialization, an integer cannot be converted to another integer type that can’t hold its value, a floating-point value cannot be converted to another floating-point type that can’t hold its value, a floating-point value cannot be converted to an integer type, and an integer value cannot be converted to a floating-point type. Don’t forget to always initialize your variables. Global, namespace, local static, and static member variables are initialized by default; for example, int a; in the global namespace is interpreted as int a{}; and set to zero. Stack and heap variables of built-in types are not initialized by default, though. Some user-defined types have default initialization but you should not rely on that unless you deeply understand the type you’re using or wrote it yourself.

X a1 {v};   // use this
X a2 = {v}; // use this if first one is unavailable
X a3 = v;   // risky
X a4(v);    // risky

 

The C++ Programming Language – Part 5

This post will be mostly for personal reference as I go through Bjarne Stroustrup’s “The C++ Programming Language” 4th edition textbook. Some of the notes will appear random.

Chapter 5 – Concurrency and Utilities (continued)

Threads sometimes need to wait on events to do their work. For example, threads can wait for time to pass, wait for other threads to fill data buffers, pretty much anything you can think of. The classic example is the producer/consumer model. One thread puts data into a shared queue, while the other thread processes data from it. All operations on the shared queue are protected by a mutex, acquired through a unique_lock, making sure that only one thread can use the queue at a time. In this example, there is also a condition variable that indicates to the processing thread that data is ready in the queue. Condition variables have a wait() function which takes a unique_lock and, optionally, a predicate. Here the predicate is any callable that takes no arguments and returns a value testable as a bool. wait() checks the predicate while holding the lock; if it returns false, wait() releases the mutex and blocks until notified, then reacquires the mutex and checks again. In the example below, the consumer thread will not get past wait() until the predicate returns true aka the queue is not empty.

// g++ -pthread -std=c++11 -o cond_var cond_var.cpp
#include <thread>
#include <iostream>
#include <mutex>
#include <queue>
#include <cstdbool>
#include <chrono>
#include <condition_variable>

using namespace std;

// globals to allow threads to work together
queue<int> mqueue;
condition_variable mcond;
mutex mmutex;

void producer()
{
    for (int i = 100; i < 105; i++) {
        // Acquire the lock.
        unique_lock<mutex> lck {mmutex};
        mqueue.push(i);
        cout << "producer - Pushed data to shared queue." << endl;
        // Unblock waiting threads.
        mcond.notify_one();
        cout << "producer - Notifying consumer and sleeping." << endl;
        // Unlock explicitly before sleeping. The scoped unlock only happens at
        // the end of the loop body, which would keep the lock held during the sleep.
        lck.unlock();
        this_thread::sleep_for( chrono::milliseconds(1000) );
    }
}

void consumer()
{
    while(true) {
        // Acquire the lock. Example of RAII.
        unique_lock<mutex> lck {mmutex};
        // Waiting on the condition variable releases the mutex while blocked,
        // reacquires it on wake-up, and returns once the predicate is true.
        mcond.wait(lck, []() { return !mqueue.empty(); });
        auto m = mqueue.front();
        mqueue.pop();
        lck.unlock();
        cout << "consumer - Got " << m << " from shared queue." << endl;
        cout << "consumer - Waiting for notify." << endl;
    }
}

int main(int argc, char *argv[])
{
    thread prod(producer);
    thread cons(consumer);

    prod.join();
    cons.join();

    return 0;
}

Output:

user@ubuntu:~/cpp/part_1/chapter_5$ ./cond_var 
producer - Pushed data to shared queue.
producer - Notifying consumer and sleeping.
consumer - Got 100 from shared queue.
consumer - Waiting for notify.
producer - Pushed data to shared queue.
producer - Notifying consumer and sleeping.
consumer - Got 101 from shared queue.
consumer - Waiting for notify.
producer - Pushed data to shared queue.
producer - Notifying consumer and sleeping.
consumer - Got 102 from shared queue.
consumer - Waiting for notify.
producer - Pushed data to shared queue.
producer - Notifying consumer and sleeping.
consumer - Got 103 from shared queue.
consumer - Waiting for notify.
producer - Pushed data to shared queue.
producer - Notifying consumer and sleeping.
consumer - Got 104 from shared queue.
consumer - Waiting for notify.
^C

A type function is a function that is evaluated at compile time, given a type as its argument or returning a type. For example, one can compute the smallest positive normalized float on a system, or find the byte width of standard types. This is an example of metaprogramming.

// g++ -std=c++11 -o smallfloat smallfloat.cpp
#include <iostream>
#include <limits>

using namespace std;

int main()
{   
    constexpr float min = numeric_limits<float>::min();
    cout << "smallest float on this system at compile time: " << min << endl;

    constexpr int szi = sizeof(int);
    cout << "int width in bytes on this system at compile time: " << szi << endl;

    return 0;
}

Output:

user@ubuntu:~/cpp/part_1/chapter_5$ ./smallfloat 
smallest float on this system at compile time: 1.17549e-38
int width in bytes on this system at compile time: 4

“Random numbers are useful in many contexts, such as testing, games, simulation, and security. The diversity of application areas is reflected in the wide selection of random number generators provided by the standard library in <random>.” There are two parts to a random number generator: the engine and the distribution. The engine outputs the values while the distribution maps the values into a range. Some included distributions: uniform_int_distribution, normal_distribution, and exponential_distribution. There are all kinds of possibilities with this <random> library.

// g++ -std=c++11 -o rando rando.cpp
#include <random>
#include <functional>
#include <chrono>
#include <iostream>

using namespace std;

int main()
{   
    default_random_engine engine {};
    // Seed the engine so it makes new numbers each time.
    engine.seed(std::chrono::system_clock::now().time_since_epoch().count());
    // Set the value map.
    uniform_int_distribution<> distro {1, 100};
    // Bind the engine and map together. Provided by <functional>
    // bind() makes a function object that will invoke its first argument
    // given its second argument as its argument. So, distro(engine);
    auto die = bind(distro, engine);
    for (int i = 0; i < 10; i++) {
        cout << die() << endl;
    }

    return 0;
}

It is necessary to seed the generator. Otherwise, it will output the same “random” numbers each run. Output:

user@ubuntu:~/cpp/part_1/chapter_5$ ./rando 
1
70
83
69
84
73
45
47
71
50
user@ubuntu:~/cpp/part_1/chapter_5$ ./rando 
67
99
64
61
93
62
94
82
29
49

 

The C++ Programming Language – Part 4

This post will be mostly for personal reference as I go through Bjarne Stroustrup’s “The C++ Programming Language” 4th edition textbook. Some of the notes will appear random.

Chapter 4 – Containers and Algorithms (continued)

A standard library map is a search tree data structure. It is a series of key, value pairs, typically implemented as a balanced binary tree. This is also known as a dictionary, optimized for lookup and providing O(log n) lookup performance. This fast lookup is possible because when searching a balanced binary tree, the search space is halved each time a node is traversed. This continued halving is essentially the reverse of exponential growth, which is the logarithm. The unordered_map is a hashed lookup version of a map. A hash table maps keys to values using a hashing function, giving O(1) lookup on average, which narrows the search space down very quickly even compared to a binary tree. “C++ standard library hashed containers are referred to as ‘unordered’ because they don’t require an ordering function.” This graph from MICHAEL KAZAKOV’S QUIET CORNER shows the performance difference:

There are many useful containers in the standard library. The unordered ones are designed for fast key lookup.

vector<T> - variable sized vector
list<T> - doubly linked list
forward_list<T> - singly linked list
deque<T> - double ended queue
queue<T> - first in first out data structure
priority_queue<T> - queue in which the first element is always the greatest according to some criterion. this is implemented on top of another container
stack<T> - last in first out data structure
set<T> - stores unique elements in a specific order. elements are immutable
multiset<T> - set allowing recurring values
map<K,V> - associative array (key values aka dictionary)
multimap<K,V> - same as map with added ability to have one key with multiple values
unordered_map<K,V> - hashed lookup map vs binary tree lookup
unordered_multimap<K,V> - hashed multimap
unordered_set<T> - hashed set
unordered_multiset<T> - hashed multiset
bitset<N> - fixed-size sequence of N bits
array<T, N> - fixed-size sequence of N elements stored contiguously

Reinforcing the concept of iterators, we can use them to write to stdout! When an ostream_iterator is constructed with cout as its stream, every assignment through the iterator inserts a new element into the stream. Of course, the stream in this case is stdout:

#include <iostream>
#include <iterator>
#include <string>

using namespace std;

int main()
{
    ostream_iterator<string> oo {cout};
    *oo = "Hello, ";
    ++oo; 
    *oo = "world.\n";
}

Another way to use iterators for input and output (thanks to a buddy for this code segment):

#include <iostream>
#include <algorithm>
#include <vector>
#include <iterator>

using namespace std;

int main()
{
    istream_iterator<int> ii {cin};
    ostream_iterator<int> oo {cout};

    /* Write a vector to cout via iterator */
    vector<int> v {1, 3, 5, 7, 9};
    copy(v.begin(), v.end(), oo);

    /* Read 5 elements into the vector via iterator */
    copy_n(ii, 5, back_inserter(v));

    /* Write it back out to cout */
    copy(v.begin(), v.end(), oo);
}

Output (the istream_iterator typically reads its first value when it is constructed, so you must provide input before anything prints):

user@ubuntu:~/cpp/part_1/chapter_4$ ./ostream_iterator 
1
135792
3
4
5
1357912345

Chapter 5 – A Tour of C++ Concurrency and Utilities

The standard library aims to serve the intersection of needs rather than the union. The result of this decision is a generally useful set of well-designed tools. C++ resource management covers the allocation and deallocation of things that are acquired at runtime. In the standard library, resources follow the “Resource Acquisition Is Initialization” also called RAII. Defined so eloquently on cppreference.com, RAII “binds the lifecycle of a resource that must be acquired before use to the lifetime of the object.” This is fundamental to the design of C++. It guarantees that having access to an object means that the necessary associated resources are also ready to be used. Additionally, when the object is released (either manually or by some other mechanism like going out of function scope), all associated resources are also released. Objects, in this sense, come in an “all or nothing” package.

unique_ptr and shared_ptr are smart pointers that help manage objects on the heap. When the unique or shared pointer is forgotten about by a careless programmer, C++ covers his butt: the smart pointer’s destructor frees the resource when the pointer goes out of scope. This is essentially programming training wheels. A unique_ptr can also be returned from functions since it’s a handle to an individual object. unique_ptrs are moved, never copied, so exactly one unique_ptr owns the object at any time. This is in contrast to shared_ptrs, which are copied. The object pointed to by multiple shared_ptrs will only be deallocated once the reference count hits zero. For example:

// g++ -std=c++11 -o cpp_ptrs cpp_ptrs.cpp

#include <iostream>
#include <memory> // unique_ptr and shared_ptr

using namespace std;

int main(int argc, char **argv)
{
    shared_ptr<float> sample_a (new float(5.4551));

    cout << "shared_ptr sample_a ref count " << sample_a.use_count() << endl;

    shared_ptr<float> sample_b(sample_a);

    cout << "shared_ptr sample_a ref count " << sample_a.use_count() << endl;
    cout << "shared_ptr sample_b ref count " << sample_b.use_count() << endl;
    // The pointers now share the same raw pointer.
    // Thus each has ref count of 2 since they point to same place.

    // Release sample_b's share of ownership; the float itself stays alive.
    sample_b.reset();
    // The raw pointer will not yet be deallocated, since there is still a ref to it by sample_a
    cout << "shared_ptr sample_a ref count " << sample_a.use_count() << endl;
    cout << "shared_ptr sample_b ref count " << sample_b.use_count() << endl;
    // Release sample_a's share; it was the last owner, so the float is deleted.
    sample_a.reset();

    // All memory is freed since ref count is zero.
    cout << "shared_ptr sample_a ref count " << sample_a.use_count() << endl;
    cout << "shared_ptr sample_b ref count " << sample_b.use_count() << endl;

    return 0;
}

Output:

user@ubuntu:~/cpp/part_1/chapter_5$ ./cpp_ptrs 
shared_ptr sample_a ref count 1
shared_ptr sample_a ref count 2
shared_ptr sample_b ref count 2
shared_ptr sample_a ref count 1
shared_ptr sample_b ref count 0
shared_ptr sample_a ref count 0
shared_ptr sample_b ref count 0

Smart pointers are a form of destructor-based garbage collection and relieve the programmer of many memory leaks.

Concurrency is the execution of several tasks simultaneously. This is a bit of a fallacy on uniprocessor systems (and in Python, due to the GIL aka Global Interpreter Lock). Instructions are interleaved on single processors, not actually executed in parallel. Multiprocessor systems are a different game and do offer true concurrency. If those concurrently executing threads operate on the same data, beware. The C++ standard library directly supports concurrent execution in multiple threads in a single address space. These threads have their own stack, but share a heap. Threads are the most commonly used concurrency tool. For example:

#include <chrono>
#include <thread>
#include <iostream>

using namespace std;

void func(int x)
{
    for (int i = 0; i < 50; i++) {
        // There is a bad mistake here. 
        // Access to the stdout ostream is not controlled.
        // Output will be erratic.
        cout << "Thread " << x << endl;
        // Take a thread nap.
        this_thread::sleep_for(chrono::milliseconds(50));
    }
}

int main(int argc, char *argv[])
{
    // Create four threads.
    // Thread constructor takes the callback function, followed by args to that function.
    thread t1( func, 1 );
    thread t2( func, 2 );
    thread t3( func, 3 );
    thread t4( func, 4 );

    // Wait for threads to finish execution before leaving the main() thread.
    t1.join();
    t2.join();
    t3.join();
    t4.join();

    return 0;
}

Output:

...
Thread 2
Thread 3
Thread Thread 4
Thread 1
2
Thread 3
Thread 1
Thread 4
Thread 3
Thread 2
Thread 4
...

Can you spot the bug? This code contains a synchronization issue with the cout ostream. Nugget fact: the “c” in “cout” stands for “character”, so “cout” is “character output”. In this case, if the goal is to write cleanly to console, each thread should lock a mutex shared by all threads before writing to the output stream.

The C++ Programming Language – Part 3

This post will be mostly for personal reference as I go through Bjarne Stroustrup’s “The C++ Programming Language” 4th edition textbook. Some of the notes will appear random. This post will cover mostly the vector container. More containers and some algorithms will be featured in part 4.

Chapter 4 – Containers and Algorithms

The C++ standard library provides standard types and takes up 2/3rds of the ISO C++ standard. It provides facilities for runtime support like allocations, the C standard library, strings, io streams, containers, algorithms, numerical computation, regex matching, concurrency, template programming, smart pointers, and more. In short, the C++ stdlib contains common fundamental data structures along with algorithms used on said data structures. The primary purpose of the C++ stdlib is to provide you with a well-designed and well-tested tool set to solve problems in an efficient way. Each tool in the tool set utilizes C++ language features in near optimal/optimal ways.

Strings in C++ are mutable, as opposed to other languages like Python. You can use the [] operator to change a character of a std::string.

/* g++ -std=c++11 -o string string.cpp */

#include <stdio.h>
#include <iostream>
#include <string>

using namespace std;

string compose(const string& name, const string& domain)
{
    return name  + '@' + domain;
}

int main()
{
    auto addr = compose("yee", "badbytes.io");

    /* printf doesn't know what a C++ string is, so we must pass a C string */
    printf("%s\n", addr.c_str());

    /* cout ostream knows how to handle C++ string type */
    cout << addr[2] << endl;
}

Output:

user@ubuntu:~/cpp/part_1/chapter_4$ ./string
yee@badbytes.io
e

Input/output (io) of built-in types is straightforward. cout and cin from the iostream library can handle all built-in types. iostream also allows developers to define IO operations for their own user-defined types. For example:

/* g++ -std=c++11 -o entry entry.cpp */

#include <iostream>
#include <string>


using namespace std;


struct Entry {
    string name;
    int number;
};


/* user-defined type output */
ostream& operator<< (ostream& os, const Entry& e)
{
    return os << "{\"" << e.name << "\"," << e.number << "}" << endl;
}


/* user-defined type input */
istream& operator>> (istream& is, Entry& e)
{
    char c;
    char c2;
    if (is >> c && c == '{' && is >> c2 && c2 == '"') {
        string name;
        while (is.get(c) && c != '"') {
            name += c;
        }
        if (is >> c && c == ',') {
            int number = 0;
            if (is >> number >> c && c == '}') {
                e = {name,number};
                return is;
            }
        }
    }

    is.setstate(ios_base::failbit);
    return is;
}


int main()
{
    /*
    for (Entry ee; cin >> ee;) {
        cout << ee << endl;
    }
    */

    Entry ee;
    cin >> ee;
    cout << ee;
    return 0;
}

Output:

user@ubuntu:~/cpp/part_1/chapter_4$ ./entry 
{"badbytes", 1}
{"badbytes",1}

A container is a class with the main purpose of holding objects. Vector is the most useful standard library container. It holds a sequence of elements of a given type and provides efficient ways to operate on the held objects. Vector stores elements contiguously in memory, giving it O(1) access time since you hold a starting pointer, then access elements by offset. Iterators are provided for every standard library container. Each container exposes begin() and end() functions. begin() returns an iterator to the first element; end() returns an iterator to one-past-the-end of the container. This can be a source of bugs, since the end() iterator must never be dereferenced, so be mindful when using end() with a container iterator. Container iterators can also use unary operators like ++ and -- to traverse the objects. While looping through a container with an iterator named iter, *iter is the current container element. If dereferencing the iterator returns an object with members, then iter->member is valid. This syntax is semantically the same as (*iter).member. Example vector initialization and usage:

/* g++ -std=c++11 -o vector_init vector_init.cpp */

#include <vector>
#include <string>
#include <iostream>

using namespace std;


int main()
{
    vector<int> v1 = {1, 2, 3, 4};
    vector<string> v2;
    vector<char> v3(23);
    vector<double> v4(8, 9.9);

    cout << "v1 size: " << v1.size() << endl;
    cout << "v1[2]: " << v1[2] << endl;

    cout << "v2 size: " << v2.size() << endl;

    cout << "v3 size: " << v3.size() << endl;
    cout << "v3 vector contents: " << endl;
    for (int i = 0; i != v3.size(); ++i) {
        cout << v3[i];
    }

    cout << "v4 size: " << v4.size() << endl;
    cout << "v4 vector contents: " << endl;
    for (int i = 0; i != v4.size(); ++i) {
        cout << v4[i] << " ";
    }
    cout << endl;

    return 0;
}

Output:

user@ubuntu:~/cpp/part_1/chapter_4$ ./vector_init 
v1 size: 4
v1[2]: 3
v2 size: 0
v3 size: 23
v3 vector contents: 
v4 size: 8
v4 vector contents: 
9.9 9.9 9.9 9.9 9.9 9.9 9.9 9.9
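
The examples here index into the vector with []. The iterator interface described earlier (begin(), end(), *iter and iter->member) can be sketched like this. The function names are my own, chosen for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

/* Sum the elements by walking from begin() to end() */
int sum_with_iterators(const std::vector<int>& v)
{
    int total = 0;
    for (auto iter = v.begin(); iter != v.end(); ++iter) {
        total += *iter;          /* *iter is the current element */
    }
    return total;
}

/* Add up the lengths of all strings, using iter->member */
std::size_t total_length(const std::vector<std::string>& names)
{
    std::size_t total = 0;
    for (auto iter = names.begin(); iter != names.end(); ++iter) {
        total += iter->size();   /* same as (*iter).size() */
    }
    return total;
}
```

Note that end() is only ever compared against, never dereferenced.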

A second example using the vector container:

/* g++ -std=c++11 -ggdb -o vector_usage vector_usage.cpp */

#include <vector>
#include <string>
#include <iostream>


using namespace std;


void print_names(const vector<string>& names)
{
    for (int i = 0; i < names.size(); ++i) {
        cout << names[i] << endl;
    }
}


int main()
{
    vector<string> names = {"bad", "bytes", "yee"};
    print_names(names);
    return 0;
}

Output:

user@ubuntu:~/cpp/part_1/chapter_4$ ./vector_usage 
bad
bytes
yee

I stated earlier that the standard library vector stores elements contiguously in memory, giving it O(1) access time since you hold a starting pointer and access elements by offset. Let's head into a debugger and check that the vector class is actually storing elements contiguously. If you want to replicate this, I suggest adding set print asm-demangle on to .gdbinit. This makes gdb show cleaner function names. For example,

<_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev@plt>

becomes

<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()@plt>.

Which, let's be honest, still looks confusing, but it's better. Disassembling main():

pwndbg> disas main
...  
   0x0000000000400e9c <+41>:	call   0x400cb0 <std::allocator<char>::allocator()@plt>
   0x0000000000400ea1 <+46>:	lea    rdx,[rbp-0xb4]
   0x0000000000400ea8 <+53>:	lea    rax,[rbp-0x90]
   0x0000000000400eaf <+60>:	mov    esi,0x401945
   0x0000000000400eb4 <+65>:	mov    rdi,rax
   0x0000000000400eb7 <+68>:	call   0x400ca0 <std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)@plt>
   0x0000000000400ebc <+73>:	lea    rax,[rbp-0xb3]
   0x0000000000400ec3 <+80>:	mov    rdi,rax
   0x0000000000400ec6 <+83>:	call   0x400cb0 <std::allocator<char>::allocator()@plt>
   0x0000000000400ecb <+88>:	lea    rax,[rbp-0xb3]
   0x0000000000400ed2 <+95>:	lea    rdx,[rbp-0x90]
   0x0000000000400ed9 <+102>:	lea    rcx,[rdx+0x20]
   0x0000000000400edd <+106>:	mov    rdx,rax
   0x0000000000400ee0 <+109>:	mov    esi,0x401949
   0x0000000000400ee5 <+114>:	mov    rdi,rcx
   0x0000000000400ee8 <+117>:	call   0x400ca0 <std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)@plt>
   0x0000000000400eed <+122>:	lea    rax,[rbp-0xb2]
   0x0000000000400ef4 <+129>:	mov    rdi,rax
   0x0000000000400ef7 <+132>:	call   0x400cb0 <std::allocator<char>::allocator()@plt>
   0x0000000000400efc <+137>:	lea    rax,[rbp-0xb2]
   0x0000000000400f03 <+144>:	lea    rdx,[rbp-0x90]
   0x0000000000400f0a <+151>:	lea    rcx,[rdx+0x40]
   0x0000000000400f0e <+155>:	mov    rdx,rax
   0x0000000000400f11 <+158>:	mov    esi,0x40194f
   0x0000000000400f16 <+163>:	mov    rdi,rcx
   0x0000000000400f19 <+166>:	call   0x400ca0 <std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)@plt>
   0x0000000000400f1e <+171>:	lea    rax,[rbp-0x90]
   0x0000000000400f25 <+178>:	mov    r12,rax
   0x0000000000400f28 <+181>:	mov    r13d,0x3
   0x0000000000400f2e <+187>:	lea    rax,[rbp-0xb1]
   0x0000000000400f35 <+194>:	mov    rdi,rax
   0x0000000000400f38 <+197>:	call   0x401174 <std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::allocator()>
   0x0000000000400f3d <+202>:	lea    rdx,[rbp-0xb1]


pwndbg> x/1s 0x401945
0x401945:	"bad"
pwndbg> x/1s 0x401949
0x401949:	"bytes"
pwndbg> x/1s 0x40194f
0x40194f:	"yee"

It looks like the 3 strings that I put into the vector container are initially in .rodata based on objdump.

user@ubuntu:~/cpp/part_1/chapter_4$ objdump -h vector_usage
...
15 .rodata       00000013  0000000000401940  0000000000401940  00001940  2**2
                 CONTENTS, ALLOC, LOAD, READONLY, DATA

All 3 strings are copied from read-only data onto main’s stack frame.

pwndbg> x/1s 0x7fffffffdc00
0x7fffffffdc00:	"bad"
pwndbg> x/1s 0x7fffffffdc20
0x7fffffffdc20:	"bytes"
pwndbg> x/1s 0x7fffffffdc40
0x7fffffffdc40:	"yee"

pwndbg> info proc mapping
...
            0x604000           0x636000    0x32000        0x0 [heap]
...
      0x7ffffffde000     0x7ffffffff000    0x21000        0x0 [stack]

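I expected these strings to be placed on the heap by the container, but in this example they are on the stack. A likely explanation: the disassembly above builds three std::string objects 0x20 bytes apart starting at rbp-0x90 and then loads that address and the count 3 into r12 and r13d, which looks like the temporary array backing the initializer list rather than the vector's own storage. Strings this short also fit inside the string object itself (the small-string optimization), so no heap allocation is needed for their characters. Next, the values in the container are looped over and printed, then the container is destructed. The dump below shows string objects laid out back to back at a fixed stride, which is consistent with O(1) access by offset, but this use case is very simple and has tiny strings.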

pwndbg> x/40s 0x7fffffffdc00
0x7fffffffdc00:	"bad"
0x7fffffffdc04:	""
0x7fffffffdc05:	""
0x7fffffffdc06:	""
0x7fffffffdc07:	""
0x7fffffffdc08:	"0\334\377\377\377\177"
0x7fffffffdc0f:	""
0x7fffffffdc10:	" \334\377\377\377\177"
0x7fffffffdc17:	""
0x7fffffffdc18:	"\005"
0x7fffffffdc1a:	""
0x7fffffffdc1b:	""
0x7fffffffdc1c:	""
0x7fffffffdc1d:	""
0x7fffffffdc1e:	""
0x7fffffffdc1f:	""
0x7fffffffdc20:	"bytes"
0x7fffffffdc26:	""
0x7fffffffdc27:	""
0x7fffffffdc28:	"\377\377"
0x7fffffffdc2b:	""
0x7fffffffdc2c:	"\001"
0x7fffffffdc2e:	""
0x7fffffffdc2f:	""
0x7fffffffdc30:	"@\334\377\377\377\177"
0x7fffffffdc37:	""
0x7fffffffdc38:	"\003"
0x7fffffffdc3a:	""
0x7fffffffdc3b:	""
0x7fffffffdc3c:	""
0x7fffffffdc3d:	""
0x7fffffffdc3e:	""
0x7fffffffdc3f:	""
0x7fffffffdc40:	"yee"
0x7fffffffdc44:	""
0x7fffffffdc45:	""
0x7fffffffdc46:	""
0x7fffffffdc47:	""
0x7fffffffdc48:	"\r\031@"

Let’s see what happens with strings 100,000 chars long:

/* g++ -std=c++11 -ggdb -o vector_usage vector_usage.cpp */

#include <vector>
#include <string>
#include <iostream>


using namespace std;


int main()
{

    string a(100000, 'A');
    string b(100000, 'B');
    string c(100000, 'C');

    vector<string> names = {a, b, c};

    return 0;
}

In this case, heap space is allocated for each string's character data.

pwndbg> 
0x0000000000400c87	12	    string a(100000, 'A');
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
───────────────────────────────────────────────────────────────────────────────[ REGISTERS ]───────────────────────────────────────────────────────────────────────────────
 RAX  0x615c20 ◂— 0x4141414141414141 ('AAAAAAAA')

...

pwndbg> malloc_chunk a
0x615c20 PREV_INUSE {
  prev_size = 4702111234474983745, 
  size = 4702111234474983745, 
  fd = 0x4141414141414141, 
  bk = 0x4141414141414141, 
  fd_nextsize = 0x4141414141414141, 
  bk_nextsize = 0x4141414141414141
}

The same procedure occurs for variables b and c. Now to make sure they are contiguous:

pwndbg> malloc_chunk a
0x615c20 PREV_INUSE {
  prev_size = 4702111234474983745, 
  size = 4702111234474983745, 
  fd = 0x4141414141414141, 
  bk = 0x4141414141414141, 
  fd_nextsize = 0x4141414141414141, 
  bk_nextsize = 0x4141414141414141
}
pwndbg> malloc_chunk b
0x62e2d0 IS_MMAPED {
  prev_size = 4774451407313060418, 
  size = 4774451407313060418, 
  fd = 0x4242424242424242, 
  bk = 0x4242424242424242, 
  fd_nextsize = 0x4242424242424242, 
  bk_nextsize = 0x4242424242424242
}
pwndbg> malloc_chunk c
0x646980 PREV_INUSE IS_MMAPED {
  prev_size = 4846791580151137091, 
  size = 4846791580151137091, 
  fd = 0x4343434343434343, 
  bk = 0x4343434343434343, 
  fd_nextsize = 0x4343434343434343, 
  bk_nextsize = 0x4343434343434343
}

...

user@ubuntu:~/cpp/part_1/chapter_4$ python 
>>> hex(0x615c20 + 100000)
'0x62e2c0'
>>> hex(0x62e2d0 + 100000)
'0x646970'


...

pwndbg> x/100xg 0x62e2c0
0x62e2c0:	0x0000000000000000	0x00000000000186b1
0x62e2d0:	0x4242424242424242	0x4242424242424242
0x62e2e0:	0x4242424242424242	0x4242424242424242
0x62e2f0:	0x4242424242424242	0x4242424242424242
0x62e300:	0x4242424242424242	0x4242424242424242
pwndbg> x/100xg 0x646970
0x646970:	0x0000000000000000	0x00000000000186b1
0x646980:	0x4343434343434343	0x4343434343434343
0x646990:	0x4343434343434343	0x4343434343434343
0x6469a0:	0x4343434343434343	0x4343434343434343
0x6469b0:	0x4343434343434343	0x4343434343434343
0x6469c0:	0x4343434343434343	0x4343434343434343

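And there they are, back to back in memory. Note what this actually shows, though: a, b, and c each allocated a 100000-character buffer, and the allocator happened to place those three buffers one after another. It says nothing about the vector's own element array. A good next experiment may be to create wildly different sized vector members and see how it handles that. Since the heap tries to separate allocations based on size, I wondered how vector keeps its members contiguous. The answer is that a vector<string> stores std::string objects, and every std::string object has the same sizeof no matter how many characters it holds; long character data lives in a separate allocation that the string object points to. So a 100-character string and a 100000-character string occupy equally sized slots in the vector, even though the allocator (glibc malloc here, not the OS) may place their character buffers in different heap bins. I took a look at the vector source code but it's too complex for the time I have. If you want to read it, see stl_vector.h. This was a bit of an aside, stay tuned for part 4.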
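
A quick way to check contiguity of the vector's own elements, without a debugger, is to measure the distance between neighboring elements. This is a sketch with helper names I made up; it relies only on sizeof(std::string) being a compile-time constant, whatever its value is on a given standard library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

/* Distance in bytes between element i and element i + 1 */
std::ptrdiff_t byte_stride(const std::vector<std::string>& v, std::size_t i)
{
    const char* first = reinterpret_cast<const char*>(&v[i]);
    const char* second = reinterpret_cast<const char*>(&v[i + 1]);
    return second - first;
}

/* True when every neighboring pair is exactly sizeof(std::string) apart */
bool elements_are_contiguous(const std::vector<std::string>& v)
{
    const std::ptrdiff_t expected = static_cast<std::ptrdiff_t>(sizeof(std::string));
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
        if (byte_stride(v, i) != expected) {
            return false;
        }
    }
    return true;
}
```

The stride stays the same whether the strings hold 3 characters or 100000, because only the fixed-size string objects live in the vector's array.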

 

 

The C++ Programming Language – Part 2

This post will be mostly for personal reference as I go through Bjarne Stroustrup’s “The C++ Programming Language” 4th edition textbook. Some of the notes will appear random.

Chapter 3 – Abstraction Mechanisms

Three types of classes will be discussed: concrete classes, abstract classes, and classes in class hierarchies. A concrete class behaves like a built-in type. Concrete objects can live on the stack, on the heap, in static memory, or inside other objects, and they can be referred to directly rather than only through pointers or references. They are essentially tangible. They are initialized on creation and are copyable. Member functions defined inside the class body are implicitly inline. Their destructors are called automatically when the object goes out of scope. An abstract class is quite different. The user sees zero implementation details and only knows how to use the interface. Because the user of an abstract interface knows nothing about the representation behind it, not even its size, such objects are typically allocated on the heap and accessed through pointers or references. Virtual functions can be overridden in subclasses, and pure virtual functions must be. Abstract classes provide interfaces for all subclasses to override. This keeps classes consistent by enforcing a model that subclasses must adhere to. In this way, users of the interface don’t need to know about subclass implementation, just the interface. The compiler converts the name of a virtual function into an index into a table of function pointers: the “virtual function table”, or vtable. In typical implementations, each object of a class with virtual functions carries a hidden pointer to its class's vtable, usually at the start of the object. Classes in hierarchies are going to be explored in more detail in a different post. See SOLID design principles for more information.

A function object, aka functor, is an object that can be called like a function. Yes, C++ can have callable objects. You need to define the () operator to make an object callable.

/* A function object, aka functor, is an object that can be called like a function */

template<typename T> class Less_than
{
    const T val;

    public:
        /* val is const, so it must be set in the member initializer list */
        Less_than(const T& v) : val(v) {}

        /* Define callable () */
        bool operator()(const T& x) const {
            return x < val;
        }
};
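
Function objects are most useful as arguments to algorithms. Here is a small sketch that passes a Less_than<int> to std::count_if; the class is repeated so the block stands alone, and count_less_than is a name I picked for illustration.

```cpp
#include <algorithm>
#include <vector>

template<typename T> class Less_than
{
    const T val;

    public:
        Less_than(const T& v) : val(v) {}

        bool operator()(const T& x) const {
            return x < val;
        }
};

/* Count the elements of v that are less than limit */
int count_less_than(const std::vector<int>& v, int limit)
{
    Less_than<int> lt{limit};
    /* count_if calls lt(element) for every element */
    return static_cast<int>(std::count_if(v.begin(), v.end(), lt));
}
```

The object lt carries the value to compare against, which is the main thing a functor can do that a plain function cannot.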

It is also a great idea to define a copy constructor and a copy assignment operator for a class that owns a resource. This way, you explicitly define what happens during these operations. Otherwise, the compiler-generated versions copy the pointer rather than the data it points to, and your code may not do what you assume it does.

class Vector {

    private:

        double *elem;
        int sz;     /* named sz so it does not clash with size() below */

    public:

        Vector(int s);
        ~Vector() {
            delete[] elem;
        }

        /* copy constructor */
        Vector(const Vector& a)
        {
            /* create a new area of memory for the new object's elem array */
            elem = new double[a.sz];
            sz = a.sz;

            /* copy elements */
            for (int i = 0; i != sz; i++) {
                elem[i] = a.elem[i];
            }
        }

        /* copy assignment */
        Vector& operator=(const Vector& a);

        double& operator[](int i);
        const double& operator[](int i) const;

        int size() const;

};
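
The class above only declares copy assignment. One way to write it is sketched below, in a cut-down Vector so the block compiles on its own. Allocating the new array before releasing the old one is my choice here, so that a failed allocation leaves the object untouched.

```cpp
class Vector {

    private:

        double *elem;
        int sz;

    public:

        explicit Vector(int s) : elem{new double[s]()}, sz{s} {}
        ~Vector() {
            delete[] elem;
        }

        /* copy constructor: deep copy of the element array */
        Vector(const Vector& a) : elem{new double[a.sz]}, sz{a.sz}
        {
            for (int i = 0; i != sz; i++) {
                elem[i] = a.elem[i];
            }
        }

        /* copy assignment: build the copy first, then release the old array */
        Vector& operator=(const Vector& a)
        {
            if (this != &a) {
                double *p = new double[a.sz];
                for (int i = 0; i != a.sz; i++) {
                    p[i] = a.elem[i];
                }
                delete[] elem;
                elem = p;
                sz = a.sz;
            }
            return *this;
        }

        double& operator[](int i) { return elem[i]; }
        const double& operator[](int i) const { return elem[i]; }

        int size() const { return sz; }

};
```

After b = a, the two objects own separate arrays, so changing a[0] does not change b[0].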

Abstract class example with pure virtual functions. Utilizing “= 0;” makes a function pure virtual. If a function in a class is pure virtual, the class is now defined as abstract. Pure virtual functions must be implemented by subclasses.

/* Example of a virtual / abstract class with pure virtual functions */

class Container
{
    public:
        virtual double& operator[](int) = 0;
        virtual int size() const = 0;
        virtual ~Container() {}
};
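
To see the interface in use, here is a sketch of one subclass and one function that only knows about Container. Backing Vector_container with std::vector is an assumption made to keep the example short.

```cpp
#include <vector>

class Container
{
    public:
        virtual double& operator[](int) = 0;
        virtual int size() const = 0;
        virtual ~Container() {}
};

/* One possible implementation, backed by std::vector (an assumption) */
class Vector_container : public Container
{
    std::vector<double> v;

    public:
        explicit Vector_container(int s) : v(s) {}

        double& operator[](int i) override { return v[i]; }
        int size() const override { return static_cast<int>(v.size()); }
};

/* Works with any Container; the calls are dispatched virtually */
double sum(Container& c)
{
    double total = 0;
    for (int i = 0; i != c.size(); ++i) {
        total += c[i];
    }
    return total;
}
```

sum() never mentions Vector_container, so a different subclass can be swapped in without touching it.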

Finally, a typical user-defined class that behaves like a built-in type. This is a concrete class.

/* A classical user-defined type example */

class Complex
{
    double re;
    double im;

    public:

        /* Constructors */
        Complex(double r, double i) {
            re = r;
            im = i;
        }

        Complex(double r) {
            re = r;
            im = 0;
        }

        Complex() {
            re = 0;
            im = 0;
        }

        /* Accessors and mutators
         * const indicates that we will not modify the object for which they are called on
         */
        double real() const {
            return re;
        }

        void real(double d) {
            re = d;
        }

        double imag() const {
            return im;
        }

        void imag(double d) {
            im = d;
        }

        
        /* Operator definitions */
        Complex& operator+=(Complex z) {
            re += z.re;
            im += z.im;
            return *this;
        }

        Complex& operator-=(Complex z) {
            re -= z.re;
            im -= z.im;
            return *this;
        }


        /* Destructor */
        ~Complex() {
            // cleanup if stuff is on the heap
        }
};

Classes can also be parameterized; that is, the element type is supplied as a template argument and fixed at compile time. In this example, a Vector can be a container for any type, represented by T in the implementation. Templates are instantiated at compile time, so they add no runtime overhead compared with writing a separate class for each type by hand.

/* Allow for parameterization of class */

template<typename T> class Vector {

    private:
        T* elem;
        int size;

    public:
        Vector(int s);
        T& operator[](int i);
        const T& operator[](int i) const;
};
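
A filled-in sketch of the template, with one generic function on top of it. The member bodies and the sum() helper are my own additions for illustration; copying is disabled to keep the example short.

```cpp
#include <string>

template<typename T> class Vector {

    private:
        T* elem;
        int sz;

    public:
        /* new T[s]() value-initializes: zeros for numbers, empty strings for std::string */
        explicit Vector(int s) : elem{new T[s]()}, sz{s} {}
        ~Vector() { delete[] elem; }

        Vector(const Vector&) = delete;
        Vector& operator=(const Vector&) = delete;

        T& operator[](int i) { return elem[i]; }
        const T& operator[](int i) const { return elem[i]; }
        int size() const { return sz; }
};

/* Add up all elements with +=; works for any T that supports it */
template<typename T> T sum(const Vector<T>& v)
{
    T total{};
    for (int i = 0; i != v.size(); ++i) {
        total += v[i];
    }
    return total;
}
```

The compiler generates a separate Vector<int> and Vector<std::string> from the same source, and sum() adds numbers in one case and concatenates strings in the other.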

 

The C++ Programming Language – Part 1

This post will be mostly for personal reference as I go through Bjarne Stroustrup’s “The C++ Programming Language” 4th edition textbook. Some of the notes will appear random.

Introduction

The purpose of a programming language is to express ideas in code. It provides a vehicle for development and execution, and provides a set of concepts for problem solving. C++ specifically provides direct mappings of built-in operations and types to hardware for efficiency and speed. It also provides abstraction mechanisms for user-defined type creation with the same support and performance as built-in types. The overall target is efficiency and elegance. C++ was originally the convergence of Simula and C.

C++ most directly supports 3 programming styles. First, procedural programming: processing and design of suitable data structures for the problem at hand. Second, object oriented: class hierarchies, runtime polymorphism. Lastly, generic programming: algorithms that accept N types, template programming. Most of these paradigms are built around data abstraction; hiding implementation details behind exposed interfaces.

The concepts of lvalues and rvalues are crucial in C++. For example, take int x = 5. Here x is an lvalue: it names an object that occupies an identifiable location in memory. The literal 5 is an rvalue: an expression that does not represent an object occupying some identifiable location in memory. The difference between C and C++ is primarily the degree of emphasis on types and structure. C++ is a language you can grow with.
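
One way to watch the lvalue/rvalue distinction in action is overload resolution: an int& parameter binds only to lvalues, and an int&& parameter binds only to rvalues. The function name category is made up for this sketch.

```cpp
#include <string>

/* Chosen when the argument is an lvalue (a named object) */
std::string category(int&)
{
    return "lvalue";
}

/* Chosen when the argument is an rvalue (a literal or temporary) */
std::string category(int&&)
{
    return "rvalue";
}
```

With int x = 5, category(x) picks the first overload, while category(5) and category(x + 1) pick the second.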

Lastly, how can you write good code in C++ (or any language)? Know what you want to express and practice (imitate good code).

Chapter 2

C++ is a statically typed language: the compiler must know the type of every entity at its point of use, and that type determines the set of operations applicable to it. The ISO C++ standard defines two things: core language features and standard library components. What is a declaration? A statement that introduces a name into the program and specifies a type for the named entity. A type defines the set of possible values and a set of operations for the entity. An object is some memory that holds a value of a type. A value is a set of bits interpreted according to a type. A variable is a named object. C++ performs all meaningful conversions between basic types, but here be dragons for those unaware. Use {} during initialization to avoid conversion issues. Use . “dot” to access struct members through a name or reference, and -> “arrow” to access struct members through a pointer. For example:

void f(Vector v, Vector &rv, Vector *pv)
{
    int i1 = v.sz;
    int i2 = rv.sz;
    int i3 = pv->sz;
}

A feature in C++11, enum classes make enumerations both “strongly typed and strongly scoped.” For example, an enum class called TrafficLight containing the 3 states of a light, green, yellow, and red:

// g++ -std=c++11 -o enum_classes enum_classes.cpp

#include <iostream>
using namespace std;

enum class TrafficLight { green, 
                          yellow, 
                          red };

TrafficLight& operator++(TrafficLight& t)
{
    switch(t) {
        case TrafficLight::green:
            return t=TrafficLight::yellow;

        case TrafficLight::yellow:
            return t=TrafficLight::red;

        case TrafficLight::red:
            return t=TrafficLight::green;

        }
}

/* define the << operator for cout and TrafficLight! */
std::ostream& operator<<(std::ostream& out, TrafficLight t)
{
    switch(t)
    {
        case TrafficLight::green:
            out << "green";
            break;
        case TrafficLight::yellow:
            out << "yellow";
            break;
        case TrafficLight::red:
            out << "red";
            break;
        default:
            out << "wat";
            break;
    }

    return out;
}


int main()
{
    int light_cycle = 0;
    TrafficLight light = TrafficLight::red;

    while(light_cycle < 3) {
        cout << "Traffic light is " << light << "\n";
        ++light;
        light_cycle++;
    }
   
    return 0;
}

Output:

user@ubuntu:~/cpp/part_1/chapter_2$ ./enum_classes
Traffic light is red
Traffic light is green
Traffic light is yellow

Using the auto keyword lets the compiler deduce a variable's type from its initializer. This is helpful if return types may change, because code written with auto does not need to be updated. The obvious trade-off to this convenience is loss of developer insight; the reader won’t know what type is being returned without looking at more code. Said a more formal way: “For variables, specifies that the type of the variable that is being declared will be automatically deduced from its initializer. For functions, specifies that the return type is a trailing return type or will be deduced from its return statements (since C++14). For non-type template parameters, specifies that the type will be deduced from the argument (since C++17).” An example:

#include <cmath>
#include <iostream>
using namespace std;


int main()
{
    /* types will be automatically inferred. 
     * this can help to avoid long type names, cluttering code. */
    auto b = true;      /* bool */
    auto ch = 'x';      /* char */
    auto i = 123;       /* int */
    auto d = 1.2;       /* double */
    auto z = sqrt(d);   /* z has whatever type sqrt(d) returns: double */

}
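
If you want the compiler to confirm what auto deduced, static_assert with std::is_same does the job. This is a sketch; auto_demo is a hypothetical function name and the values are arbitrary.

```cpp
#include <cmath>
#include <type_traits>

double auto_demo()
{
    auto b = true;
    auto ch = 'x';
    auto i = 123;
    auto d = 1.44;
    auto z = std::sqrt(d);

    /* each check fails at compile time if the deduced type differs */
    static_assert(std::is_same<decltype(b), bool>::value, "b should be bool");
    static_assert(std::is_same<decltype(ch), char>::value, "ch should be char");
    static_assert(std::is_same<decltype(i), int>::value, "i should be int");
    static_assert(std::is_same<decltype(d), double>::value, "d should be double");
    static_assert(std::is_same<decltype(z), double>::value, "z should be double");

    /* silence unused-variable warnings */
    (void)b;
    (void)ch;
    (void)i;

    return z;
}
```

Nothing is checked at runtime here; the static_asserts either pass during compilation or stop the build.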

Const and constexprs confounded me at first. This example shows their use and I am 95% confident in the comments:

#include <iostream>
using namespace std;


double square(double x)
{
    return x*x;
}

int main()
{
    /* const: i promise not to change this value */
    const int dmv = 17; // named constant; the compiler may place it in RO memory
    int var = 17; // not a constant, goes on the stack

    /* constexpr: to be evaluated at compile time, which lets the compiler place data in read-only memory.  */
    constexpr double max1 = 1.4*square(dmv); // ok if square(17) is a constant expression
    constexpr double max2 = 1.4*square(var); // var is not a constant expression, error.
    const double max3 = 1.4*square(var); // evaluated at runtime
}

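The output of a compilation describes why the use of constexpr is incorrect in this instance: square() is not declared constexpr, so a call to it cannot appear in a constant expression, and both max1 and max2 are rejected. Note that constexpr was introduced in C++11, so the -std=c++11 flag (or later) is required.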

user@ubuntu:~/cpp/part_1/chapter_2$ g++ -std=c++11 -o immutability immutability.cpp 
immutability.cpp: In function ‘int main()’:
immutability.cpp:17:39: error: call to non-constexpr function ‘double square(double)’
     constexpr double max1 = 1.4*square(dmv); // ok if square(17) is a constant expression
                                       ^
immutability.cpp:18:39: error: call to non-constexpr function ‘double square(double)’
     constexpr double max2 = 1.4*square(var); // var is not a constant expression, error.
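
For comparison, here is a sketch of how the max1 line can be made to compile: declare square() itself constexpr. The max2 line would still be an error because var is not a constant, so it is left out; max3 is turned into a small function here so the runtime case can be called.

```cpp
/* constexpr function: usable in constant expressions when its argument is constant */
constexpr double square(double x)
{
    return x * x;
}

const int dmv = 17;                          // const int with a constant initializer
constexpr double max1 = 1.4 * square(dmv);   // now ok: evaluated at compile time

/* proves max1 is a compile-time value */
static_assert(max1 > 404.0, "max1 should be computed at compile time");

/* the same constexpr function also works on runtime values */
double max3(int var)
{
    return 1.4 * square(var);
}
```

A constexpr function is not restricted to compile time; with a non-constant argument it behaves like an ordinary function.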

Initialization in C++ can be done in many ways. Here are a few interesting examples. I note DRAGONS because type conversion issues can silently occur if you aren’t very careful.

#include <complex>
#include <iostream>
#include <vector>
using namespace std;


int main()
{
    double d_1 = 2.3;
    double d_2 {2.3};

    complex<double> z = 1;
    complex<double> z_2 {d_1, d_2};
    /* = is optional with {} initialization */
    complex<double> z_3 = {1, 2};

    vector<int> v {1, 2, 3, 4, 5, 6};

    /* ~DRAGONS~ */
    int i_1 = 7.2;   /* i_1 is 7 */
    int i_2 {7.2};   /* error: narrowing conversion */
    int i_3 = {7.2}; /* error: narrowing conversion */

}

An invariant is something that is assumed to be true about a class. For example, “elem points to an array of ‘size’ doubles.” Constructors must establish invariants. This is very important because it makes us understand exactly what we want and it forces us to be specific.

Vector::Vector(int s)
{
    if (s < 0) {
        throw length_error{"Vector constructor size"};
    }
    elem = new double[s];
    sz = s;
}


/* Using the constructor: */
void test()
{
    try {
        Vector v(-27);
    }
    catch (const std::length_error&) {
        // ...
    }
    catch (const std::bad_alloc&) {
        // couldn't get heap space
    }
}
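
Put together as something that compiles on its own, the constructor check and its handler might look like the sketch below. The helper rejects_size is hypothetical; it just reports whether the constructor refused the size.

```cpp
#include <stdexcept>

class Vector {

    private:
        double *elem;
        int sz;

    public:
        /* establishes the invariant: elem points to an array of sz doubles */
        explicit Vector(int s)
        {
            if (s < 0) {
                throw std::length_error{"Vector constructor size"};
            }
            elem = new double[s];
            sz = s;
        }

        ~Vector() { delete[] elem; }

        Vector(const Vector&) = delete;
        Vector& operator=(const Vector&) = delete;

        int size() const { return sz; }
};

/* Hypothetical helper: true if the constructor rejected size s */
bool rejects_size(int s)
{
    try {
        Vector v(s);
        return false;
    }
    catch (const std::length_error&) {
        return true;
    }
}
```

Because the exception is thrown before new runs, there is nothing to clean up when construction fails.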

A namespace is a mechanism for expressing that some declarations belong together. Names shouldn’t clash with other names. In this example, a class is created named complex. We need to separate namespaces because complex already exists in the standard namespace.

namespace My_code
{
    class complex { /*...*/ };
    complex sqrt(complex);
    // ...
    int main();
}


int My_code::main()
{
    complex z {1, 2};
    auto z1 = sqrt(z);
    std::cout << "{" << z1.real() << "," << z1.imag() << "}\n";
    // ...
}


int main()
{
    return My_code::main();
}

C++ structs are straightforward. They provide an easy-to-use mechanism for grouping data, and they can have methods as well. All data and methods in a C++ struct default to public. This contrasts with C++ classes, in which all data and methods default to private. That default is key to the practice of object-oriented design.

struct Vector {
    int sz;
    double *elem;
};

Vector v;

void vector_init(Vector &v, int s)
{
    v.elem = new double[s];
    v.sz = s;
}

Exceptions are tied closely to the type system: the type of the thrown object determines which handler catches it. Here is an example of error handling for out-of-bounds vector access. This at least makes the developer aware that strange/bad things are happening.

// g++ -O0 -ggdb -std=c++11 -o Vector_error_handling Vector_error_handling.cpp
/* Vector implementation with newly added error handling 
 * C++ provides error handling mainly through the Type system */

#include "Vector.h"
#include <iostream>
#include <stdexcept>

using namespace std;

Vector::Vector(int s)
{
    elem = new double[s];
    sz = s;
}


/* For example, lets ensure no out of bounds access
 * or at least make the user aware that it is happening */
double& Vector::operator[](int i)
{
    if (i < 0 || size() <= i) {
        /* throw an exception and hope the library code user
         * has implemented an exception handler
         * note: out_of_range type is defined in <stdexcept> */
        throw out_of_range{"Vector::operator[]"};
    }
    return elem[i];
}


int Vector::size() const
{
    return sz;
}

int main()
{
    Vector v(1000);
    double a = v[1]; // ok
    double b = v[2000]; //oob
    return 0;
}

Output:

user@ubuntu:~/cpp/part_1/chapter_2$ ./Vector_error_handling 
terminate called after throwing an instance of 'std::out_of_range'
  what():  Vector::operator[]
Aborted
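
The program above aborts because nothing catches the exception. A sketch of the same access wrapped in a handler follows; the cut-down Vector and the try_access helper are written just for this example.

```cpp
#include <stdexcept>
#include <string>

class Vector {

    private:
        double *elem;
        int sz;

    public:
        explicit Vector(int s) : elem{new double[s]()}, sz{s} {}
        ~Vector() { delete[] elem; }

        Vector(const Vector&) = delete;
        Vector& operator=(const Vector&) = delete;

        int size() const { return sz; }

        /* range-checked access, as in the listing above */
        double& operator[](int i)
        {
            if (i < 0 || size() <= i) {
                throw std::out_of_range{"Vector::operator[]"};
            }
            return elem[i];
        }
};

/* Hypothetical helper: returns "ok" or the exception's message */
std::string try_access(Vector& v, int i)
{
    try {
        v[i] = 1.0;
        return "ok";
    }
    catch (const std::out_of_range& e) {
        return e.what();
    }
}
```

Catching by const reference avoids copying the exception object and gives access to what().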

 

C++ Binary Internals

The idea for this post started out as an exploration of compiler optimizations in GCC. It turned into a static and dynamic C++ class exploration exercise. The piece of code we’ll look at:

// file: bin1.hpp
#include 
using namespace std;

class FirstClass
{
    public:
        FirstClass(int new_id) { id = new_id; };
        int get_id() const { return id; };
    private:
        int id;
};
