Untitled Document

Introduction to STL

The STL has six kinds of components:

containers
iterators
generic algorithms
function objects
adaptors
allocators.

All of these are provided via C++ templates. Containers are such things as expandable arrays, called vectors, lists, and sets. Ordinary arrays
of C++ act like containers. The STL also provides Deques, maps (dictionaries), multi-sets, and multi-maps. Lists are necessarily
doubly linked, but very little is specified about the actual implementations. Sets, multi-sets, maps, and multi-maps are always kept
sorted.

Iterators generalize C++ pointers and provide a means of manipulating containers. Unlike other "object-oriented" libraries, in which the
algorithms are provided within the classes of things they manipulate, in the STL the algorithms are nearly all provided externally to the
container classes. Iterators are the interface between the containers and the algorithms that manipulate them. This permits an extremely high
degree of reuse as the same sort algorithm will sort ordinary C++ arrays and STL vectors, among other things.

Function objects are used to tailor some algorithms to special needs. A function object is anything that supports operator(). This actually
includes ordinary functions, but also objects whose classes define this operator. For example, the sort algorithm mentioned above uses
operator< by default when comparing objects. This is, of course, not appropriate when comparing ordinary (char*) strings. One may therefore provide a function object that provides this "less than" comparison in operator(). Such a function object may be passed to sort and will be used in place of operator<. A function object may also be used to reverse the sort order.

Adaptors adapt other components to special purposes. For example, the stack container adaptor will "adapt" a vector, list, or deque so that
its interface is that of a stack. There are also iterator adaptors that make iterators behave differently, by iterating backwards, for example.

Allocators encapsulate a memory model. This is useful in machines (IBM PC's) that have several memory models available. They decouple the algorithms from assumptions about a particular model.

The good news about the STL is that it provides a lot of functionality, with very good performance at low run-time space cost. Generalized
algorithms are only provided when their efficiency is good. In other cases a specialized algorithm is provided as well. An example here is
sorting. The generalized sort algorithm could have been adapted to work on lists, but its efficiency would have been poor. Instead, a special
list sort that is optimized for sorting lists is also provided . Care has also been taken in the design so that the problem of "code bloat" is
avoided. Separation of algorithms from containers means that instantiations of the templates result in relatively little added code
and generally only that which is actually going to be used.

The bad news is that the STL provides few programmer comforts. In particular, programming with the STL does not increase the safety of
programs (other than from the fact that the STL code is well documented and tested). Most of the interfacing is done with iterators. Iterators
have all of the efficiency and all of the drawbacks of pointers, which they generalize. The programmer needs to take all necessary care with
subscript bounds, dangling references, deallocation issues, etc. The vector class supports operator[] and subscripts are NOT checked. As is
usual in C and C++ safety/efficiency tradeoffs are always resolved in favor of efficiency. This is ameliorated by the fact that the situation
is no worse with iterators than it is with pointers, and the same techniques that you have learned for pointers apply to iterators. If
you are thoroughly familiar with the relationship between arrays and pointers in C++, including the pointer duality law, then you should have
no trouble in successfully programming with the STL.

The designers of the STL (Alexaner Stepanov and David Musser, primarily) took great care so that the algorithms work in many situations. For example the algorithms work on containers containing constants as well as on updatable values. There is also a novel mechanism for passing type information between algorithms so that local variables can be allocated within the algorithms as needed. This is at the
implementation level, however, and so doesn't get in the way of learning, unless you try to learn by reading the code itself.

The degree of interaction between the components of the STL is quite amazing, resulting in great power, but a somewhat daunting learning
task. For example, there are five kinds of iterators ( input, output, forward, bi-directional, and random access ) plus variations such as
reverse iterators and iostream iterators. Iterators are not a type, however. There is no iterator class from which other iterators are
derived. An object or other value is an iterator if it possesses an iterator interface. For example operator++ is required for a forward
iterator (along with other requirements). Bi-directional iterators also support operator--. Random access iterators support these and operator+ and operator- as well, permitting "iterator arithmetic." The advantage of this approach is that ordinary pointers are automatically random access iterators, which means that all of the algorithms of the STL work with ordinary arrays as well as the containers specifically provided by the STL itself. The design also means that iterators work with for loops in the same way that pointers do.

Iterators are incorporated into algorithms via the template mechanism. The sort routine, for example is defined by a function template whose
argument is a random access iterator. Each type of container exports an iterator type for use with the algorithms. For example, lists provide
bi-directional iterators, and vectors (expandable arrays) provide random access iterators. Each container also provides standard members for
obtaining an iterator to its beginning and one to its end. By repeatedly applying operator++ to the begin() iterator you will eventually reach
the end() iterator. Specifically, the end iterator refers to a "end location" after the last element in the container, and not to the last
element of the container. These two iterators (begin and end) are defined even for containers that do not have a linear structure, such as
sets.

The implementation of the containers and algorithms of the STL is not specified in the standard, but the efficiency of each algorithm is
specified. This efficiency is somewhat contingent, however, on the user playing by the rules. For example, all function objects are supposed to execute operator() in constant time. If they do, then sort will have order N(log(N)) running time. Some algorithms specify one degree of
efficiency in all cases and a better degree if additional work space can be assured. Many of the running time efficiencies are stated as average
or amortized. For example insertions at the end of a vector (which might result in its growing in size) are amortized constant time, though
an individual such insertion might be linear in the current size of the vector.

A very simple example of the use of the STL follows.

vector<int> vec;
int val;

while ( cin >> val)
vec.push_back(val);
// push_back inserts at the end of the vector.

sort(vec.begin(), vec.end());

for(vector<int>::iterator i = vec.begin(); i != vec.end(); ++i)
cout << vec[i];

If we desire that the sort be in decreasing order we can use the
following instead.

bool greater(int a, int b){ return a > b; }
. . .
sort(vec.begin(), vec.end(), greater );

Notice that the iterator type is abstract here and is exported from the vector<int> class. It acts just like a pointer. The ordinary function
greater will be taken as a function object. The generality of passing functions or objects implementing operator() interchangeably is achieved
by making the type of the third argument of sort a template parameter. This means that it is typed by its functionality, not by its place in a
particular type hierarchy.

By Joseph Bergin