Type erasures

Templates are powerful tools because they allow the compiler to perform all kinds of optimizations. In addition, they help to avoid virtual functions in classes and thus increase performance by avoiding call indirection through the virtual function table. However, there are two major obstacles with templates:

  • template expansion virtually always leads to code generation, and this can produce large binaries, which might be a problem on small hardware architectures
  • template libraries and the applications using them are harder to maintain.

The last point may require a bit of explanation. The reason why system administrators are not very happy with programs based on template libraries is that such libraries are distributed as source code. Consequently, whenever a bug is fixed in the library, all programs depending on the code require recompilation. For programs using binary libraries only the library has to be updated. This is obviously much easier than recompiling all the programs depending on a library.

A reasonable solution for this problem is the use of type erasures. libpnicore provides three different type erasures

class      description
value      stores a single scalar value of a POD type
value_ref  stores the reference to an instance of a POD type
array      stores a multidimensional array type

To use type erasures, include the header pni/core/type_erasures.hpp at the top of your source file.

The value type erasure

Construction

The value type erasure stores the value of a single primitive type. Whenever an instance of value is constructed memory is allocated large enough to store the value of a particular type.

value provides a default constructor. The instance produced by the default constructor holds a value of type none.

value v;
std::cout<<v.type_id()<<std::endl; //output NONE

Though there is not much one can do with such a type, it has the nice advantage that one can default construct an instance of type value. In addition, a copy and a move constructor are provided. All these constructors are implicit.

The more interesting constructors are explicit. An instance of value can be constructed either from a variable of a particular type or from a literal, as shown in the next example

//explicit construction from a variable
int32 n = 1000;
value v1(n);
std::cout<<v1.type_id()<<std::endl; //output INT32

//explicit construction from a literal
value v2(3.4212);
std::cout<<v2.type_id()<<std::endl; //output FLOAT64

//copy construction
value v3 = v1;

As mentioned earlier in this section, whenever an instance of value is constructed, memory is allocated to store the quantity that should be hidden in the type erasure. The default constructor allocates memory for a none type, with which one can do nothing useful. A typical application for type erasures is to store primitive values of different types in a container, where the decision which type to use is made at runtime. For this purpose one could define a vector type like this

using value_vector =  std::vector<value>;

However, how would one initialize an instance of this vector? It would not make too much sense to use the default constructor (as we cannot pass type information). The solution to this problem is the make_value() function which comes in two flavors. The first, as shown in the next code snippet, takes a type ID as a single argument and returns an instance of value of the requested type.

std::vector<type_id_t>  ids = get_ids();
value_vector values;

for(auto id: ids)
    values.push_back(make_value(id));

In addition there is a function template which serves the same purpose

value v = make_value<uint32>();

Here the type is determined by the template parameter of the function template.

Assignment

Copy and move assignment between two instances of value are provided. In both cases the instance on the left-hand side of the operator takes on the type of the instance on the right-hand side. Move and copy assignment have the expected semantics.

The more interesting situation arises when assigning new values to an instance of value. As memory is only allocated during creation (or copy assignment), assigning a new value does not create a new instance of value but rather tries to perform a type conversion from the value on the RHS of the operator to the type held by the instance of value on the LHS.

value v = make_value<float32>(); //creates a value for a float32 value

v = uint16(5); //converts uint16 value to a float32 value

The type conversion follows the same rules as described in the section about type conversion earlier in this manual (in fact it uses this functionality). Consequently

value v = make_value<float64>();

v = complex32(3,4); //throws type_error

will throw a type_error exception as a complex number cannot be converted to a single float value.

Retrieving data

Retrieving data from an instance of value is done via the as() template method like this

value v = ....;

auto data = v.as<uint8>();

The template parameter of as() determines the data type as which the data should be retrieved. As with value assignment, the method performs a type conversion if necessary. It throws a type_error exception if the conversion is not possible, or a range_error exception if the numeric range of the requested type is too small.

Information about the type of the data stored in the value instance can be obtained by means of the type_id() method.

value v = ...;
v.type_id();

The value_ref type erasure

The value type encapsulates data of an arbitrary type and has full ownership of the data. Sometimes it is more appropriate to store only a reference to an already existing data item of a primitive type. If the reference should be copyable, the default approach to this problem would be to use std::reference_wrapper. Unfortunately, this template carries the full type information, which is exactly what we want to get rid of when using a type erasure. For this purpose libpnicore provides the value_ref erasure. It stores a reference to an existing data item and hides all the type information. Though value_ref behaves quite similarly to value, there are some subtle differences originating from its nature as a reference type. Thus it is highly recommended to read this section carefully if you are planning to use value_ref.

Construction

Like value, value_ref is default constructible

value_ref vref;

allowing it to be used in STL containers. However, unlike value, the default constructed reference points to nowhere. Every access to any of value_ref's methods will throw memory_not_allocated_error for a default constructed instance of value_ref. The preferred way to initialize value_ref is to pass an instance of std::reference_wrapper to it

float64 data;
value_ref data_ref(std::ref(data));

In addition value_ref is copy constructible.

Assignment

The most difficult operation with value_ref is assignment. What happens depends on the right-hand side of the assignment operator. One can do copy assignment

float32 temperature;
uint32  counter;
value_ref v1(std::ref(temperature)); //reference to temperature
value_ref v2(std::ref(counter));     //reference to counter

v1 = v2; //now v1 is a reference to counter too

which has the same semantics as the copy assignment for std::reference_wrapper where the reference is copied.

Another possibility is to assign the value of a primitive type to an instance of value_ref. In this case two things take place:

  • the value is converted to the type of the data item the instance of value_ref references
  • the converted value is assigned to the referenced data item

Consider this example

float32 temperature;
value_ref temp_ref(std::ref(temperature));

temp_ref = uint16(12);

In this example the value 12 of type uint16 is first converted to a float32 value. This new float value is then assigned to the variable temperature. As always with type conversions exceptions will be thrown if the conversion fails.

One can also change the variable an instance of value_ref references with

value_ref ref = ....;    //reference to some data item
complex64 refractive_index = ...;

//now reference points to refractive_index
ref = std::ref(refractive_index);

Finally a value from a value instance can be assigned with

value v = int32(100);
value_ref ref = ....;

ref = v;

in which case type conversion from the internal type of v to the internal type of ref occurs. Exceptions are thrown if the type conversion fails.

Retrieving data

Data retrieval for value_ref works exactly the same way as for value. The type provides a template method as() which can be used to get a copy of the data stored in the item referenced as an instance of a type determined by the template parameter.

value_ref ref = ....;

auto data = ref.as<uint32>();

Again, type conversion takes place from the original type of the referenced data item to the type requested by the user via the template parameter. Finally, like value, value_ref provides a type_id() member function which returns the type ID of the referenced data item.

Type erasures for arrays

As libpnicore provides a virtually unlimited number of array types via its mdarray template, the array type erasure is probably one of the most important ones. Like the value type erasure, it takes over full ownership of the array stored in it.

A good introduction into the array type erasure is this particular version of the array inquiry example from the previous chapter on arrays.

Listing 11 see examples/type_erasure3.cpp for full code
#include <iostream>
#include <vector>
#include <pni/core/types.hpp>
#include <pni/core/arrays.hpp>
#include <pni/core/type_erasures.hpp>

using namespace pni::core;

//some useful type definitions
typedef dynamic_array<float64> darray_type;
typedef static_array<float64,3,3> sarray_type;
typedef fixed_dim_array<float64,2> farray_type;

void show_info(const array &a)
{
    std::cout<<"Data type: "<<type_id(a)<<std::endl;
    std::cout<<"Rank     : "<<a.rank()<<std::endl;
    std::cout<<"Shape    : (";
    auto s = a.shape<shape_t>();
    for(auto n: s) std::cout<<" "<<n<<" ";
    std::cout<<")"<<std::endl;
    std::cout<<"Size     : "<<a.size()<<std::endl;
}

int main(int ,char **)
{
    auto a1 = darray_type::create(shape_t{1024,2048});
    auto a2 = farray_type::create(shape_t{1024,2048});
    sarray_type a3;

    std::cout<<"--------------------------------"<<std::endl;
    show_info(array(a1));
    std::cout<<std::endl<<"--------------------------------"<<std::endl;
    show_info(array(a2));
    std::cout<<std::endl<<"--------------------------------"<<std::endl;
    show_info(array(a3));

    return 0;
}

In the previous version, where show_info() was a function template, a new version of show_info() would have been created for each of the three array types used in this example. By using the type erasure only a single version of show_info() is required, which reduces the total code size of the binary.

The current implementation of array is rather limited in comparison to the mdarray template. Multidimensional access is not provided and only forward iteration is implemented. In addition, there is no array_ref type erasure which only keeps a reference to an instance of mdarray.

The iterators themselves have a subtle peculiarity: they do not provide an operator->(). The reason is rather simple. While all other iterators return a pointer to a particular data element in a container, the array iterators cannot do this (they do not hold any type information). Instead they return an instance of value for constant iterators or value_ref for read/write iterators. In order to keep the semantics of operator->() we would have to return a pointer to a value or value_ref from it. This is not possible, as these objects are just temporaries and would be destroyed once the operator function has returned. However, this is only a small inconvenience as it has no influence on the STL compliance of the iterator. One can still use the for-each construction

array a(...);

for(auto x: a)
    std::cout<<x<<std::endl;

and all STL algorithms with an array type erasure.

An example: reading tabular ASCII data

In this final section a typical use-case for a type erasure will be discussed. One problem that regularly pops up is to read tabular ASCII data. For this example a very simple file format has been used. The file record.dat has the following content

11  -123.23  (-1.,0.23)
13  -12.343  (12.23,-0.2)
16  134.12   (1.23,-12.23)

While the elements of the first two columns are integer and float respectively, the third column holds complex numbers. The task is simple: read the values from the file without losing information. This means that we do not want to truncate values (for instance float to integer) or do inappropriate type conversions (for instance convert everything to the complex type) which may add rounding errors.

There are several ways to approach this problem. The most straightforward one would be to create a struct with an integer, a float, and a complex element. However, this approach is rather static. If a column is added or removed, or only the order of the columns is changed, we have to alter the code.

In this example a different path has been taken. Each individual line is represented by a record type which consists of a vector whose elements are instances of the value type erasure.

Listing 12 see examples/type_erasure_record.cpp for full code
#include <vector>
#include <iostream>
#include <fstream>
#include <pni/core/types.hpp>
#include <pni/core/type_erasures.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>

using namespace pni::core;
using namespace boost::spirit;

typedef int32                    int_type;
typedef float64                  float_type;
typedef complex64                complex_type;
typedef std::vector<value>       record_type;
typedef std::vector<record_type> table_type;

The entire table is again a vector with record_type as element type. In addition we have defined a special type to store complex numbers (complex_type).

Defining the parsers

One of the key elements for this example is to use the boost::spirit parser framework. We define three parsers

  • one for the complex_type
  • one for a value which can parse integer, double, and complex numbers
  • and one for the entire record.

The boost::spirit framework is indeed rather complex and requires a deep understanding of some of the additional boost libraries like fusion and phoenix. However, as we will see, it is worth becoming familiar with them.

In this next snippet the definition of the complex number parser is shown.

Listing 13 see examples/type_erasure_record.cpp for full code
template<typename ITERT>
struct complex_parser : public qi::grammar<ITERT,complex_type()>
{
    qi::rule<ITERT,complex_type()> complex_rule;

    complex_parser() : complex_parser::base_type(complex_rule)
    {
        using namespace boost::fusion;
        using namespace boost::phoenix;
        using qi::_1;
        using qi::_2;
        using qi::double_;
        
        complex_rule = ('('>>double_>>','>>double_>>')')
                        [_val = construct<complex_type>(_1,_2)];
    }
};

We assume complex numbers to be stored as tuples of the form (real part,imaginary part). As we can see in the above example, the complex type is assembled from the two double values matched in the rule. The next parser required is the value parser. This parser matches either an integer, a double, or a complex value. It is a good example of how to reuse already existing parsers in boost::spirit.

Listing 14 see examples/type_erasure_record.cpp for full code
template<typename ITERT>
struct value_parser : public qi::grammar<ITERT,pni::core::value()>
{
    qi::rule<ITERT,pni::core::value()> value_rule;

    complex_parser<ITERT> complex_;

    value_parser() : value_parser::base_type(value_rule)
    {
        using namespace boost::fusion;
        using namespace boost::phoenix;
        using qi::_1;
        using qi::char_;
        using qi::int_;
        using qi::double_;
        using qi::_val;

        value_rule = (
                     (int_ >> !(char_('.')|char_('e')))[_val =
                     construct<pni::core::value>(_1)]
                     || 
                     double_[_val = construct<pni::core::value>(_1)]
                     ||
                     complex_[_val = construct<pni::core::value>(_1)]
                     );
    }
};

Finally we need a parser for the entire record. This is rather simple as boost::spirit provides a special syntax for parsers that store their results in containers.

Listing 15 see examples/type_erasure_record.cpp for full code
template<typename ITERT>
struct record_parser : public qi::grammar<ITERT,record_type()>
{
    qi::rule<ITERT,record_type()> record_rule;

    value_parser<ITERT> value_;

    record_parser() : record_parser::base_type(record_rule)
    {
        using qi::blank;

        record_rule = value_ % (*blank);
    }
};

The main program

The main program is rather simple

Listing 16 see examples/type_erasure_record.cpp for full code
int main(int ,char **)
{
    std::cout<<"File: record.dat"<<std::endl;
    file_to_stream(std::cout,"record.dat");
    std::cout<<std::endl<<"File: record2.dat"<<std::endl;
    file_to_stream(std::cout,"record2.dat");

    return 0;
}

Not all of the code will be explained, only those parts which are of interest for the value type erasure. The program can be divided into two parts:

  • reading the data (in line 132)
  • and writing it back to standard output (in line 148)

As the latter one is rather trivial we will only consider the reading part in this document. The output of the main function is

INT32
FLOAT64
COMPLEX32
11      -123.23 (-1,0.23)
13      -12.343 (12.23,-0.2)
16      134.12  (1.23,-12.23)

The reading sequence

The entry point for the read sequence is the read_table() function.

Listing 17 see examples/type_erasure_record.cpp for full code (lines 132-145)
table_type read_table(std::istream &stream)
{
    table_type table;
    string line;

    while(!stream.eof())
    {
        std::getline(stream,line);
        if(!line.empty())
            table.push_back(parse_record(line));
    }

    return table;
}

The logic of this function is rather straightforward. Individual lines are read from the input stream until EOF and passed on to the parse_record() function, which returns an instance of record_type. Each record is appended to the table.

The parse_record() function is where all the magic happens

Listing 18 see examples/type_erasure_record.cpp for full code
record_type parse_record(const string &line)
{
    typedef string::const_iterator iterator_type;
    typedef record_parser<iterator_type> parser_type;

    parser_type parser;
    record_type record;

    qi::parse(line.begin(),line.end(),parser,record);

    return record;
}

The definition of this function pretty much demonstrates the power of the boost::spirit library. All the nasty parsing work is done by the code provided by boost::spirit. The only thing left to do is provide iterators to the beginning and end of the line.