FASTreams


What is FASTreams?
Hello World!
Formatters ready for use
Using lists of formatters
Writers ready for use
Readers ready for use
Adapters
Useful functions and definitions
Iterator support
Lexical casts
Run-time polymorphism
Performance
Architecture and Concepts
Known problems

Download FASTreams:

fastreams-1.0.3.tar.gz (26 kB)
fastreams-1.0.3.zip (27 kB)

What is FASTreams?

FASTreams is an I/O stream-like library for C++, intended to be a really fast replacement for standard IOStreams.
The idea to write FASTreams was driven by the fact that the standard IOStreams library is most often used in a way that does not justify the run-time costs of its internal mechanisms.

Let's say that 90% of the time programmers use only 10% of the IOStreams' functionality, but they always have to pay 100% of its price.

FASTreams are written in a way that promotes very aggressive code inlining in current compilers (tested on g++ 3.4.2) - dynamic binding is avoided and static compile-time techniques are used intensively to achieve the maximum performance.

At the moment, FASTreams should be treated as an experimental library, more a feasibility study than an industrial-strength solution (well, IOStreams is...).
However, it works in the 90% of cases mentioned above, and does so fast; see the performance test results.

Hello World!

#include "fastreams.h"

using namespace fastreams;

int main()
{
    cfile_ofastream out(stdout);

    out << "Hello World!\n";
}

Formatters ready for use

The following formatters are provided (all in namespace fastreams):

Formatter type            For writing  For reading  What type it formats  Requires
char_formatter            Yes          Yes          single char           -
charp_formatter           Yes          Yes          C-style string        Rewindable reader for reading
string_formatter          Yes          Yes          std::string           Rewindable reader for reading
long_formatter            Yes          Yes          long                  Rewindable reader for reading
ulong_formatter           Yes          Yes          unsigned long         Rewindable reader for reading
int_formatter             Yes          Yes          int                   Rewindable reader for reading
uint_formatter            Yes          Yes          unsigned int          Rewindable reader for reading
short_formatter           Yes          Yes          short                 Rewindable reader for reading
ushort_formatter          Yes          Yes          unsigned short        Rewindable reader for reading
bool_formatter            Yes          Yes          bool                  Rewindable reader for reading
double_formatter          Yes          Yes          double                Rewindable reader for reading
float_formatter           Yes          Yes          float                 Rewindable reader for reading
char_printf_formatter     Yes          No           single char           Printf-format storing writer
charp_printf_formatter    Yes          No           C-style string        Printf-format storing writer
string_printf_formatter   Yes          No           std::string           Printf-format storing writer
long_printf_formatter     Yes          No           long                  Printf-format storing writer
ulong_printf_formatter    Yes          No           unsigned long         Printf-format storing writer
int_printf_formatter      Yes          No           int                   Printf-format storing writer
uint_printf_formatter     Yes          No           unsigned int          Printf-format storing writer
short_printf_formatter    Yes          No           short                 Printf-format storing writer
ushort_printf_formatter   Yes          No           unsigned short        Printf-format storing writer
double_printf_formatter   Yes          No           double                Printf-format storing writer
float_printf_formatter    Yes          No           float                 Printf-format storing writer
voidp_printf_formatter    Yes          No           void *                Printf-format storing writer

Note:

  1. All formatters from the TYPE_formatter family use the most natural format, which is what is expected "most of the time".
    When used for reading, they: a) skip whitespace and b) read the token as long as it makes sense. Skipping whitespace is not performed when reading with char_formatter.
  2. The formatters from the TYPE_printf_formatter family delegate to snprintf and use the format stored in the writer. A special writer adapter adds this property to any writer; see the Adapters section.

Using lists of formatters

Formatters are only types and are never instantiated. They are provided to the stream in the form of a typelist as the first template parameter of the stream type.
For example, the formatters type can be used to build a list from the first three formatters in the table above:

typedef formatters<char_formatter, charp_formatter, string_formatter> my_three_formatters;

Now my_three_formatters is a typelist that can be used to create stream objects.
If you want to add one more formatter type (for example, int_formatter) to an already existing list, you can do it the following way:

typedef formatters_list<int_formatter, my_three_formatters> my_four_formatters;

Now my_four_formatters is a list of four formatters, where int_formatter is the first on the list.
The formatters_list type can only be used to prepend a single type to an already existing list, as in the example above.

There are two predefined lists, ready for use:

  1. basic_formatters - this list includes all formatters from the table above, from the TYPE_formatter family.
  2. printf_formatters - this list includes all formatters from the TYPE_printf_formatter family.

Example:

char buf[1000];
ofastream<basic_formatters, bounded_memory_writer> one_stream(buf, 1000);
ofastream<printf_formatters, cfile_writer> other_stream(stdout);

Note: the list of formatters is searched from the beginning and the search stops when the first matching formatter is found. This means that new formatters can effectively "replace" formatters from the existing list, like here:

typedef formatters_list<my_better_int_formatter, basic_formatters> my_formatters;
ofastream<my_formatters, cfile_writer> my_stream(stdout);

Above, the my_stream object will use all basic formatters, but for int formatting the my_better_int_formatter will be used, because it will be found before int_formatter in the original basic_formatters list.
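The "first match wins" rule can be illustrated without the library. The following is only a sketch of one way to search a compile-time typelist; type_list, find_formatter and the three *_fmt types are hypothetical names invented for this example and are not the FASTreams implementation.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace sketch_lookup {

// Hypothetical typelist: a head type plus the rest of the list.
struct nil {};
template <class Head, class Tail = nil>
struct type_list {};

template <class T, class List, bool Matches>
struct find_impl;

// The primary template is left undefined, so running off the end of the
// list (no formatter for T) is a compile-time error.
template <class T, class List>
struct find_formatter;

template <class T, class Head, class Tail>
struct find_formatter<T, type_list<Head, Tail> >
    : find_impl<T, type_list<Head, Tail>,
                std::is_same<T, typename Head::formatted_type>::value>
{};

// The head matches: stop here, the tail is never inspected.
template <class T, class Head, class Tail>
struct find_impl<T, type_list<Head, Tail>, true>
{
    typedef Head type;
};

// The head does not match: continue with the tail.
template <class T, class Head, class Tail>
struct find_impl<T, type_list<Head, Tail>, false>
    : find_formatter<T, Tail>
{};

struct char_fmt       { typedef char formatted_type; static const char *name() { return "char_fmt"; } };
struct int_fmt        { typedef int  formatted_type; static const char *name() { return "int_fmt"; } };
struct better_int_fmt { typedef int  formatted_type; static const char *name() { return "better_int_fmt"; } };

typedef type_list<char_fmt, type_list<int_fmt> > basic_list;
typedef type_list<better_int_fmt, basic_list>    my_list;   // prepend one type

inline std::string int_formatter_in_basic() { return find_formatter<int, basic_list>::type::name(); }
inline std::string int_formatter_in_mine()  { return find_formatter<int, my_list>::type::name(); }
inline std::string char_formatter_in_mine() { return find_formatter<char, my_list>::type::name(); }

inline void demo_lookup()
{
    assert(int_formatter_in_basic() == "int_fmt");
    assert(int_formatter_in_mine()  == "better_int_fmt");  // the prepended one wins
    assert(char_formatter_in_mine() == "char_fmt");        // other types are unaffected
}

} // namespace sketch_lookup
```

The whole search is resolved by the compiler, so choosing a formatter costs nothing at run time.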

Writers ready for use

The following writers are ready for use:

Writer type: class unchecked_memory_writer;
  What it does: Writes to the memory buffer, does not check against buffer overruns. (Do not use it at home.)
  Constructor parameters: char *buf
  Additional functionality:
    void reset(char *buf);
    char * position() const;

Writer type: class bounded_memory_writer;
  What it does: Writes to the memory buffer, but checks the given buffer size limit.
  Constructor parameters: char *buf, size_t size
  Additional functionality:
    void reset(char *buf);
    char * position() const;
    size_t left() const;

Writer type: template <class Sequence> class sequence_writer;
  What it does: Appends characters (using push_back) to the given STL container.
  Constructor parameters: Container *cont
  Additional functionality:
    void reset(Container *cont);

Writer type: class cfile_writer;
  What it does: Writes to the C-style FILE*.
  Constructor parameters: std::FILE *file
  Additional functionality:
    void reset(std::FILE *file);

Writer type: class unix_writer;
  What it does: Writes to the Unix descriptor.
  Constructor parameters: int fd
  Additional functionality:
    void reset(int fd);

Note: Writers do not own resources (buffers, descriptors, etc.). These resources have to be managed separately.

In addition, there is a handy typedef:

typedef sequence_writer<std::string> string_writer;

Examples:

char buf[1000];
ofastream<basic_formatters, unchecked_memory_writer> s1(buf);
ofastream<basic_formatters, bounded_memory_writer> s2(buf, 1000);

std::vector<char> v;
ofastream<basic_formatters, sequence_writer<std::vector<char> > > s3(&v);

std::string str;
ofastream<basic_formatters, string_writer> s4(&str);

std::FILE *file = std::fopen("somefile.txt", "w");
ofastream<basic_formatters, cfile_writer> s5(file);

int fd = open("somefile.txt", O_WRONLY);
ofastream<basic_formatters, unix_writer> s6(fd);

Above, all s1 ... s6 are output streams that use basic formatters.
In addition, the output streams are also writers themselves and therefore expose all the functions defined in the given writers, including those functions that are marked as additional in the table above.

All writers provide the following diagnostic functions:

bool end_of_stream() const;
bool device_error() const;
bool good() const { return !device_error(); }

Readers ready for use

The following readers are ready for use:

Reader type: class unchecked_memory_reader;
  What it does: Reads from the memory buffer, does not check for running off the end of the buffer. (Do not use it at home.)
  Constructor parameters: char const *buf
  Rewindable: Yes
  Additional functionality:
    void reset(char const *buf);
    char const * position() const;

Reader type: class bounded_memory_reader;
  What it does: Reads from the memory buffer, checks for the buffer size.
  Constructor parameters: char const *buf, size_t size
  Rewindable: Yes
  Additional functionality:
    void reset(char const *buf, size_t size);
    char const * position() const;
    size_t left() const;

Reader type: template <class Sequence> class sequence_reader;
  What it does: Reads characters from the given STL container.
  Constructor parameters: Container const *cont
  Rewindable: Yes
  Additional functionality:
    void reset(Container const *cont);

Reader type: class cfile_reader;
  What it does: Reads from the C-style FILE*.
  Constructor parameters: std::FILE *file
  Rewindable: No
  Additional functionality:
    void reset(std::FILE *file);

Reader type: class unix_reader;
  What it does: Reads from the Unix descriptor.
  Constructor parameters: int fd
  Rewindable: No
  Additional functionality:
    void reset(int fd);

Note: Readers do not own resources (buffers, descriptors, etc.). These resources have to be managed separately.

Note: If the reader is rewindable, it provides the additional function:

void go_back(size_t s);

This function moves the reader back by the given number of elements, so that the same content can be read again.
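To see why a formatter may need a rewindable reader, consider reading a number: the formatter only learns that the number has ended after it has already consumed the character that follows it. The sketch below assumes a reader with read(char &) and go_back(size_t) as described in this document; memory_reader and read_unsigned are hypothetical names and do not reproduce the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch_rewind {

// Hypothetical stand-in for a rewindable reader over a string.
class memory_reader
{
public:
    explicit memory_reader(const std::string &s) : s_(s), pos_(0) {}

    bool read(char &c)
    {
        if (pos_ == s_.size()) return false;   // end of stream
        c = s_[pos_++];
        return true;
    }

    void go_back(std::size_t n)
    {
        assert(n <= pos_);                      // cannot rewind past the start
        pos_ -= n;
    }

private:
    std::string s_;
    std::size_t pos_;
};

// Reads decimal digits "as long as it makes sense". The first character
// that is not a digit is given back to the reader.
template <class Reader>
bool read_unsigned(Reader &r, unsigned &value)
{
    char c;
    bool any_digit = false;
    value = 0;
    while (r.read(c)) {
        if (c < '0' || c > '9') {
            r.go_back(1);
            break;
        }
        value = value * 10 + static_cast<unsigned>(c - '0');
        any_digit = true;
    }
    return any_digit;
}

inline void demo_rewind()
{
    memory_reader r("42,7");
    unsigned a = 0, b = 0;
    char sep = 0;

    assert(read_unsigned(r, a) && a == 42);
    assert(r.read(sep) && sep == ',');         // the comma was not lost
    assert(read_unsigned(r, b) && b == 7);
}

} // namespace sketch_rewind
```

Without go_back, the comma would have been swallowed by the first read_unsigned call.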

In addition, there is a handy typedef:

typedef sequence_reader<std::string> string_reader;

Examples:

char buf[1000];
ifastream<basic_formatters, unchecked_memory_reader> s1(buf);
ifastream<basic_formatters, bounded_memory_reader> s2(buf, 1000);

std::vector<char> v;
ifastream<basic_formatters, sequence_reader<std::vector<char> > > s3(&v);

std::string str;
ifastream<basic_formatters, string_reader> s4(&str);

std::FILE *file = std::fopen("somefile.txt", "r");
ifastream<basic_formatters, cfile_reader> s5(file);

int fd = open("somefile.txt", O_RDONLY);
ifastream<basic_formatters, unix_reader> s6(fd);

Above, all s1 ... s6 are input streams that use basic formatters.
In addition, the input streams are also readers themselves and therefore expose all the functions defined in the given readers, including those functions that are marked as additional in the table above.

All readers provide the following diagnostic functions:

bool end_of_stream() const;
bool device_error() const;
void set_bad_format(bool b);
bool bad_format() const;
bool good() const { return !device_error() && !bad_format(); }

Adapters

Adapters make it possible to "decorate" existing readers and writers with some additional functionality.
The following adapters are defined:

1.

template <class Writer, int BufSize = 100>
class printf_formats;

The printf_formats adapter can be used to wrap any writer and provide the following additional functions:

void set_char_format  (std::string const &f);
void set_charp_format (std::string const &f);
void set_long_format  (std::string const &f);
void set_ulong_format (std::string const &f);
void set_double_format(std::string const &f);
void set_voidp_format (std::string const &f);

char const * get_char_format();
char const * get_charp_format();
char const * get_long_format();
char const * get_ulong_format();
char const * get_double_format();
char const * get_voidp_format();

These functions are expected and used by the formatters from the TYPE_printf_formatter family.

Example:

ofastream<printf_formatters, printf_formats<bounded_memory_writer> > s(buf, size);
s.set_double_format("%.2f");
s << 3.141592536;

Above, only "3.14" will be written to the memory buffer.

The second template parameter (BufSize) is the suggested size of the buffer used by formatters.
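The interplay between a format-storing writer and a printf-style formatter might look roughly like the sketch below. It uses only std::snprintf from the standard library; formatting_string_writer and write_double are hypothetical names, and the "%g" default format is an assumption made for the example, not a documented property of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

namespace sketch_printf {

// Hypothetical writer that stores one printf-style format next to its data.
class formatting_string_writer
{
public:
    formatting_string_writer() : double_format_("%g") {}   // assumed default

    void set_double_format(const std::string &f) { double_format_ = f; }
    const char *get_double_format() const { return double_format_.c_str(); }

    void write(const char *b, const char *e) { out_.append(b, e); }
    const std::string &str() const { return out_; }

private:
    std::string double_format_;
    std::string out_;
};

// What a printf-style formatter for double might do: ask the writer for the
// format, let snprintf work in a local buffer, then write the result.
template <class Writer>
void write_double(double d, Writer &w)
{
    char buf[100];                       // plays the role of BufSize
    const int n = std::snprintf(buf, sizeof buf, w.get_double_format(), d);
    assert(n >= 0);                      // a negative result means a broken format
    const std::size_t len =
        static_cast<std::size_t>(n) < sizeof buf ? static_cast<std::size_t>(n)
                                                 : sizeof buf - 1;   // output was truncated
    w.write(buf, buf + len);
}

inline std::string format_pi(const std::string &fmt)
{
    formatting_string_writer w;
    w.set_double_format(fmt);
    write_double(3.141592536, w);
    return w.str();
}

inline void demo_printf()
{
    assert(format_pi("%.2f") == "3.14");
    assert(format_pi("%.4f") == "3.1416");
}

} // namespace sketch_printf
```

If the formatted text does not fit in the local buffer, snprintf truncates it, which is why the buffer size is worth exposing as a template parameter.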

2.

template <class Writer, int BufSize>
class buffered_writer;

The buffered_writer adapter can be used to wrap any writer and provide it with buffering capability.
In addition, it provides the following function:

void flush();

This function forces the content of the buffer to be written by the underlying writer.

Example:

int fd = open("somefile.txt", O_WRONLY);
ofastream<basic_formatters, buffered_writer<unix_writer, 4096> > s(fd);
s << ...;
s.flush();
close(fd);
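As a rough illustration of what a buffering adapter does, the sketch below wraps a writer, collects single characters in a fixed-size array and passes them on in blocks. The names buffering and counting_writer are hypothetical, and the real buffered_writer may be organized differently.

```cpp
#include <cassert>
#include <string>

namespace sketch_buffer {

// Hypothetical unbuffered writer that counts how often the "device" is hit.
class counting_writer
{
public:
    counting_writer() : calls_(0) {}

    template <class Iterator>
    void write(Iterator b, Iterator e)
    {
        data_.append(b, e);
        ++calls_;
    }

    const std::string &data() const { return data_; }
    int calls() const { return calls_; }

private:
    std::string data_;
    int calls_;
};

// Adapter: derives from the wrapped writer and adds a buffer plus flush().
template <class Writer, int BufSize>
class buffering : public Writer
{
public:
    buffering() : used_(0) {}

    void write(char c)
    {
        if (used_ == BufSize) flush();   // buffer full: hand it over first
        assert(used_ < BufSize);
        buf_[used_++] = c;
    }

    // Forces the buffered content to be written by the underlying writer.
    void flush()
    {
        if (used_ == 0) return;
        Writer::write(buf_, buf_ + used_);
        used_ = 0;
    }

private:
    char buf_[BufSize];
    int used_;
};

inline void demo_buffering()
{
    buffering<counting_writer, 4> w;
    for (char c = 'a'; c != 'k'; ++c) w.write(c);   // ten characters, 'a' to 'j'

    // Two full blocks have reached the device; "ij" is still in the buffer.
    assert(w.data() == "abcdefgh" && w.calls() == 2);

    w.flush();
    assert(w.data() == "abcdefghij" && w.calls() == 3);
}

} // namespace sketch_buffer
```

The sketch also shows why flush() matters before closing the descriptor: anything still in the buffer has not yet reached the underlying writer.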

3.

template <class Reader, int BufSize>
class buffered_reader;

The buffered_reader adapter can be used to wrap any reader and provide it with buffering capability.
In addition, it also adds the rewindable property. The max size of the possible rewind is equal to the buffer size.

Example:

int fd = open("somefile.txt", O_RDONLY);
ifastream<basic_formatters, buffered_reader<unix_reader, 4096> > s(fd);
s >> ...;
close(fd);

Above, buffered_reader is needed not only to provide buffering to non-buffered unix_reader, but also to provide the rewindable property, which is required by most of the formatters from the basic_formatters list.

Useful functions and definitions

1.

template <class Reader>
Reader & skip_whitespace(Reader &r);

The function above skips whitespace in the given reader.

2.

template <class Reader>
Reader & get_line(Reader &r, std::string &s);

The function above reads the full line of text (up to, but not including, the end-of-line character) from the given reader.
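A function with this behavior can be written against nothing more than the reader's read(char &) operation. The sketch below is one possible shape; get_line_sketch and string_reader_sketch are hypothetical names, and consuming the end-of-line character without storing it is an assumption about the intended behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch_getline {

// Hypothetical minimal reader over a string.
struct string_reader_sketch
{
    std::string data;
    std::size_t pos;

    explicit string_reader_sketch(const std::string &d) : data(d), pos(0) {}

    bool read(char &c)
    {
        if (pos == data.size()) return false;   // end of stream
        c = data[pos++];
        return true;
    }
};

// Collects characters until end-of-line or end-of-stream.
// The '\n' itself is consumed but not stored (assumption).
template <class Reader>
Reader &get_line_sketch(Reader &r, std::string &s)
{
    s.clear();
    char c;
    while (r.read(c) && c != '\n') s += c;
    return r;
}

inline void demo_get_line()
{
    string_reader_sketch r("first line\nsecond");
    std::string a, b;
    get_line_sketch(r, a);
    get_line_sketch(r, b);
    assert(a == "first line");
    assert(b == "second");
}

} // namespace sketch_getline
```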

3.

typedef ifastream<basic_formatters, buffered_reader<unix_reader, 4096> > unix_ifastream;
typedef ofastream<basic_formatters, buffered_writer<unix_writer, 4096> > unix_ofastream;
typedef ifastream<basic_formatters, buffered_reader<cfile_reader, 4096> > cfile_ifastream;
typedef ofastream<basic_formatters, cfile_writer> cfile_ofastream;

These typedefs are handy in common situations, like the "Hello World!" program.

4.

The input and output streams have an implicit conversion to a bool-like type (they delegate to good() from their reader or writer parts), so they can be used idiomatically in conditional instructions:

cfile_ifastream in(stdin);
int n;
while (in >> n) { /* .... */ }

Iterator support

There are input and output iterators for use with fastreams.
Example:

char buf[1000];
std::vector<int> v;
typedef ofastream<basic_formatters, bounded_memory_writer> stream_type;
stream_type s(buf, 1000);
ofastream_iterator<stream_type, int> iter(s, " ");
std::copy(v.begin(), v.end(), iter);

Above, the whole content of vector v will be formatted as text (numbers separated by single space) in the buffer buf.

Another example:

cfile_ifastream s(stdin);
ifastream_iterator<cfile_ifastream, int> begin(s), end;
std::vector<int> v(begin, end);

Above, the integers will be read from standard input (up to end-of-stream or formatting error) and fed into the newly created vector.

Lexical casts

Two functions are provided that combine output and input formatting to perform a "lexical cast" (see also Boost lexical cast) between two types:

  1. fast_lexical_cast - uses the unchecked internal memory buffer of arbitrary size.
  2. safe_lexical_cast - uses the std::string buffer.

Examples:

int i = ...;
std::string s = fast_lexical_cast<std::string>(i);

i = safe_lexical_cast<int>(s);

If the cast cannot be performed due to formatting errors, the bad_lexical_cast exception is thrown.
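The general idea of a lexical cast (format the value as text, then parse the text as the target type, and throw if parsing fails) can be sketched with standard C functions alone. The code below is not the library's implementation: int_to_string, string_to_int and bad_cast_sketch are hypothetical names, and deriving the exception from std::runtime_error is an assumption.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdio>
#include <cstdlib>
#include <stdexcept>
#include <string>

namespace sketch_cast {

// Hypothetical exception type standing in for bad_lexical_cast.
struct bad_cast_sketch : std::runtime_error
{
    bad_cast_sketch() : std::runtime_error("bad lexical cast") {}
};

// Output formatting into a small local buffer.
inline std::string int_to_string(int i)
{
    char buf[32];                       // large enough for any int in decimal
    std::snprintf(buf, sizeof buf, "%d", i);
    return buf;
}

// Input formatting; throws when the text is not exactly one int.
inline int string_to_int(const std::string &s)
{
    errno = 0;
    char *end = 0;
    const long v = std::strtol(s.c_str(), &end, 10);
    const bool consumed_all = !s.empty() && *end == '\0';
    if (!consumed_all || errno == ERANGE || v < INT_MIN || v > INT_MAX)
        throw bad_cast_sketch();
    return static_cast<int>(v);
}

inline bool rejects(const std::string &s)
{
    try { string_to_int(s); }
    catch (const bad_cast_sketch &) { return true; }
    return false;
}

inline void demo_cast()
{
    assert(int_to_string(-42) == "-42");
    assert(string_to_int("123") == 123);
    assert(string_to_int(int_to_string(2024)) == 2024);   // round trip
    assert(rejects("12x"));                                // trailing garbage
    assert(rejects(""));                                   // nothing to parse
}

} // namespace sketch_cast
```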

Run-time polymorphism

Dynamic binding is completely avoided in all the examples presented so far, but it can be used when there is a real need to do so.
Three helper types are used to provide a dynamic scaffolding that enables the run-time polymorphism.
For example, the following function uses the writer_base type, which abstracts away the actual writer in use:

void fun(writer_base &wb)
{
     ofastream<basic_formatters, dynamic_writer> s(&wb);
     s << ...;
}

Note that the dynamic_writer is used to write to the writer_base.
In another place of the program, the following is possible:

bounded_memory_writer mw(buf, size);
dynamic_writer_adapter<bounded_memory_writer> dmw(mw);
fun(dmw);

cfile_writer cw(stdout);
dynamic_writer_adapter<cfile_writer> dcw(cw);
fun(dcw);

// ...

Above, the function fun was reused with two different physical writers.

Similarly, there are reader_base, dynamic_reader and dynamic_reader_adapter to provide run-time polymorphism for input streams.
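The three-part scaffolding (an abstract base, an adapter that wraps a concrete writer, and a writer that forwards to the base) can be sketched as follows. This is only an illustration of the pattern: all names carry a _sketch suffix or are otherwise hypothetical, and the real writer_base interface is likely to contain more than a single write function.

```cpp
#include <cassert>
#include <string>

namespace sketch_dynamic {

// Abstract interface: the only place where virtual calls happen.
class writer_base_sketch
{
public:
    virtual ~writer_base_sketch() {}
    virtual void write(char c) = 0;
};

// Wraps any concrete (non-virtual) writer behind the interface.
template <class Writer>
class dynamic_adapter_sketch : public writer_base_sketch
{
public:
    explicit dynamic_adapter_sketch(Writer &w) : w_(w) {}
    void write(char c) { w_.write(c); }

private:
    Writer &w_;
};

// An ordinary-looking writer that forwards to whatever base it was given.
class dynamic_writer_sketch
{
public:
    explicit dynamic_writer_sketch(writer_base_sketch *wb) : wb_(wb) {}
    void write(char c) { wb_->write(c); }

private:
    writer_base_sketch *wb_;
};

// Two unrelated concrete writers (both hypothetical).
struct string_sink { std::string out; void write(char c) { out += c; } };
struct count_sink  { int n; count_sink() : n(0) {} void write(char) { ++n; } };

// Compiled once, usable with any adapted writer.
inline void emit_hi(writer_base_sketch &wb)
{
    dynamic_writer_sketch w(&wb);
    w.write('h');
    w.write('i');
}

inline void demo_dynamic()
{
    string_sink s;
    count_sink c;
    dynamic_adapter_sketch<string_sink> ds(s);
    dynamic_adapter_sketch<count_sink> dc(c);

    emit_hi(ds);
    emit_hi(dc);

    assert(s.out == "hi");
    assert(c.n == 2);
}

} // namespace sketch_dynamic
```

The cost of the virtual call is paid only by code that asks for it; streams built directly on a concrete writer stay fully static.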

Performance

This section presents the results of three different performance tests and compares them with the results of standard solutions.
All tests were run on a machine with an Intel Pentium III clocked at 933 MHz.

Test 1.

In this test, a very long (10MB) string was built from single characters, where each character was inserted into the stream like here:

stream << 'a';

The following were compared:

The results are:

Chart for test 1 (Y axis: millions of operations per second).

Test 2.

In this test, one million ints were formatted using the following operation:

stream << i;

The following were compared:

The results are:

Chart for test 2 (Y axis: millions of operations per second).

Test 3.

In this test, the lexical cast function was used to convert from int to std::string and back to int, using the following operations:

s = lexical_cast<string>(i);
i = lexical_cast<int>(s);

The following were compared:

The results are:

Chart for test 3 (Y axis: thousands of operations per second).

Architecture and Concepts

Note: This section is intended for those who want to extend the FASTreams library.

There are two kinds of streams: output and input.
An output (input) stream is composed of:

  1. the list of formatters, and
  2. the writer (reader) object.

Each formatter is a type that should be similar to this:

struct some_formatter
{
    typedef some_type formatted_type;

    // if used in output stream
    template <class Writer>
    static void write(some_type t, Writer &w);

    // if used in input stream
    template <class Reader>
    static void read(some_type &t, Reader &r);
};

(the exact signatures above are not important, only usage syntax matters)

A list of formatters is a compile-time typelist; formatters are never instantiated.
The list of formatters is part of the stream type.

A writer object must provide at least the following operations:

class some_writer
{
public:

    // write operations
    void write(char c);

    template <class Iterator>
    void write(Iterator b, Iterator e);

    // stream state and error detection
    bool end_of_stream() const;
    bool device_error() const;
    bool good() const { return !device_error(); }
};

Above, Iterator should give a char (or something convertible to char) when dereferenced.
In addition, writers may provide any additional operation they like, which can be useful for some formatter and adapter types.
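A concrete writer that meets these requirements can be quite small. The sketch below writes into a caller-supplied buffer; bounded_writer_sketch is a hypothetical name, it is not the library's bounded_memory_writer, and reporting overflow through device_error() is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch_writer {

class bounded_writer_sketch
{
public:
    bounded_writer_sketch(char *buf, std::size_t size)
        : pos_(buf), end_(buf + size), overflow_(false)
    {
        assert(buf != 0);
    }

    // write operations
    void write(char c)
    {
        if (pos_ != end_) *pos_++ = c;
        else overflow_ = true;          // drop the character, remember the error
    }

    template <class Iterator>
    void write(Iterator b, Iterator e)
    {
        for (; b != e; ++b) write(static_cast<char>(*b));
    }

    // stream state and error detection
    bool end_of_stream() const { return pos_ == end_; }
    bool device_error() const { return overflow_; }
    bool good() const { return !device_error(); }

    // additional operations, beyond the required minimum
    char *position() const { return pos_; }
    std::size_t left() const { return static_cast<std::size_t>(end_ - pos_); }

private:
    char *pos_;
    char *end_;
    bool overflow_;
};

inline void demo_writer()
{
    char buf[4];
    bounded_writer_sketch w(buf, sizeof buf);
    const std::string text = "hello";

    w.write(text.begin(), text.begin() + 3);
    assert(w.good() && w.left() == 1);

    w.write(text.begin() + 3, text.end());      // 'l' fits, 'o' does not
    assert(!w.good() && w.left() == 0);
    assert(std::string(buf, w.position()) == "hell");
}

} // namespace sketch_writer
```

The writer does not own the buffer; the caller keeps it alive for as long as the writer is in use.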

A reader object must provide at least the following operations:

class some_reader
{
public:

    // read operations
    bool read(char &c);

    template <class Iterator>
    size_t read(Iterator b, size_t elems); // returns the number of chars read

    // optionally (for rewindable readers)
    void go_back(size_t s);

    // stream state and error detection
    bool end_of_stream() const;
    bool device_error() const;
    void set_bad_format(bool b);
    bool bad_format() const;
    bool good() const { return !device_error() && !bad_format(); }
};

Above, Iterator will be fed the requested elems (or fewer) chars.

If the reader implements the go_back operation, it is a rewindable reader.

The output stream class keeps together the list of formatters and the writer object:

template <class Formatters, class Writer>
class ofastream : public Writer
{
public:

    template <typename T>
    ofastream & insert(T const &t)
    {
         // finds the appropriate formatter in the given list and uses it
         typedef ... formatter_type;

         formatter_type::write(t, *this);

         return *this;
    }

    // delegates to Writer::good()
    operator unspecified_bool_type() const;
};

The input stream class keeps together the list of formatters and the reader object:

template <class Formatters, class Reader>
class ifastream : public Reader
{
public:

    template <typename T>
    ifastream & extract(T &t)
    {
         // finds the appropriate formatter in the given list and uses it
         typedef ... formatter_type;

         formatter_type::read(t, *this);

         return *this;
    }

    // delegates to Reader::good()
    operator unspecified_bool_type() const;
};

Both stream classes derive publicly from the writer/reader classes, meaning that they expose all the operations (including those that are not required) provided by writer/reader.

Of course, there are also insertion (<<) and extraction (>>) operators that delegate to appropriate function templates (insert/extract) in stream classes.
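Putting the pieces together, a complete miniature output stream might look like the sketch below. Everything in it is hypothetical (ostream_sketch, int_fmt, char_fmt, string_writer_sketch), and for brevity a simple trait (formatter_of) maps each type to its formatter, where the library searches a typelist instead.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

namespace sketch_stream {

// A writer: appends to a std::string.
struct string_writer_sketch
{
    std::string out;

    void write(char c) { out += c; }

    template <class Iterator>
    void write(Iterator b, Iterator e) { out.append(b, e); }
};

// Two formatters: types with static functions, never instantiated.
struct int_fmt
{
    typedef int formatted_type;

    template <class Writer>
    static void write(int v, Writer &w)
    {
        char buf[32];                                  // enough for any int
        const int n = std::snprintf(buf, sizeof buf, "%d", v);
        w.write(buf, buf + n);
    }
};

struct char_fmt
{
    typedef char formatted_type;

    template <class Writer>
    static void write(char c, Writer &w) { w.write(c); }
};

// Shortcut for this sketch: a trait instead of a typelist search.
template <class T> struct formatter_of;               // undefined: no formatter
template <> struct formatter_of<int>  { typedef int_fmt type; };
template <> struct formatter_of<char> { typedef char_fmt type; };

// The stream derives from the writer and passes itself to the formatter.
template <class Writer>
class ostream_sketch : public Writer
{
public:
    template <class T>
    ostream_sketch &insert(const T &t)
    {
        typedef typename formatter_of<T>::type formatter_type;
        formatter_type::write(t, *this);
        return *this;
    }
};

// The insertion operator only delegates to insert().
template <class Writer, class T>
ostream_sketch<Writer> &operator<<(ostream_sketch<Writer> &s, const T &t)
{
    return s.insert(t);
}

inline std::string demo_text()
{
    ostream_sketch<string_writer_sketch> s;
    s << 4 << 'x' << 2;
    return s.out;          // the writer's members are visible on the stream
}

inline void demo_stream()
{
    assert(demo_text() == "4x2");
}

} // namespace sketch_stream
```

Because every call is resolved statically, a compiler is free to inline the whole chain from operator<< down to the writer.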

Known problems

  1. The cfile_ifastream should not be used for line-oriented interactive input. It uses std::fread as the reading mechanism, which blocks until the requested number of bytes is available. With keyboard input, the program will appear to hang unless end-of-stream is forced (Ctrl-D on Unix).
    Use unix_ifastream (with the explicit file descriptor 0) instead.