General focus of the library

  • The generic dataflow layer focuses mostly on the connection and extraction functionality of a dataflow framework, since this functionality should be present in all dataflow frameworks (it goes to the very definition of dataflow).
    • Generic treatment of component connection and data extraction is useful because, for example, much of current dataflow framework development is dedicated to visual programming environments and bindings to outside languages. Rather than reinventing the wheel for every framework separately, much of this work can be done generically instead.
    • While there are many other functionalities that can be part of a dataflow framework (such as memory management and different kinds of scheduling), these vary much more from one dataflow framework to another, and are much harder to capture in the generic layer. While concepts related to memory management and scheduling might be added to the generic layer in the future, I believe they are not essential in the first version of this library.
  • The only dataflow framework included at the moment is Dataflow.Signals for three reasons:
    • a Boost.Signals-based framework was the original focus of this project;
    • Dataflow.Signals uses existing Boost functionality, which makes the implementation of the dataflow framework extremely simple;
    • components in other frameworks (e.g., VTK) don't rely exclusively on their framework's dataflow mechanisms to do their work. For example, apart from connecting VTK components and moving data through VTK's pipeline, the user also needs to call the components' methods to customize their behavior. Since Boost.Signals is based on data transfer through function calls, it can be used to expose VTK components' methods in a way that is native to the Dataflow library.
  • There is no doubt that implementing other kinds of dataflow frameworks would be useful (e.g., the pin-based approach proposed by Tobias Schwinger, or something like the simulation-oriented Coupled DEVS model). However, implementing a whole new framework would take significant time and effort, and I don't believe that it is essential in the first version of this library.

Naming conventions

  • I encountered the term dataflow after starting this library. Although the information I can find about dataflow as a programming paradigm seems to be a little sub-par, it seems like the right name for this library. As a part of developing this library and doing the accompanying research, I can work on improving the information that is out there (on Wikipedia, etc.).
  • I chose component to refer to a processing element of a dataflow network, because the term has no prior (to my knowledge) C++ meaning.
  • In differentiating ports that output data from those that input data, I originally considered "input" and "output". However, I realized that "input" and "output" switch depending on perspective: from the perspective of a port, it is an "output" if it outputs data, but from the perspective of a connection attached to it, that same port is the connection's "input". Producer and consumer seemed a better choice, as they don't suffer from the same problem.

Implementation choices

  • Among various tag dispatch conventions, I chose function object templates because of their relative simplicity and their integration with boost::result_of. Using a free function might be another good alternative (as suggested here), but could be problematic where the result type is not fixed (and in the future all of the operations will probably be modified to return a value where appropriate).
  • Most operations (belonging to the BinaryOperable, UnaryOperable, and ComponentOperable concepts) have been rolled into a single class template, with the specific operation selected by a type parameter. This lets them share common functionality, and it also lets code built on top of the generic layer treat the operations in a generic fashion.