This program classifies events for later filtering. It operates by sending blocks of data to a set of parallel workers. The workers add an extra uint32_t to the event body headers. The events are then re-sorted by timestamp and output to an event sink.
The classification is done by loading a shared library provided by the user. The shared library provides a classification class and a factory function that creates the instances of that class needed by the application.
Most options are mandatory. Those that are not are marked (optional).
--help
(optional) Prints brief program help text to stdout. After printing the help text, the program exits.
--version
(optional) Prints the program's version to stdout. Once the version is output, the program exits.
--source
=URI
A file or tcp URI that specifies where data will come from. Data are analyzed and sent to the sink as long as there is no end indication from the source. Note that this means that for ring buffers, the program will run indefinitely.
--sink
=URI
A file or tcp URI that specifies where the output data will be written. Note that if this is a tcp URI, the host must be local; otherwise an error aborts the program.
--workers
=integer (optional)
Specifies the number of parallel workers to be used in threaded parallelism. Note that for MPI parallelism, the -np option on mpirun can override this value. If not provided, the value of this option defaults to 1.
--clump-size
=integer
Specifies the number of ring items that make up a work unit passed to the workers. If not supplied, this defaults to 1, which is almost certainly not optimal for either communication or post-worker timestamp ordering.
--classifier
=filename
Provides the path to the shared object that will be loaded. See CLASSIFIER LIBRARY below for more information on what that library must contain.
--parallel-strategy
= threaded | mpi
Specifies the parallelization strategy/communications infrastructure used. If threaded, all operations are performed on the same system using threading to gain parallelism. Each worker is a thread. In addition, there is one thread each for input, timestamp reordering, and output.
Note that if mpi is selected, you must run the program with mpirun. The mpirun -np option conflicts with --workers if it is not numworkers + 3. If -np differs from that value but there are still sufficient processes to run at least one worker, -np overrides --workers. A message is output indicating this was done, and the application continues. If -np is less than 4, the minimum number of processes needed for at least one worker, an error message is output and the application exits.
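The interaction between -np and --workers described above can be sketched as a small function. This is only an illustration of the stated rules, not the program's actual code: the names WorkerResolution and resolveWorkers are hypothetical, and the sketch assumes that the three non-worker processes correspond to the input, timestamp reordering, and output roles described for threaded mode.

```cpp
// Hypothetical result type, introduced for this sketch only.
struct WorkerResolution {
    bool ok;          // false when -np is too small to run any worker
    int  workers;     // number of workers actually used
    bool overridden;  // true when -np overrode the --workers value
};

// Processes that are not workers (assumed: input, reordering, output).
const int kNonWorkerProcesses = 3;

// Apply the documented rules to the value of --workers and mpirun's -np.
WorkerResolution resolveWorkers(int requestedWorkers, int np) {
    if (np < kNonWorkerProcesses + 1) {
        // Fewer than 4 processes: the program reports an error and exits.
        return {false, 0, false};
    }
    int available = np - kNonWorkerProcesses;
    // When -np is not requestedWorkers + 3, -np wins and a message is output.
    return {true, available, available != requestedWorkers};
}
```

Under these assumptions, running with --workers=2 and mpirun -np 8 would proceed with 5 workers after reporting the override, while -np 3 would be rejected.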
In order to incorporate user-written classifiers, the user must provide a shared library specified by the --classifier option. This shared library must include both an implementation of a concrete subclass of CRingMarkingWorker::Classifier, defined in CRingItemMarkingWorker.h, and a factory function with C external bindings named createClassifier that can create instances of the classifier.
The classifier must implement the method
virtual uint32_t operator()(CRingItem& item);
This method is expected to produce a uint32_t classification value for the CRingItem object referenced by item. The resulting classification is appended to the body header, and all sizes are adjusted so that the item remains consistent.
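As an illustration of the method contract, here is a minimal sketch of a classifier. In a real library, CRingItem and CRingMarkingWorker::Classifier come from the NSCLDAQ headers. The stand-in declarations below exist only so that the sketch compiles by itself, and both SizeClassifier and its size-threshold rule are hypothetical.

```cpp
#include <stddef.h>
#include <stdint.h>

// Stand-in for the real CRingItem (assumption: only a body size is modeled).
class CRingItem {
public:
    explicit CRingItem(size_t bodySize) : m_bodySize(bodySize) {}
    size_t getBodySize() const { return m_bodySize; }
private:
    size_t m_bodySize;
};

// Stand-in for the base class a real classifier would get from the header.
class CRingMarkingWorker {
public:
    class Classifier {
    public:
        virtual ~Classifier() {}
        virtual uint32_t operator()(CRingItem& item) = 0;
    };
};

// Hypothetical classifier: marks items whose body exceeds a threshold.
class SizeClassifier : public CRingMarkingWorker::Classifier {
public:
    explicit SizeClassifier(size_t threshold) : m_threshold(threshold) {}
    uint32_t operator()(CRingItem& item) override {
        // 1 marks a large item, 0 marks everything else.
        return item.getBodySize() > m_threshold ? 1u : 0u;
    }
private:
    size_t m_threshold;
};
```

The classifier only computes the value. Appending it to the body header is done by the application, so the method itself does not need to modify the item.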
Here's an example of a factory function that produces a TestClassifier object:
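A minimal sketch of such a factory follows. The stand-in declarations for CRingItem and CRingMarkingWorker::Classifier, and the trivial TestClassifier body, are assumptions made so that the sketch compiles by itself. A real library would include CRingItemMarkingWorker.h instead. The no-argument signature of createClassifier is likewise assumed.

```cpp
#include <stdint.h>

// Stand-ins for the real NSCLDAQ types (assumptions for this sketch only).
class CRingItem {};

class CRingMarkingWorker {
public:
    class Classifier {
    public:
        virtual ~Classifier() {}
        virtual uint32_t operator()(CRingItem& item) = 0;
    };
};

// Trivial classifier used by the factory: every item is classified as 0.
class TestClassifier : public CRingMarkingWorker::Classifier {
public:
    uint32_t operator()(CRingItem& item) override {
        (void)item;  // the item is not inspected in this sketch
        return 0;
    }
};

// C linkage keeps the symbol name unmangled, so it can be looked up
// by the plain name "createClassifier" once the library is loaded.
extern "C" {
    CRingMarkingWorker::Classifier* createClassifier() {
        return new TestClassifier;
    }
}
```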
Note the use of extern "C" to remove C++ name mangling. The function just creates a new classifier and returns a pointer to it.