Back to accumux

Documentation

API reference and core concepts for accumux.

API Documentation

Accumux provides a framework for compositional online data reductions using algebraic composition. This document covers the core concepts, accumulator types, and composition operations.

Core Concept: Accumulator

All accumulators satisfy this C++20 concept:

template<typename T>
concept Accumulator = requires(std::remove_cvref_t<T> acc,
                               typename std::remove_cvref_t<T>::value_type val) {
    typename std::remove_cvref_t<T>::value_type;           // Value type
    { std::remove_cvref_t<T>{} };                          // Default constructible
    { std::remove_cvref_t<T>{acc} };                       // Copy constructible
    { acc += val } -> std::same_as<std::remove_cvref_t<T>&>;   // Value accumulation
    { acc += acc } -> std::same_as<std::remove_cvref_t<T>&>;   // Accumulator combination
    { acc.eval() } -> std::convertible_to<typename std::remove_cvref_t<T>::value_type>;
    { acc = acc } -> std::same_as<std::remove_cvref_t<T>&>;
};

Core Accumulators

AccumulatorPurposeAlgorithmComplexity
kbn_sum<T>Numerically stable summationKahan-Babuska-NeumaierO(1) space
welford_accumulator<T>Mean, variance, std devWelford’s online algorithmO(1) space
min_accumulator<T>Minimum value trackingSimple comparisonO(1) space
max_accumulator<T>Maximum value trackingSimple comparisonO(1) space
minmax_accumulator<T>Combined min/maxOptimized dual trackingO(1) space
count_accumulatorCount of elementsSimple counterO(1) space
product_accumulator<T>Product with overflow protectionLogarithmic representationO(1) space

kbn_sum

Numerically stable summation using the Kahan-Babuska-Neumaier algorithm.

#include "accumux/accumulators/kbn_sum.hpp"

kbn_sum<double> sum;
for (double x : data) {
    sum += x;
}
double result = sum.eval();

Key Methods:

MethodDescription
operator+=(T value)Add a value
operator+=(kbn_sum other)Combine two sums
eval()Get the sum result

welford_accumulator

Online computation of mean and variance using Welford’s algorithm.

#include "accumux/accumulators/welford.hpp"

welford_accumulator<double> stats;
for (double x : data) {
    stats += x;
}

double mean = stats.mean();
double variance = stats.sample_variance();
double std_dev = stats.sample_std_dev();
size_t count = stats.count();

Key Methods:

MethodDescription
mean()Running mean
sample_variance()Sample variance (n-1 denominator)
population_variance()Population variance (n denominator)
sample_std_dev()Sample standard deviation
count()Number of values processed

minmax_accumulator

Track minimum and maximum values simultaneously.

#include "accumux/accumulators/minmax.hpp"

minmax_accumulator<double> mm;
for (double x : data) {
    mm += x;
}

double min_val = mm.min();
double max_val = mm.max();

Composition Operations

OperationSyntaxMeaningUse Case
Parallela + bBoth process same dataMultiple statistics
Sequentiala * bPipeline: b(a(data))Data transformations
Conditionalconditional(a, b, pred)Choose based on conditionAdaptive processing

Parallel Composition

// Both accumulators see every value
auto stats = kbn_sum<double>() + welford_accumulator<double>();

for (double x : data) {
    stats += x;  // Both accumulate
}

auto sum = stats.get_first().eval();
auto mean = stats.get_second().mean();

Sequential Composition

// Pipeline: second sees output of first
auto pipeline = transform_acc<double, double>(
    [](double x) { return x * 2; }
) * kbn_sum<double>();

for (double x : data) {
    pipeline += x;  // Transformed then summed
}

Nesting Compositions

Compositions are themselves accumulators, enabling arbitrary nesting:

auto complex_stats = kbn_sum<double>() +
                     welford_accumulator<double>() +
                     minmax_accumulator<double>();

// Accessing nested results
auto sum = complex_stats.get_first().eval();
auto welford = complex_stats.get_second();
auto minmax = welford.get_second();

Mathematical Guarantees

  • Associativity: (a + b) + c structurally equivalent to a + (b + c)
  • Composability: All compositions satisfy Accumulator concept
  • Type Safety: Compile-time verification via C++20 concepts
  • Numerical Stability: Error bounds of O(1) vs O(n) for naive algorithms
  • Single-Pass: All compositions maintain streaming efficiency

Numerical Stability

KBN vs Naive Summation

Naive sum:  Error grows as O(n * epsilon)
KBN sum:    Error bounded by O(1 * epsilon)

For 1M values, this can mean the difference between 10^-8 and 10^-16 relative error.

Welford vs Two-Pass Variance

Welford’s algorithm:

  • Single pass (streaming compatible)
  • Numerically stable for large datasets
  • No intermediate storage required

Performance

OperationThroughput
Single KBN Sum~476M values/sec
Parallel Sum + Welford~208M values/sec
Complex 4-way composition~122M values/sec

All operations use O(1) memory regardless of data size.

Full Reference

For complete API documentation: