API Documentation
Accumux provides a framework for compositional online data reductions using algebraic composition. This document covers the core concepts, accumulator types, and composition operations.
Core Concept: Accumulator
All accumulators satisfy this C++20 concept:
template<typename T>
concept Accumulator = requires(std::remove_cvref_t<T> acc,
typename std::remove_cvref_t<T>::value_type val) {
typename std::remove_cvref_t<T>::value_type; // Value type
{ std::remove_cvref_t<T>{} }; // Default constructible
{ std::remove_cvref_t<T>{acc} }; // Copy constructible
{ acc += val } -> std::same_as<std::remove_cvref_t<T>&>; // Value accumulation
{ acc += acc } -> std::same_as<std::remove_cvref_t<T>&>; // Accumulator combination
{ acc.eval() } -> std::convertible_to<typename std::remove_cvref_t<T>::value_type>;
{ acc = acc } -> std::same_as<std::remove_cvref_t<T>&>;
};
Core Accumulators
| Accumulator | Purpose | Algorithm | Complexity |
|---|---|---|---|
kbn_sum<T> | Numerically stable summation | Kahan-Babuska-Neumaier | O(1) space |
welford_accumulator<T> | Mean, variance, std dev | Welford’s online algorithm | O(1) space |
min_accumulator<T> | Minimum value tracking | Simple comparison | O(1) space |
max_accumulator<T> | Maximum value tracking | Simple comparison | O(1) space |
minmax_accumulator<T> | Combined min/max | Optimized dual tracking | O(1) space |
count_accumulator | Count of elements | Simple counter | O(1) space |
product_accumulator<T> | Product with overflow protection | Logarithmic representation | O(1) space |
kbn_sum
Numerically stable summation using the Kahan-Babuska-Neumaier algorithm.
#include "accumux/accumulators/kbn_sum.hpp"
kbn_sum<double> sum;
for (double x : data) {
sum += x;
}
double result = sum.eval();
Key Methods:
| Method | Description |
|---|---|
operator+=(T value) | Add a value |
operator+=(kbn_sum other) | Combine two sums |
eval() | Get the sum result |
welford_accumulator
Online computation of mean and variance using Welford’s algorithm.
#include "accumux/accumulators/welford.hpp"
welford_accumulator<double> stats;
for (double x : data) {
stats += x;
}
double mean = stats.mean();
double variance = stats.sample_variance();
double std_dev = stats.sample_std_dev();
size_t count = stats.count();
Key Methods:
| Method | Description |
|---|---|
mean() | Running mean |
sample_variance() | Sample variance (n-1 denominator) |
population_variance() | Population variance (n denominator) |
sample_std_dev() | Sample standard deviation |
count() | Number of values processed |
minmax_accumulator
Track minimum and maximum values simultaneously.
#include "accumux/accumulators/minmax.hpp"
minmax_accumulator<double> mm;
for (double x : data) {
mm += x;
}
double min_val = mm.min();
double max_val = mm.max();
Composition Operations
| Operation | Syntax | Meaning | Use Case |
|---|---|---|---|
| Parallel | a + b | Both process same data | Multiple statistics |
| Sequential | a * b | Pipeline: b(a(data)) | Data transformations |
| Conditional | conditional(a, b, pred) | Choose based on condition | Adaptive processing |
Parallel Composition
// Both accumulators see every value
auto stats = kbn_sum<double>() + welford_accumulator<double>();
for (double x : data) {
stats += x; // Both accumulate
}
auto sum = stats.get_first().eval();
auto mean = stats.get_second().mean();
Sequential Composition
// Pipeline: second sees output of first
auto pipeline = transform_acc<double, double>(
[](double x) { return x * 2; }
) * kbn_sum<double>();
for (double x : data) {
pipeline += x; // Transformed then summed
}
Nesting Compositions
Compositions are themselves accumulators, enabling arbitrary nesting:
auto complex_stats = kbn_sum<double>() +
welford_accumulator<double>() +
minmax_accumulator<double>();
// Accessing nested results
auto sum = complex_stats.get_first().eval();
auto welford = complex_stats.get_second();
auto minmax = welford.get_second();
Mathematical Guarantees
- Associativity:
(a + b) + cstructurally equivalent toa + (b + c) - Composability: All compositions satisfy
Accumulatorconcept - Type Safety: Compile-time verification via C++20 concepts
- Numerical Stability: Error bounds of O(1) vs O(n) for naive algorithms
- Single-Pass: All compositions maintain streaming efficiency
Numerical Stability
KBN vs Naive Summation
Naive sum: Error grows as O(n * epsilon)
KBN sum: Error bounded by O(1 * epsilon)
For 1M values, this can mean the difference between 10^-8 and 10^-16 relative error.
Welford vs Two-Pass Variance
Welford’s algorithm:
- Single pass (streaming compatible)
- Numerically stable for large datasets
- No intermediate storage required
Performance
| Operation | Throughput |
|---|---|
| Single KBN Sum | ~476M values/sec |
| Parallel Sum + Welford | ~208M values/sec |
| Complex 4-way composition | ~122M values/sec |
All operations use O(1) memory regardless of data size.
Full Reference
For complete API documentation: