Getting Started with Accumux
This tutorial walks you through using accumux for streaming statistical computations with algebraic composition.
Prerequisites
- C++20 compatible compiler (GCC 10+, Clang 10+, MSVC 19.29+)
- CMake 3.14+ (for building tests)
- No external dependencies (header-only library)
Installation
Header-Only Integration
Copy the include/accumux/ directory into your project, then include the headers you need:
#include "accumux/accumulators/kbn_sum.hpp"
#include "accumux/accumulators/welford.hpp"
#include "accumux/core/composition.hpp"
CMake Integration
add_subdirectory(accumux)
target_link_libraries(your_target accumux::accumux)
Tutorial 1: Basic Accumulators
Numerically Stable Summation
The naive approach to summation accumulates floating-point error:
// BAD: rounding error accumulates as the data grows
double naive_sum = 0.0;
for (double x : data) {
naive_sum += x;
}
Use kbn_sum for numerically stable summation:
#include "accumux/accumulators/kbn_sum.hpp"
using namespace accumux;
// GOOD: lost low-order bits are tracked and added back, so the error stays small as the data grows
kbn_sum<double> sum;
for (double x : data) {
sum += x;
}
double result = sum.eval();
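To see why compensation helps, here is a minimal sketch of Neumaier's variant of Kahan summation, which the kbn prefix presumably refers to (Kahan-Babuška-Neumaier). The names neumaier_sum, naive_total, compensated_total, and demo_summation are hypothetical stand-ins written for illustration; this is not accumux's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for illustration; not accumux's kbn_sum.
struct neumaier_sum {
    double sum = 0.0;   // running total
    double comp = 0.0;  // low-order bits that rounding removed from `sum`

    neumaier_sum& operator+=(double x) {
        const double t = sum + x;
        // Recover the rounding error using whichever operand is larger in magnitude.
        if (std::fabs(sum) >= std::fabs(x)) {
            comp += (sum - t) + x;
        } else {
            comp += (x - t) + sum;
        }
        sum = t;
        return *this;
    }

    double eval() const { return sum + comp; }
};

double naive_total(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) s += x;
    return s;
}

double compensated_total(const std::vector<double>& v) {
    neumaier_sum s;
    for (double x : v) s += x;
    return s.eval();
}

// A classic cancellation case: the exact sum is 2.0.
inline void demo_summation() {
    const std::vector<double> v = {1.0, 1e100, 1.0, -1e100};
    assert(naive_total(v) == 0.0);        // both 1.0 terms are absorbed by 1e100 and lost
    assert(compensated_total(v) == 2.0);  // the compensation term recovers them
}
```

The example relies on default IEEE floating-point semantics; flags such as -ffast-math allow the compiler to optimize the compensation away.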
Computing Statistics with Welford
Welford’s algorithm computes mean and variance in a single pass:
#include <iostream>
#include <vector>
#include "accumux/accumulators/welford.hpp"
using namespace accumux;
welford_accumulator<double> stats;
std::vector<double> data = {1.0, 2.0, 3.0, 4.0, 5.0};
for (double x : data) {
stats += x;
}
std::cout << "Count: " << stats.count() << std::endl; // 5
std::cout << "Mean: " << stats.mean() << std::endl; // 3.0
std::cout << "Variance: " << stats.sample_variance() << std::endl; // 2.5
std::cout << "Std Dev: " << stats.sample_std_dev() << std::endl; // 1.58...
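For readers curious about what happens on each update, the following is a minimal sketch of Welford's recurrence. The struct welford_sketch and its member names are hypothetical and chosen for illustration; accumux's welford_accumulator may differ internally.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <initializer_list>

// Hypothetical stand-in for illustration; not accumux's welford_accumulator.
struct welford_sketch {
    std::size_t n = 0;  // number of values seen
    double mean = 0.0;  // running mean
    double m2 = 0.0;    // running sum of squared deviations from the mean

    welford_sketch& operator+=(double x) {
        ++n;
        const double delta = x - mean;  // deviation from the old mean
        mean += delta / static_cast<double>(n);
        m2 += delta * (x - mean);       // old deviation times new deviation
        return *this;
    }

    // Divides by n - 1; returns 0.0 as a placeholder when fewer than two values were seen.
    double sample_variance() const {
        return n > 1 ? m2 / static_cast<double>(n - 1) : 0.0;
    }
    double sample_std_dev() const { return std::sqrt(sample_variance()); }
};

// Reproduces the numbers from the tutorial's five-value data set.
inline void demo_welford() {
    welford_sketch s;
    for (double x : {1.0, 2.0, 3.0, 4.0, 5.0}) s += x;
    assert(s.n == 5);
    assert(std::fabs(s.mean - 3.0) < 1e-12);
    assert(std::fabs(s.sample_variance() - 2.5) < 1e-12);
}
```

Only three numbers are stored regardless of how many values arrive, which is what makes the single-pass computation possible.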
Tracking Min and Max
minmax_accumulator tracks the smallest and largest values seen so far:
#include "accumux/accumulators/minmax.hpp"
using namespace accumux;
minmax_accumulator<double> mm;
for (double x : data) {
mm += x;
}
std::cout << "Min: " << mm.min() << std::endl;
std::cout << "Max: " << mm.max() << std::endl;
Tutorial 2: Composition
The core idea of accumux is algebraic composition: accumulators combine with ordinary operators into a single accumulator that you feed once.
Parallel Composition with +
Process data through multiple accumulators simultaneously:
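#include <iostream>
#include <vector>
#include "accumux/accumulators/kbn_sum.hpp"
#include "accumux/accumulators/welford.hpp"
#include "accumux/core/composition.hpp"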
using namespace accumux;
// Create composed accumulator
auto stats = kbn_sum<double>() + welford_accumulator<double>();
// Both see every value
std::vector<double> data = {1.0, 2.0, 3.0, 4.0, 5.0};
for (double x : data) {
stats += x;
}
// Extract results from each component
auto sum_result = stats.get_first();
auto welford_result = stats.get_second();
std::cout << "Sum: " << sum_result.eval() << std::endl; // 15.0
std::cout << "Mean: " << welford_result.mean() << std::endl; // 3.0
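One way to picture parallel composition is as a pair that forwards every value to both members. The sketch below illustrates that idea with toy types; sum_acc, count_acc, parallel_sketch, and demo_parallel are hypothetical names, and the real accumux composition type may be built differently.

```cpp
#include <cassert>
#include <initializer_list>
#include <utility>

namespace sketch {

// Two toy accumulators, hypothetical and for illustration only.
struct sum_acc {
    double total = 0.0;
    sum_acc& operator+=(double x) { total += x; return *this; }
};
struct count_acc {
    long n = 0;
    count_acc& operator+=(double) { ++n; return *this; }
};

// A pair of accumulators that forwards each value to both members.
template <typename A, typename B>
class parallel_sketch {
public:
    parallel_sketch(A a, B b) : a_(std::move(a)), b_(std::move(b)) {}
    parallel_sketch& operator+=(double x) {
        a_ += x;
        b_ += x;
        return *this;
    }
    const A& get_first() const { return a_; }
    const B& get_second() const { return b_; }

private:
    A a_;
    B b_;
};

// operator+ on two accumulators builds the pair; it does not add any numbers.
// Keeping it inside the namespace limits it to types declared here.
template <typename A, typename B>
parallel_sketch<A, B> operator+(A a, B b) {
    return parallel_sketch<A, B>(std::move(a), std::move(b));
}

inline void demo_parallel() {
    auto stats = sum_acc{} + count_acc{};
    for (double x : {1.0, 2.0, 3.0, 4.0, 5.0}) stats += x;  // one pass feeds both
    assert(stats.get_first().total == 15.0);
    assert(stats.get_second().n == 5);
}

}  // namespace sketch
```

The data is traversed once, and each member keeps its own state independently of the other.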
Multi-Way Composition
Compose any number of accumulators:
auto comprehensive = kbn_sum<double>() +
welford_accumulator<double>() +
minmax_accumulator<double>();
for (double x : data) {
comprehensive += x;
}
// Navigate the nested structure: a + b + c groups as (a + b) + c
auto inner = comprehensive.get_first(); // kbn_sum + welford_accumulator
auto sum = inner.get_first().eval();
auto mean = inner.get_second().mean();
auto min_val = comprehensive.get_second().min();
auto max_val = comprehensive.get_second().max();
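Because C++ parses a + b + c as (a + b) + c, a pairwise operator+ yields a left-nested structure. The sketch below checks this at compile time using placeholder types. pair_of and the *_tag types are hypothetical, and the sketch assumes operator+ simply pairs its two operands, which may not match accumux's internals exactly.

```cpp
#include <type_traits>
#include <utility>

namespace nesting_sketch {

// Empty placeholders standing in for three accumulator types.
struct sum_tag {};
struct stats_tag {};
struct range_tag {};

template <typename A, typename B>
struct pair_of {
    A first;
    B second;
};

// Pairs its two operands, nothing more.
template <typename A, typename B>
pair_of<A, B> operator+(A a, B b) {
    return {std::move(a), std::move(b)};
}

// Without parentheses the first two operands share the inner pair.
using three_way = decltype(sum_tag{} + stats_tag{} + range_tag{});
static_assert(
    std::is_same<three_way, pair_of<pair_of<sum_tag, stats_tag>, range_tag>>::value,
    "left-nested: ((sum, stats), range)");

// Explicit parentheses give the right-nested shape instead.
using right_nested = decltype(sum_tag{} + (stats_tag{} + range_tag{}));
static_assert(
    std::is_same<right_nested, pair_of<sum_tag, pair_of<stats_tag, range_tag>>>::value,
    "right-nested: (sum, (stats, range))");

}  // namespace nesting_sketch
```

Under that assumption, the last operand of an unparenthesized chain is always reachable through the outermost second slot, and earlier operands sit deeper inside the first slot.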
Tutorial 3: Real-Time Analytics
Accumux excels at streaming scenarios where you need statistics at any moment.
Financial Data Processing
auto market_stats = kbn_sum<double>() +
welford_accumulator<double>() +
minmax_accumulator<double>();
// Simulate streaming market data
while (auto price = get_next_price()) {
market_stats += *price;
// Real-time metrics available at any time
if (should_report()) {
auto inner = market_stats.get_first(); // kbn_sum + welford_accumulator, since + groups left to right
auto welford = inner.get_second();
auto minmax = market_stats.get_second();
std::cout << "Total: " << inner.get_first().eval() << std::endl;
std::cout << "Mean: " << welford.mean() << std::endl;
std::cout << "Volatility: " << welford.sample_std_dev() << std::endl;
std::cout << "Range: [" << minmax.min() << ", " << minmax.max() << "]" << std::endl;
}
}
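The loop above calls get_next_price() and should_report(), which the tutorial leaves undefined. A minimal sketch of what such helpers might look like follows, assuming the feed returns std::optional<double> and reporting happens every few values. price_feed, report_every, and demo_stream are hypothetical names, and a plain double stands in for the composed accumulator so that the example compiles without accumux.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical price feed: yields each stored price once, then std::nullopt.
class price_feed {
public:
    explicit price_feed(std::vector<double> prices) : prices_(std::move(prices)) {}
    std::optional<double> next() {
        if (pos_ == prices_.size()) return std::nullopt;
        return prices_[pos_++];
    }

private:
    std::vector<double> prices_;
    std::size_t pos_ = 0;
};

// Hypothetical reporting policy: returns true on every `interval`-th call.
class report_every {
public:
    explicit report_every(std::size_t interval) : interval_(interval) {}
    bool operator()() { return ++seen_ % interval_ == 0; }

private:
    std::size_t interval_;
    std::size_t seen_ = 0;
};

// Runs the same loop shape as the tutorial and returns how many reports fired.
inline std::size_t demo_stream() {
    price_feed feed({101.5, 102.0, 100.75, 103.25, 102.5});
    report_every should_report(2);
    double total = 0.0;  // stand-in for the composed accumulator
    std::size_t reports = 0;
    while (auto price = feed.next()) {  // loop ends when the feed is exhausted
        total += *price;
        if (should_report()) ++reports;
    }
    assert(total == 510.0);
    return reports;
}
```

Returning std::optional rather than a sentinel value means a legitimate price of 0.0 does not end the loop, since the condition tests whether a value is present, not whether it is nonzero.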
Sensor Data Analysis
auto experiment = welford_accumulator<double>() + count_accumulator();
for (const auto& measurement : sensor_data) {
experiment += measurement;
// Check for anomalies in real-time
auto stats = experiment.get_first();
if (stats.count() > 10 && stats.sample_std_dev() > threshold) {
std::cout << "High variability detected!" << std::endl;
}
}
Tutorial 4: Combining Accumulators
Accumulators that processed different parts of the data can be merged with +=, which makes it straightforward to split work across threads:
// Process data in parallel threads
kbn_sum<double> sum1, sum2;
// Thread 1
for (size_t i = 0; i < n/2; ++i) {
sum1 += data[i];
}
// Thread 2
for (size_t i = n/2; i < n; ++i) {
sum2 += data[i];
}
// Combine results
sum1 += sum2;
double total = sum1.eval();
This also works for composed accumulators:
auto stats1 = kbn_sum<double>() + welford_accumulator<double>();
auto stats2 = kbn_sum<double>() + welford_accumulator<double>();
// Process different data chunks
for (size_t i = 0; i < n/2; ++i) stats1 += data[i];
for (size_t i = n/2; i < n; ++i) stats2 += data[i];
// Combine
stats1 += stats2;
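Merging works because each accumulator's state summarizes its chunk completely. As an illustration, the sketch below merges two Welford-style partial results using the pairwise update of Chan, Golub, and LeVeque. welford_partial and demo_merge are hypothetical names; whether accumux uses this exact formula is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <initializer_list>

// Hypothetical partial result for illustration; not accumux's type.
struct welford_partial {
    std::size_t n = 0;  // number of values seen
    double mean = 0.0;  // running mean
    double m2 = 0.0;    // sum of squared deviations from the mean

    // Add a single value.
    welford_partial& operator+=(double x) {
        ++n;
        const double delta = x - mean;
        mean += delta / static_cast<double>(n);
        m2 += delta * (x - mean);
        return *this;
    }

    // Merge another partial result computed over a disjoint chunk.
    welford_partial& operator+=(const welford_partial& o) {
        if (o.n == 0) return *this;
        if (n == 0) { *this = o; return *this; }
        const double na = static_cast<double>(n);
        const double nb = static_cast<double>(o.n);
        const double delta = o.mean - mean;
        // Within-chunk spread plus a term for the gap between the two chunk means.
        m2 += o.m2 + delta * delta * na * nb / (na + nb);
        mean += delta * nb / (na + nb);
        n += o.n;
        return *this;
    }

    double sample_variance() const {
        return n > 1 ? m2 / static_cast<double>(n - 1) : 0.0;
    }
};

// Two chunks merged give the same result as one pass over all five values.
inline void demo_merge() {
    welford_partial left, right;
    for (double x : {1.0, 2.0}) left += x;
    for (double x : {3.0, 4.0, 5.0}) right += x;
    left += right;
    assert(left.n == 5);
    assert(std::fabs(left.mean - 3.0) < 1e-12);
    assert(std::fabs(left.sample_variance() - 2.5) < 1e-12);
}
```

The chunks do not need to be the same size, and merging an empty partial result leaves the other side unchanged.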
Tutorial 5: Building and Testing
Build with CMake
git clone https://github.com/queelius/accumux.git
cd accumux
mkdir build && cd build
cmake -DACCUMUX_BUILD_TESTS=ON ..
cmake --build . --parallel 4
Run Tests
ctest
The test suite includes 183 tests covering core accumulators, compositions, and numerical accuracy.
Next Steps
- See Examples for complete working programs
- Read the API Documentation for full reference
- Check the Research Paper for mathematical foundations