Tutorials

Step-by-step guides for using accumux.

Getting Started with Accumux

This tutorial walks you through using accumux for streaming statistical computations with algebraic composition.

Prerequisites

  • C++20 compatible compiler (GCC 10+, Clang 10+, MSVC 19.29+)
  • CMake 3.14+ (for building tests)
  • No external dependencies (header-only library)

Installation

Header-Only Integration

Copy the include/accumux/ directory into your project, then include the headers you need:

#include "accumux/accumulators/kbn_sum.hpp"
#include "accumux/accumulators/welford.hpp"
#include "accumux/core/composition.hpp"

CMake Integration

add_subdirectory(accumux)
target_link_libraries(your_target accumux::accumux)

Tutorial 1: Basic Accumulators

Numerically Stable Summation

The naive approach to summation accumulates floating-point error:

// BAD: Error grows with data size
double naive_sum = 0;
for (double x : data) {
    naive_sum += x;
}

Use kbn_sum for numerically stable summation:

#include "accumux/accumulators/kbn_sum.hpp"
using namespace accumux;

// GOOD: compensated summation keeps rounding error small even for large inputs
kbn_sum<double> sum;
for (double x : data) {
    sum += x;
}
double result = sum.eval();
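To make the difference concrete, here is a minimal, self-contained sketch of compensated summation using Neumaier's variant of Kahan summation. It is an illustrative stand-in, not accumux's kbn_sum: the function names neumaier_sum and naive_sum are assumptions made for this example, and the library's internals may differ.

```cpp
#include <cmath>
#include <vector>

// Illustrative stand-in, NOT accumux's kbn_sum: Neumaier's variant of
// Kahan summation, which carries a separate compensation term.
double neumaier_sum(const std::vector<double>& values) {
    double sum = 0.0;
    double compensation = 0.0;  // collects low-order bits lost from sum
    for (double x : values) {
        const double t = sum + x;
        if (std::fabs(sum) >= std::fabs(x)) {
            compensation += (sum - t) + x;  // low-order bits of x were lost
        } else {
            compensation += (x - t) + sum;  // low-order bits of sum were lost
        }
        sum = t;
    }
    return sum + compensation;  // apply the correction once at the end
}

// Plain left-to-right summation, for comparison.
double naive_sum(const std::vector<double>& values) {
    double sum = 0.0;
    for (double x : values) {
        sum += x;
    }
    return sum;
}
```

For the input {1.0, 1e100, 1.0, -1e100}, naive_sum returns 0.0 because both 1.0 terms are absorbed by 1e100, while neumaier_sum returns the exact answer 2.0.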

Computing Statistics with Welford

Welford’s algorithm computes mean and variance in a single pass:

#include <iostream>
#include <vector>
#include "accumux/accumulators/welford.hpp"
using namespace accumux;

welford_accumulator<double> stats;

std::vector<double> data = {1.0, 2.0, 3.0, 4.0, 5.0};
for (double x : data) {
    stats += x;
}

std::cout << "Count: " << stats.count() << std::endl;           // 5
std::cout << "Mean: " << stats.mean() << std::endl;             // 3.0
std::cout << "Variance: " << stats.sample_variance() << std::endl;  // 2.5
std::cout << "Std Dev: " << stats.sample_std_dev() << std::endl;    // 1.58...
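For readers curious about what happens on each update, the single-pass recurrence can be sketched as follows. This is a hand-written illustration under the assumption of the textbook Welford update; the type running_stats and its member names are hypothetical and are not part of accumux.

```cpp
#include <cmath>
#include <cstddef>

// Illustrative stand-in, NOT accumux's welford_accumulator.
struct running_stats {
    std::size_t n = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the current mean

    void push(double x) {
        ++n;
        const double delta = x - mean;           // distance from the old mean
        mean += delta / static_cast<double>(n);  // move the mean toward x
        m2 += delta * (x - mean);                // second factor uses the new mean
    }

    // Sample variance needs at least two values; returning 0 otherwise
    // is a choice made for this sketch.
    double sample_variance() const {
        return n > 1 ? m2 / static_cast<double>(n - 1) : 0.0;
    }

    double sample_std_dev() const { return std::sqrt(sample_variance()); }
};
```

Feeding 1.0 through 5.0 into push() yields a mean of 3.0 and a sample variance of 2.5, matching the values shown above, without ever storing the data.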

Tracking Min and Max

#include "accumux/accumulators/minmax.hpp"
using namespace accumux;

minmax_accumulator<double> mm;

for (double x : data) {
    mm += x;
}

std::cout << "Min: " << mm.min() << std::endl;
std::cout << "Max: " << mm.max() << std::endl;

Tutorial 2: Composition

The core idea of accumux is algebraic composition: accumulators combine with operators such as + into a single accumulator that you update and query like any other.

Parallel Composition with +

Process data through multiple accumulators simultaneously:

#include "accumux/accumulators/kbn_sum.hpp"
#include "accumux/accumulators/welford.hpp"
#include "accumux/core/composition.hpp"
using namespace accumux;

// Create composed accumulator
auto stats = kbn_sum<double>() + welford_accumulator<double>();

// Both see every value
std::vector<double> data = {1.0, 2.0, 3.0, 4.0, 5.0};
for (double x : data) {
    stats += x;
}

// Extract results from each component
auto sum_result = stats.get_first();
auto welford_result = stats.get_second();

std::cout << "Sum: " << sum_result.eval() << std::endl;     // 15.0
std::cout << "Mean: " << welford_result.mean() << std::endl;  // 3.0

Multi-Way Composition

Compose any number of accumulators:

auto comprehensive = kbn_sum<double>() +
                     welford_accumulator<double>() +
                     minmax_accumulator<double>();

for (double x : data) {
    comprehensive += x;
}

// Navigate the nested structure. C++ evaluates a + b + c as (a + b) + c,
// so this assumes the result nests as (sum + welford) + minmax.
auto sum = comprehensive.get_first().get_first().eval();
auto mean = comprehensive.get_first().get_second().mean();
auto min_val = comprehensive.get_second().min();
auto max_val = comprehensive.get_second().max();
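The nesting is easier to reason about with a toy model of parallel composition. The sketch below is an assumption-laden illustration, not accumux's implementation: toy_sum, toy_count, toy_max, toy_pair and toy_accumulator_tag are all hypothetical names. It shows one plausible way a binary operator+ could forward each value to both operands, and why chaining three accumulators produces a pair whose first element is itself a pair.

```cpp
#include <cstddef>
#include <limits>
#include <type_traits>
#include <utility>

// Hypothetical tag that restricts operator+ to the toy accumulators below.
struct toy_accumulator_tag {};

struct toy_sum : toy_accumulator_tag {
    double total = 0.0;
    toy_sum& operator+=(double x) { total += x; return *this; }
    double eval() const { return total; }
};

struct toy_count : toy_accumulator_tag {
    std::size_t n = 0;
    toy_count& operator+=(double) { ++n; return *this; }
    std::size_t eval() const { return n; }
};

struct toy_max : toy_accumulator_tag {
    double best = -std::numeric_limits<double>::infinity();
    toy_max& operator+=(double x) { if (x > best) best = x; return *this; }
    double eval() const { return best; }
};

// Parallel composition: every value is forwarded to both components.
template <typename A, typename B>
struct toy_pair : toy_accumulator_tag {
    A first;
    B second;
    toy_pair(A a, B b) : first(std::move(a)), second(std::move(b)) {}
    toy_pair& operator+=(double x) { first += x; second += x; return *this; }
    const A& get_first() const { return first; }
    const B& get_second() const { return second; }
};

// a + b builds a pair; a + b + c therefore builds pair<pair<a, b>, c>.
template <typename A, typename B,
          typename = std::enable_if_t<
              std::is_base_of<toy_accumulator_tag, A>::value &&
              std::is_base_of<toy_accumulator_tag, B>::value>>
toy_pair<A, B> operator+(A a, B b) {
    return toy_pair<A, B>(std::move(a), std::move(b));
}
```

With auto acc = toy_sum{} + toy_count{} + toy_max{}, the sum lives at acc.get_first().get_first(), the count at acc.get_first().get_second(), and the maximum at acc.get_second().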

Tutorial 3: Real-Time Analytics

Accumux excels at streaming scenarios where you need statistics at any moment.

Financial Data Processing

auto market_stats = kbn_sum<double>() +
                    welford_accumulator<double>() +
                    minmax_accumulator<double>();

// Simulate streaming market data
while (auto price = get_next_price()) {
    market_stats += *price;

    // Real-time metrics available at any time
    if (should_report()) {
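        // Assumes left-to-right nesting: (sum + welford) + minmax
        auto sum_and_welford = market_stats.get_first();
        auto welford = sum_and_welford.get_second();
        auto minmax = market_stats.get_second();

        std::cout << "Total: " << sum_and_welford.get_first().eval() << std::endl;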
        std::cout << "Mean: " << welford.mean() << std::endl;
        std::cout << "Volatility: " << welford.sample_std_dev() << std::endl;
        std::cout << "Range: [" << minmax.min() << ", " << minmax.max() << "]" << std::endl;
    }
}

Sensor Data Analysis

auto experiment = welford_accumulator<double>() + count_accumulator();

for (const auto& measurement : sensor_data) {
    experiment += measurement;

    // Check for anomalies in real-time
    const auto& stats = experiment.get_first();  // avoid copying on every sample
    if (stats.count() > 10 && stats.sample_std_dev() > threshold) {
        std::cout << "High variability detected!" << std::endl;
    }
}

Tutorial 4: Combining Accumulators

Accumulators that processed different chunks of the data can be merged with +=, which makes it straightforward to split work across threads:

// Each half can run on its own thread; the loops are shown sequentially for brevity
kbn_sum<double> sum1, sum2;

// Thread 1
for (size_t i = 0; i < n/2; ++i) {
    sum1 += data[i];
}

// Thread 2
for (size_t i = n/2; i < n; ++i) {
    sum2 += data[i];
}

// Combine results
sum1 += sum2;
double total = sum1.eval();

This also works for composed accumulators:

auto stats1 = kbn_sum<double>() + welford_accumulator<double>();
auto stats2 = kbn_sum<double>() + welford_accumulator<double>();

// Process different data chunks
for (size_t i = 0; i < n/2; ++i) stats1 += data[i];
for (size_t i = n/2; i < n; ++i) stats2 += data[i];

// Combine
stats1 += stats2;

Tutorial 5: Building and Testing

Build with CMake

git clone https://github.com/queelius/accumux.git
cd accumux
mkdir build && cd build
cmake -DACCUMUX_BUILD_TESTS=ON ..
make -j4

Run Tests

ctest

The test suite includes 183 tests covering core accumulators, compositions, and numerical accuracy.

Next Steps