DreamLog: Logic Programming That Dreams to Improve Itself

DreamLog is a revolutionary logic programming system that learns continuously by alternating between wake and sleep phases—just like biological brains. During wake, it uses LLMs to generate missing knowledge; during sleep, it compresses what it knows into more general principles. This isn’t logic programming with LLMs bolted on—it’s a fundamentally new paradigm where knowledge representation evolves through use.

The Core Insight: Compression IS Learning

DreamLog operationalizes a profound insight from algorithmic information theory:

$$\text{Learning} \equiv \text{Compression}$$

This is Solomonoff induction—the mathematical formalization of Occam’s razor. The system that explains your data with the shortest program is the one most likely to generalize.

Formal statement: Given observations $D$, the probability of hypothesis $h$ is:

$$P(h|D) \propto 2^{-K(h)} \cdot \mathbb{1}[h \text{ explains } D]$$

where $K(h)$ is the Kolmogorov complexity of $h$.
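
As a toy illustration (not DreamLog code): $K(h)$ is uncomputable, so we can stand in the length of the hypothesis's encoding and weight hypotheses accordingly:

def posterior_weight(hypothesis: str, explains_data: bool) -> float:
    """Toy stand-in for P(h|D): 2^-K(h), with K(h) approximated by
    encoded length, since true Kolmogorov complexity is uncomputable."""
    return 2.0 ** -len(hypothesis) if explains_data else 0.0

# Shorter hypotheses that explain the data get exponentially more weight:
h1 = "(animal X) :- (mammal X)"
h2 = "(animal fido) (animal rex) (animal whiskers)"
assert posterior_weight(h1, True) > posterior_weight(h2, True)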

For logic programming, hypotheses are sets of facts and rules. DreamLog’s sleep phase searches for minimal representations that preserve deductive closure:

$$\text{minimize } |KB'| \quad \text{subject to} \quad \text{Closure}(KB') = \text{Closure}(KB)$$

In plain terms: Find the shortest knowledge base that still derives all the same facts.
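
A minimal sketch of this objective as a greedy search, assuming hypothetical helpers closure(kb) (the set of derivable ground facts) and kb.without(clause), which are not part of DreamLog's public API:

def compress_greedily(kb):
    # Drop any clause whose removal leaves the deductive closure unchanged.
    # A crude local search toward the MDL objective above; `closure` and
    # `without` are assumed helpers, not DreamLog's actual API.
    target = closure(kb)
    for clause in list(kb.clauses):
        candidate = kb.without(clause)
        if closure(candidate) == target:
            kb = candidate  # same derivable facts, shorter program
    return kb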

The Wake-Sleep Architecture

Wake Phase: Exploitation Through Generation

During wake, DreamLog operates as a traditional logic programming engine enhanced with LLM-based knowledge generation:

from dreamlog.pythonic import dreamlog

# Create knowledge base with LLM integration
kb = dreamlog(llm_provider="openai")

# Add some facts
kb.fact("parent", "john", "mary")
kb.fact("parent", "mary", "alice")

# Add a rule
kb.rule("grandparent", ["X", "Z"]) \
  .when("parent", ["X", "Y"]) \
  .and_("parent", ["Y", "Z"])

# Query
for result in kb.query("grandparent", "X", "alice"):
    print(f"{result.bindings['X']} is Alice's grandparent")  # john

The magic happens with undefined predicates:

# Query a predicate we never defined!
for result in kb.query("sibling", "X", "Y"):
    # LLM generates knowledge about siblings on-the-fly
    print(result)

When the evaluator encounters an undefined predicate, it triggers the LLM hook:

def on_undefined(term, evaluator):
    # Extract relevant context
    context = extract_relevant_knowledge(evaluator.kb, term)

    # Construct prompt
    prompt = construct_prompt(term, context)

    # Generate knowledge
    response = llm.generate(prompt)
    new_knowledge = parse_response(response)

    # Add to knowledge base
    evaluator.kb.add(new_knowledge)

Key properties:

  • Context-aware generation based on existing facts
  • Caching to avoid redundant generation
  • Retry logic with exponential backoff (both sketched after this list)
  • Multiple provider support (OpenAI, Anthropic, Ollama)
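
The caching and retry behavior might look like the following sketch; the actual logic lives in DreamLog's provider layer, and llm here is a stand-in for the configured provider:

import time
from functools import lru_cache

@lru_cache(maxsize=1024)
def generate_with_retry(prompt: str) -> str:
    # Sketch of a cached, retrying LLM call; `llm` is a stand-in for
    # whichever provider is configured, not DreamLog's real API.
    delay = 1.0
    for attempt in range(5):
        try:
            return llm.generate(prompt)
        except Exception:
            if attempt == 4:
                raise
            time.sleep(delay)
            delay *= 2.0  # exponential backoff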

Sleep Phase: Exploration Through Compression

During sleep, DreamLog reorganizes knowledge through compression operators:

1. Anti-Unification

Find general patterns from specific instances:

Before:
  (parent john mary)
  (parent bob alice)
  (parent jane charlie)

After:
  (parent X Y) :- (biological_parent X Y)
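
For intuition, anti-unifying any two of the facts above yields the shared template (parent X Y), which the sleep phase can then promote into a rule. A minimal sketch, assuming terms are nested tuples (not necessarily DreamLog's internal representation):

def anti_unify(t1, t2, seen=None):
    # Least general generalization: keep the structure the two terms share,
    # replace mismatched subterms with consistently named fresh variables.
    seen = seen if seen is not None else {}
    if t1 == t2:
        return t1
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and len(t1) == len(t2) and t1[0] == t2[0]):
        return (t1[0],) + tuple(anti_unify(a, b, seen)
                                for a, b in zip(t1[1:], t2[1:]))
    if (t1, t2) not in seen:  # reuse one variable per repeated mismatch pair
        seen[(t1, t2)] = f"X{len(seen)}"
    return seen[(t1, t2)]

print(anti_unify(("parent", "john", "mary"), ("parent", "bob", "alice")))
# -> ('parent', 'X0', 'X1')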

2. Predicate Invention

Discover intermediate concepts that simplify multiple rules:

Before:
  (grandparent X Z) :- (parent X Y), (parent Y Z)
  (great_grandparent X W) :- (parent X Y), (parent Y Z), (parent Z W)

After:
  (ancestor X Y 1) :- (parent X Y)
  (ancestor X Z N) :- (parent X Y), (ancestor Y Z M), (succ M N)
  (grandparent X Y) :- (ancestor X Y 2)
  (great_grandparent X W) :- (ancestor X W 3)

Why this matters: The new ancestor predicate captures the recursive structure that was hidden in the original rules. This is true abstraction—discovering compositional primitives.

3. Subsumption Elimination

Remove specific rules subsumed by general ones:

Before:
  (animal X) :- (mammal X)
  (animal X) :- (dog X)  ← Redundant given (mammal X) :- (dog X): dogs are already covered

After:
  (animal X) :- (mammal X)
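
A minimal θ-subsumption check for single atoms, under the same assumed tuple representation as above (uppercase strings as variables); detecting rule-level redundancy like the example would apply this clause-wise:

def subsumes(general, specific, subst=None):
    # Return a substitution mapping `general` onto `specific`, or None.
    # E.g. ("animal", "X") subsumes ("animal", "fido") via {"X": "fido"}.
    subst = dict(subst) if subst is not None else {}
    if isinstance(general, str) and general[:1].isupper():  # a variable
        if general in subst and subst[general] != specific:
            return None  # variable already bound to something else
        subst[general] = specific
        return subst
    if isinstance(general, tuple) and isinstance(specific, tuple) \
            and len(general) == len(specific):
        for g, s in zip(general, specific):
            subst = subsumes(g, s, subst)
            if subst is None:
                return None
        return subst
    return subst if general == specific else None

assert subsumes(("animal", "X"), ("animal", "fido")) == {"X": "fido"}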

4. Sleep Cycle API

from dreamlog.kb_dreamer import KnowledgeBaseDreamer

# Initialize dreamer
dreamer = KnowledgeBaseDreamer(kb.provider)

# Execute sleep cycle
session = dreamer.dream(
    kb,
    dream_cycles=3,            # Multiple REM cycles
    exploration_samples=10,     # Try different optimizations
    verify=True                # Ensure behavior preservation
)

# Results
print(f"Compression: {session.compression_ratio:.1%}")         # 23.5%
print(f"Generalization: {session.generalization_score:.2f}")   # 0.87
print(f"Rules discovered: {len(session.new_predicates)}")      # 3

The Dual Knowledge Base System

DreamLog maintains two separate knowledge bases to distinguish learned from authoritative knowledge:

from dreamlog.persistent_learning import PersistentKnowledgeBase

kb = PersistentKnowledgeBase(storage_path)

# KB_1: Learned knowledge (from LLM)
kb.add_learned_knowledge(facts, rules)

# KB_2: User knowledge (ground truth)
conflicts = kb.add_user_knowledge(user_facts, user_rules)

# Resolve conflicts (user knowledge takes precedence)
kb.resolve_conflicts(UserTrustStrategy())

# Query combined knowledge
solutions = kb.query_with_tracking(goals)

Why dual KBs?

  1. Conflict detection: Automatically detect when LLM-generated knowledge contradicts user assertions
  2. Trust hierarchy: User knowledge always wins in conflicts
  3. Isolation: Can inspect/modify learned knowledge without affecting ground truth
  4. Rollback: Can revert learned knowledge while preserving user assertions

Conflict Resolution Strategies

from dreamlog.persistent_learning import (
    UserTrustStrategy,      # User always wins
    ConservativeStrategy,   # Keep both, mark conflicts
    MajorityVoteStrategy    # Consensus from multiple LLMs
)

# Example: Conservative approach
kb.resolve_conflicts(ConservativeStrategy())
# Result: Conflicting facts coexist with metadata marking the conflict

Validation and Verification

The sleep phase maintains semantic equivalence through rigorous verification:

1. Deductive Closure Preservation

from dreamlog.knowledge_validator import KnowledgeValidator

validator = KnowledgeValidator()

# Test that compressed KB derives all original facts
validator.add_test(ConsistencyTest(original_facts))

# Validate
report = validator.validate(compressed_kb)
print(f"Success rate: {report.success_rate:.1f}%")

2. Query Equivalence Testing

# Generate test queries
test_queries = SampleQueryGenerator(kb).generate(n=100)

# Verify both KBs produce identical results
for query in test_queries:
    original_results = original_kb.query(query)
    compressed_results = compressed_kb.query(query)
    assert set(original_results) == set(compressed_results)

3. Incremental Verification

Each transformation is checked independently:

# Before applying transformation
snapshot = kb.snapshot()

# Apply compression operator
transform(kb)

# Verify preservation
if not verify_equivalence(snapshot, kb):
    kb.restore(snapshot)  # Rollback

Sleep Cycle Scheduling

Different sleep phases run at different intervals:

from dreamlog.sleep_cycle import SleepCycleManager, SleepPhase
from datetime import timedelta

manager = SleepCycleManager(
    kb,
    light_sleep_interval=timedelta(minutes=30),   # Basic cleanup
    deep_sleep_interval=timedelta(hours=4),       # Rule generalization
    rem_sleep_interval=timedelta(hours=8),        # Creative hypothesis generation
    min_idle_time=timedelta(minutes=5)            # Must be idle before sleeping
)

# Start background sleep cycles
manager.start_background_sleep()

# Or force immediate cycle
report = manager.force_sleep_cycle(SleepPhase.DEEP_SLEEP)

Sleep phases:

  • Light sleep: Duplicate removal, basic optimization
  • Deep sleep: Rule generalization, subsumption elimination
  • REM sleep: Creative hypothesis generation (future enhancement)

Background Learning Service

For long-running applications, DreamLog provides a background service:

from dreamlog.background_learner import BackgroundLearner, BackgroundLearnerClient

# Server side
service = BackgroundLearner(storage_path, ipc_port=7777)
service.start()

# Client side
client = BackgroundLearnerClient(port=7777)

# Add knowledge remotely
result = client.add_user_knowledge(facts, rules)

# Query remotely
solutions = client.query(goals)

# Control sleep cycles remotely
report = client.force_sleep_cycle(SleepPhase.LIGHT_SLEEP)

# Get metrics
status = client.get_status()
print(f"Total facts: {status.total_facts}")
print(f"Compression ratio: {status.compression_ratio:.1%}")

Features:

  • TCP-based inter-process communication
  • Multiple concurrent clients
  • Real-time status and metrics
  • Remote sleep cycle control
  • Automatic session management

The High-Level Learning API

For production use, DreamLog provides a safe, high-level API:

from dreamlog.learning_api import LearningAPI, InjectionMode, LearningMode

api = LearningAPI(
    storage_path,
    learning_mode=LearningMode.ACTIVE,          # ACTIVE, PASSIVE, VALIDATION, EXPERIMENTAL
    injection_mode=InjectionMode.PERMISSIVE,    # STRICT, PERMISSIVE, INTERACTIVE, BATCH
    use_background_service=True
)

# Inject knowledge from strings
result = api.inject_knowledge_from_strings(
    fact_strings=["parent(john, mary)", "parent(mary, alice)"],
    rule_strings=["grandparent(X,Z) :- parent(X,Y), parent(Y,Z)"]
)

# Handle conflicts
if result.conflicts:
    print(f"Found {len(result.conflicts)} conflicts")
    for conflict in result.conflicts:
        print(f"  {conflict.description}")

# Query
solutions = api.query_knowledge("grandparent(john, X)")

# Monitor system
metrics = api.get_system_metrics()
print(f"Knowledge base size: {metrics.kb_size}")
print(f"Queries per second: {metrics.query_throughput:.1f}")

Injection modes:

  • STRICT: Reject on any conflict or validation failure
  • PERMISSIVE: Auto-resolve conflicts, continue on warnings
  • INTERACTIVE: User-guided conflict resolution
  • BATCH: Accumulate changes, resolve at end

Convergence Properties

Theorem: Under mild assumptions, the sleep phase compression converges to a local minimum in description length.

Proof sketch: The compression operators (anti-unification, subsumption elimination, predicate invention) never increase description length while preserving deductive closure. Since description length is bounded below by the Kolmogorov complexity of the knowledge base, this monotone sequence must converge.

Mathematical property:

Let $C_n$ be the description length of the knowledge base after compression cycle $n$:

$$C_0 > C_1 > C_2 > \cdots > C_\infty \geq K(KB)$$

where $K(KB)$ is the Kolmogorov complexity of the knowledge base.

Theoretical Foundations

LLMs as Approximate Oracles

We model the LLM as an approximate oracle:

$$\mathcal{O}: \mathcal{T} \rightarrow \mathcal{P}(\mathcal{KB})$$

that maps terms to probability distributions over knowledge base fragments.

Key insight: LLMs approximate the universal prior of Solomonoff induction, but with a bias inherited from human-generated text rather than from pure algorithmic simplicity.

Exploration-Exploitation Trade-off

Wake-sleep cycles implement the fundamental exploration-exploitation dilemma:

| Phase | Objective | Strategy |
|-------|-----------|----------|
| Wake  | Maximize query answering | Exploit existing knowledge + generate as needed |
| Sleep | Minimize description length | Explore alternative representations |

This is analogous to:

  • Reinforcement learning: Exploration vs. exploitation
  • Simulated annealing: Temperature-based optimization
  • Evolutionary algorithms: Mutation vs. selection

Performance Analysis

Compression Ratios

Initial experiments show:

  • 20-40% compression without semantic loss
  • Greater compression possible with bounded semantic drift
  • Diminishing returns after multiple cycles (convergence)

LLM Knowledge Quality

Analysis of generated knowledge:

  • >85% accuracy for common-sense predicates
  • Degraded performance for specialized domains
  • Interesting hallucinations occasionally lead to useful abstractions

Wake-Sleep Cycle Timing

Optimal cycle timing follows a power law:

  • Frequent short cycles for rapidly changing domains
  • Longer cycles for stable knowledge bases
  • Adaptive scheduling based on knowledge base entropy (sketched below)
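
A sketch of the entropy-based idea, as a hypothetical heuristic rather than DreamLog's shipped scheduler:

import math
from collections import Counter
from datetime import timedelta

def adaptive_sleep_interval(facts, base=timedelta(hours=4)):
    # Hypothetical heuristic: a KB whose facts are spread evenly over many
    # predicates (high entropy) is treated as less organized, so we shorten
    # the interval between sleep cycles. `facts` are (predicate, *args) tuples.
    counts = Counter(fact[0] for fact in facts)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts)) or 1.0  # avoid division by zero
    return base * (1.0 - 0.5 * entropy / max_entropy)  # up to 2x more frequent

# A KB dominated by one predicate sleeps less often than a scattered one:
focused = [("parent", "a", "b")] * 9 + [("age", "a", "40")]
scattered = [(p, "x") for p in "abcdefghij"]
assert adaptive_sleep_interval(focused) > adaptive_sleep_interval(scattered)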

Real-World Example: Family Relationships

from dreamlog.pythonic import dreamlog
from dreamlog.kb_dreamer import KnowledgeBaseDreamer

# Create KB with LLM integration
kb = dreamlog(llm_provider="openai")

# Add minimal facts
kb.parse("""
(parent john mary)
(parent mary alice)
(parent john bob)
(parent bob charlie)
(parent jane mary)
(parent jane bob)
""")

# Query undefined predicates - LLM fills in knowledge
print("Siblings (LLM-generated):")
for result in kb.query("sibling", "X", "Y"):
    print(f"  {result.bindings['X']} and {result.bindings['Y']}")

print("\nUncles (LLM-generated):")
for result in kb.query("uncle", "X", "Y"):
    print(f"  {result.bindings['X']} is uncle of {result.bindings['Y']}")

# Now run sleep cycle to compress
dreamer = KnowledgeBaseDreamer(kb.provider)
session = dreamer.dream(kb, dream_cycles=3, verify=True)

print(f"\nCompression achieved: {session.compression_ratio:.1%}")
print(f"New predicates discovered: {session.new_predicates}")

# Example discovered predicate
# (ancestor X Y 1) :- (parent X Y)
# (ancestor X Z N) :- (parent X Y), (ancestor Y Z M), (succ M N)
# (uncle X Y) :- (sibling X Z), (parent Z Y), (male X)
Comparison with Related Systems

| System | Approach | Continuous Learning | LLM Integration | Compression |
|--------|----------|---------------------|-----------------|-------------|
| DreamLog | Wake-sleep cycles | ✓ | ✓ | ✓ |
| Prolog | Traditional logic | ✗ | ✗ | ✗ |
| Datalog | Bottom-up evaluation | ✗ | ✗ | ✗ |
| ILP (FOIL, Progol) | Batch learning | Limited | ✗ | ✗ |
| Neural Theorem Provers | End-to-end neural | ✗ | ✗ | ✗ |
| ∂ILP | Differentiable ILP | Limited | ✗ | ✗ |
| DreamCoder | Program synthesis | ✓ | ✗ | ✓ |

DreamLog’s niche: Wake-sleep architecture + LLM integration + compression-based learning + continuous improvement.

Design Philosophy

🧠 Biological Inspiration: Memory consolidation during sleep isn't a metaphor; it's an architectural principle

📐 Mathematical Rigor: Grounded in algorithmic information theory (Solomonoff, Kolmogorov)

🔄 Continuous Learning: Knowledge evolves through use, not just batch training

🎯 Compositional Abstraction: Discover primitives that compose naturally

🛡️ Safe by Default: Verification ensures compression preserves behavior

🧩 Interpretable: Every rule and fact is inspectable, unlike black-box neural systems

Future Directions

1. Probabilistic Logic Programming

Extend to handle uncertainty explicitly:

$$P(\text{bird}(X) \rightarrow \text{flies}(X)) = 0.95$$

2. Multi-Modal Knowledge

Incorporate visual and auditory predicates:

kb.fact("visual_features", "cat_image_1", [fur_texture, whiskers, pointed_ears])
kb.query("animal_type", "X") # Uses visual features

3. Adversarial Dreaming

Use GANs to generate challenging test cases during sleep:

# Generate edge cases that stress existing rules
adversarial_cases = dream_adversarially(kb)

4. Federated Learning

Multiple DreamLog instances share compressed knowledge:

# Node 1 learns about animals
# Node 2 learns about vehicles
# Exchange compressed abstractions
federated_kb = merge_knowledge_bases([node1, node2])

Philosophical Implications

DreamLog raises fundamental questions:

Is All Learning Compression?

If intelligence is compression of experience into principles, then sleep isn’t waste—it’s when learning actually happens.

Can Symbols Emerge from Neural Substrates?

If compression discovers symbols (like ancestor from parent chains), do symbols emerge naturally from statistical patterns?

The Role of Sleep in Cognition

Does biological sleep serve a similar compression function? DreamLog suggests sleep isn’t just cleanup—it’s creative reorganization.

Quick Start

# Install from PyPI
pip install dreamlog

# Or from source
git clone https://github.com/queelius/dreamlog.git
cd dreamlog
pip install -e .

Basic usage:

from dreamlog.pythonic import dreamlog

# Create KB
kb = dreamlog(llm_provider="openai")

# Add facts using S-expressions
kb.parse("""
(parent john mary)
(parent mary alice)
""")

# Add rules
kb.parse("""
(grandparent X Z) :- (parent X Y), (parent Y Z)
""")

# Query
for result in kb.query("grandparent", "john", "X"):
    print(result.bindings['X'])  # alice

With sleep cycles:

from dreamlog.kb_dreamer import KnowledgeBaseDreamer

dreamer = KnowledgeBaseDreamer(kb.provider)
session = dreamer.dream(kb, dream_cycles=3, verify=True)

print(f"Compression: {session.compression_ratio:.1%}")

License

MIT


DreamLog: Where reasoning systems sleep, perchance to dream—and wake up smarter. Compression as learning, sleep as optimization, and LLMs as generative priors for a continuously improving logic programming system.
