DreamLog is a revolutionary logic programming system that learns continuously by alternating between wake and sleep phases—just like biological brains. During wake, it uses LLMs to generate missing knowledge; during sleep, it compresses what it knows into more general principles. This isn’t logic programming with LLMs bolted on—it’s a fundamentally new paradigm where knowledge representation evolves through use.
The Core Insight: Compression IS Learning
DreamLog operationalizes a profound insight from algorithmic information theory:

> To learn is to compress: understanding a body of data means finding a short program that regenerates it.
This is Solomonoff induction—the mathematical formalization of Occam’s razor. The system that explains your data with the shortest program is the one most likely to generalize.
Formal statement: Given observations $D$, the posterior probability of hypothesis $h$ is

$$P(h \mid D) \propto 2^{-K(h)} \, P(D \mid h)$$

where $K(h)$ is the Kolmogorov complexity of $h$.
For logic programming, hypotheses are sets of facts and rules. DreamLog's sleep phase searches for minimal representations that preserve deductive closure:

$$\mathrm{KB}^{*} = \arg\min_{\mathrm{KB}'} |\mathrm{KB}'| \quad \text{subject to} \quad \mathrm{Closure}(\mathrm{KB}') = \mathrm{Closure}(\mathrm{KB})$$

In plain terms: find the shortest knowledge base that still derives all the same facts.
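As a minimal sketch of this objective (toy encoding and a caller-supplied closure function, not DreamLog's internals): a candidate rewrite is accepted only if it is strictly shorter and derives exactly the same facts.

from typing import Callable, FrozenSet, Tuple

Fact = Tuple[str, ...]  # e.g. ("parent", "john", "mary")

def description_length(kb: FrozenSet[Fact]) -> int:
    # Crude MDL proxy: total symbol count (a real system would measure bits)
    return sum(len(clause) for clause in kb)

def accept_rewrite(original: FrozenSet[Fact],
                   candidate: FrozenSet[Fact],
                   closure: Callable[[FrozenSet[Fact]], FrozenSet[Fact]]) -> bool:
    # Strictly shorter, with the same deductive closure: a valid compression step
    return (closure(candidate) == closure(original)
            and description_length(candidate) < description_length(original))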
The Wake-Sleep Architecture
Wake Phase: Exploitation Through Generation
During wake, DreamLog operates as a traditional logic programming engine enhanced with LLM-based knowledge generation:
from dreamlog.pythonic import dreamlog
# Create knowledge base with LLM integration
kb = dreamlog(llm_provider="openai")
# Add some facts
kb.fact("parent", "john", "mary")
kb.fact("parent", "mary", "alice")
# Add a rule
kb.rule("grandparent", ["X", "Z"]) \
    .when("parent", ["X", "Y"]) \
    .and_("parent", ["Y", "Z"])
# Query
for result in kb.query("grandparent", "X", "alice"):
print(f"{result.bindings['X']} is Alice's grandparent") # john
The magic happens with undefined predicates:
# Query a predicate we never defined!
for result in kb.query("sibling", "X", "Y"):
# LLM generates knowledge about siblings on-the-fly
print(result)
When the evaluator encounters an undefined predicate, it triggers the LLM hook:
def on_undefined(term, evaluator):
    """Hook invoked when the evaluator hits a predicate with no definition."""
    # Extract facts and rules related to the unknown term
    context = extract_relevant_knowledge(evaluator.kb, term)
    # Construct a prompt asking the LLM for plausible definitions
    prompt = construct_prompt(term, context)
    # Generate and parse candidate knowledge
    response = llm.generate(prompt)
    new_knowledge = parse_response(response)
    # Add it to the knowledge base so evaluation can resume
    evaluator.kb.add(new_knowledge)
Key properties:
- Context-aware generation based on existing facts
- Caching to avoid redundant generation
- Retry logic with exponential backoff (caching and retry are sketched after this list)
- Multiple provider support (OpenAI, Anthropic, Ollama)
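A minimal sketch of the caching and retry pattern (illustrative only, assuming an llm object with a generate(prompt) method; this is not DreamLog's actual internals):

import time

_cache: dict = {}  # prompt -> generated response

def generate_with_retry(llm, prompt: str, max_attempts: int = 4) -> str:
    # Return a cached response if this prompt was already generated
    if prompt in _cache:
        return _cache[prompt]
    for attempt in range(max_attempts):
        try:
            _cache[prompt] = llm.generate(prompt)
            return _cache[prompt]
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...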
Sleep Phase: Exploration Through Compression
During sleep, DreamLog reorganizes knowledge through compression operators:
1. Anti-Unification
Find general patterns from specific instances:
Before:
(parent john mary)
(parent bob alice)
(parent jane charlie)
After:
(parent X Y) :- (biological_parent X Y)
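Anti-unification (Plotkin's least general generalization) is easy to sketch for terms represented as nested tuples. This toy version is not DreamLog's implementation:

def anti_unify(t1, t2, _vars=None):
    # Least general generalization: identical parts are kept, mismatches
    # become variables, and a repeated mismatched pair reuses its variable
    if _vars is None:
        _vars = {}
    if t1 == t2:
        return t1
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        return tuple(anti_unify(a, b, _vars) for a, b in zip(t1, t2))
    if (t1, t2) not in _vars:
        _vars[(t1, t2)] = f"X{len(_vars)}"  # fresh variable
    return _vars[(t1, t2)]

# anti_unify(("parent", "john", "mary"), ("parent", "bob", "alice"))
# => ("parent", "X0", "X1"), i.e. the pattern (parent X Y)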
2. Predicate Invention
Discover intermediate concepts that simplify multiple rules:
Before:
(grandparent X Z) :- (parent X Y), (parent Y Z)
(great_grandparent X W) :- (parent X Y), (parent Y Z), (parent Z W)
After:
(ancestor X Y 1) :- (parent X Y)
(ancestor X Z N) :- (parent X Y), (ancestor Y Z M), (succ M N)
(grandparent X Y) :- (ancestor X Y 2)
(great_grandparent X W) :- (ancestor X W 3)
Why this matters: The new ancestor predicate captures the recursive structure that was hidden in the original rules. This is true abstraction—discovering compositional primitives.
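A quick plain-Python check (toy code, independent of DreamLog) that ancestor at depth 2 coincides with grandparent on a small parent relation:

parents = {("john", "mary"), ("mary", "alice"), ("bob", "charlie")}

def ancestor(x, y, n):
    # Depth-1 ancestry is parenthood; deeper ancestry chains through a child
    if n == 1:
        return (x, y) in parents
    return any(ancestor(z, y, n - 1) for (p, z) in parents if p == x)

assert ancestor("john", "alice", 2)      # john is alice's grandparent
assert not ancestor("bob", "alice", 2)   # bob is not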
3. Subsumption Elimination
Remove specific rules subsumed by general ones:
Before:
(animal X) :- (mammal X)
(animal X) :- (dog X) ← Redundant, given (mammal X) :- (dog X): dogs are mammals.
After:
(animal X) :- (mammal X)
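For single literals, the subsumption check is a one-sided match: a general term subsumes a specific one if some variable substitution maps the first onto the second. A toy sketch (uppercase strings as variables; full clause subsumption additionally requires search over body literals):

def subsumes(general, specific, _subst=None):
    # True if `general` maps onto `specific` under one consistent substitution
    if _subst is None:
        _subst = {}
    if isinstance(general, str) and general[:1].isupper():  # a variable
        if general in _subst:
            return _subst[general] == specific
        _subst[general] = specific
        return True
    if (isinstance(general, tuple) and isinstance(specific, tuple)
            and len(general) == len(specific)):
        return all(subsumes(g, s, _subst) for g, s in zip(general, specific))
    return general == specific

# subsumes(("animal", "X"), ("animal", "rex"))  => True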
4. Sleep Cycle API
from dreamlog.kb_dreamer import KnowledgeBaseDreamer
# Initialize dreamer
dreamer = KnowledgeBaseDreamer(kb.provider)
# Execute sleep cycle
session = dreamer.dream(
    kb,
    dream_cycles=3,          # Multiple REM cycles
    exploration_samples=10,  # Try different optimizations
    verify=True              # Ensure behavior preservation
)
# Results
print(f"Compression: {session.compression_ratio:.1%}") # 23.5%
print(f"Generalization: {session.generalization_score:.2f}") # 0.87
print(f"Rules discovered: {len(session.new_predicates)}") # 3
The Dual Knowledge Base System
DreamLog maintains two separate knowledge bases to distinguish learned from authoritative knowledge:
from dreamlog.persistent_learning import PersistentKnowledgeBase
kb = PersistentKnowledgeBase(storage_path)
# KB_1: Learned knowledge (from LLM)
kb.add_learned_knowledge(facts, rules)
# KB_2: User knowledge (ground truth)
conflicts = kb.add_user_knowledge(user_facts, user_rules)
# Resolve conflicts (user knowledge takes precedence)
kb.resolve_conflicts(UserTrustStrategy())
# Query combined knowledge
solutions = kb.query_with_tracking(goals)
Why dual KBs?
- Conflict detection: Automatically detect when LLM-generated knowledge contradicts user assertions (a toy sketch follows this list)
- Trust hierarchy: User knowledge always wins in conflicts
- Isolation: Can inspect/modify learned knowledge without affecting ground truth
- Rollback: Can revert learned knowledge while preserving user assertions
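A toy sketch of the dual-KB idea (illustrative data model, not the PersistentKnowledgeBase API; conflicts are modeled here as explicit ("not", ...) facts):

from dataclasses import dataclass, field

@dataclass
class DualKB:
    user: set = field(default_factory=set)     # ground truth (KB_2)
    learned: set = field(default_factory=set)  # LLM-generated (KB_1)

    def add_learned(self, fact):
        # Flag a conflict when the user has asserted the negation
        if ("not",) + fact in self.user:
            return f"conflict: user denies {fact}"
        self.learned.add(fact)
        return None

    def visible_facts(self):
        # User knowledge always wins; learned facts only fill the gaps
        return self.user | {f for f in self.learned
                            if ("not",) + f not in self.user}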
Conflict Resolution Strategies
from dreamlog.persistent_learning import (
    UserTrustStrategy,     # User always wins
    ConservativeStrategy,  # Keep both, mark conflicts
    MajorityVoteStrategy   # Consensus from multiple LLMs
)
# Example: Conservative approach
kb.resolve_conflicts(ConservativeStrategy())
# Result: Conflicting facts coexist with metadata marking the conflict
Validation and Verification
The sleep phase maintains semantic equivalence through rigorous verification:
1. Deductive Closure Preservation
from dreamlog.knowledge_validator import KnowledgeValidator
validator = KnowledgeValidator()
# Test that compressed KB derives all original facts
validator.add_test(ConsistencyTest(original_facts))
# Validate
report = validator.validate(compressed_kb)
print(f"Success rate: {report.success_rate:.1f}%")
2. Query Equivalence Testing
# Generate test queries
test_queries = SampleQueryGenerator(kb).generate(n=100)
# Verify both KBs produce identical results
for query in test_queries:
    original_results = original_kb.query(query)
    compressed_results = compressed_kb.query(query)
    assert set(original_results) == set(compressed_results)
3. Incremental Verification
Each transformation is checked independently:
# Before applying transformation
snapshot = kb.snapshot()
# Apply compression operator
transform(kb)
# Verify preservation
if not verify_equivalence(snapshot, kb):
    kb.restore(snapshot)  # Rollback
Sleep Cycle Scheduling
Different sleep phases run at different intervals:
from dreamlog.sleep_cycle import SleepCycleManager, SleepPhase
from datetime import timedelta
manager = SleepCycleManager(
    kb,
    light_sleep_interval=timedelta(minutes=30),  # Basic cleanup
    deep_sleep_interval=timedelta(hours=4),      # Rule generalization
    rem_sleep_interval=timedelta(hours=8),       # Creative hypothesis generation
    min_idle_time=timedelta(minutes=5)           # Must be idle before sleeping
)
# Start background sleep cycles
manager.start_background_sleep()
# Or force immediate cycle
report = manager.force_sleep_cycle(SleepPhase.DEEP_SLEEP)
Sleep phases:
- Light sleep: Duplicate removal, basic optimization
- Deep sleep: Rule generalization, subsumption elimination
- REM sleep: Creative hypothesis generation (future enhancement)
Background Learning Service
For long-running applications, DreamLog provides a background service:
from dreamlog.background_learner import BackgroundLearner, BackgroundLearnerClient
# Server side
service = BackgroundLearner(storage_path, ipc_port=7777)
service.start()
# Client side
client = BackgroundLearnerClient(port=7777)
# Add knowledge remotely
result = client.add_user_knowledge(facts, rules)
# Query remotely
solutions = client.query(goals)
# Control sleep cycles remotely
report = client.force_sleep_cycle(SleepPhase.LIGHT_SLEEP)
# Get metrics
status = client.get_status()
print(f"Total facts: {status.total_facts}")
print(f"Compression ratio: {status.compression_ratio:.1%}")
Features:
- TCP-based inter-process communication
- Multiple concurrent clients
- Real-time status and metrics
- Remote sleep cycle control
- Automatic session management
The High-Level Learning API
For production use, DreamLog provides a safe, high-level API:
from dreamlog.learning_api import LearningAPI, InjectionMode, LearningMode
api = LearningAPI(
    storage_path,
    learning_mode=LearningMode.ACTIVE,        # ACTIVE, PASSIVE, VALIDATION, EXPERIMENTAL
    injection_mode=InjectionMode.PERMISSIVE,  # STRICT, PERMISSIVE, INTERACTIVE, BATCH
    use_background_service=True
)
# Inject knowledge from strings
result = api.inject_knowledge_from_strings(
    fact_strings=["parent(john, mary)", "parent(mary, alice)"],
    rule_strings=["grandparent(X,Z) :- parent(X,Y), parent(Y,Z)"]
)
# Handle conflicts
if result.conflicts:
    print(f"Found {len(result.conflicts)} conflicts")
    for conflict in result.conflicts:
        print(f"  {conflict.description}")
# Query
solutions = api.query_knowledge("grandparent(john, X)")
# Monitor system
metrics = api.get_system_metrics()
print(f"Knowledge base size: {metrics.kb_size}")
print(f"Queries per second: {metrics.query_throughput:.1f}")
Injection modes:
- STRICT: Reject on any conflict or validation failure
- PERMISSIVE: Auto-resolve conflicts, continue on warnings
- INTERACTIVE: User-guided conflict resolution
- BATCH: Accumulate changes, resolve at end
Convergence Properties
Theorem: Under mild assumptions, sleep-phase compression converges to a local minimum in description length.

Proof sketch: Each compression operator (anti-unification, subsumption elimination, predicate invention) is applied only when it strictly decreases description length while preserving deductive closure. Description length is a positive integer bounded below (ultimately by the Kolmogorov complexity of the knowledge), so it can decrease only finitely many times and the process must converge.
Mathematical property:

$$|\mathrm{KB}_{t+1}| \le |\mathrm{KB}_t|, \qquad \lim_{t \to \infty} |\mathrm{KB}_t| \ge K(\mathrm{KB})$$

where $K(\mathrm{KB})$ is the Kolmogorov complexity of the knowledge base.
Theoretical Foundations
LLMs as Approximate Oracles
We model the LLM as an approximate oracle

$$\mathcal{O} : \mathcal{T} \to \Delta(\mathcal{F})$$

that maps terms $\mathcal{T}$ to probability distributions over knowledge base fragments $\mathcal{F}$.
Key insight: LLMs approximate the universal prior in Solomonoff induction, but one biased toward human-generated text rather than pure algorithmic simplicity.
Exploration-Exploitation Trade-off
Wake-sleep cycles implement the fundamental exploration-exploitation dilemma:
| Phase | Objective | Strategy |
|---|---|---|
| Wake | Maximize query answering | Exploit existing knowledge + generate as needed |
| Sleep | Minimize description length | Explore alternative representations |
This is analogous to:
- Reinforcement learning: Exploration vs. exploitation
- Simulated annealing: Temperature-based optimization
- Evolutionary algorithms: Mutation vs. selection
Performance Analysis
Compression Ratios
Initial experiments show:
- 20-40% compression without semantic loss
- Greater compression possible with bounded semantic drift
- Diminishing returns after multiple cycles (convergence)
LLM Knowledge Quality
Analysis of generated knowledge:
- >85% accuracy for common-sense predicates
- Degraded performance for specialized domains
- Interesting hallucinations occasionally lead to useful abstractions
Wake-Sleep Cycle Timing
Optimal cycle timing follows a power law:
- Frequent short cycles for rapidly changing domains
- Longer cycles for stable knowledge bases
- Adaptive scheduling based on knowledge base entropy (a toy heuristic is sketched below)
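One hypothetical way to make scheduling adaptive (an illustrative heuristic, not part of DreamLog): shrink the sleep interval when the predicate distribution has high entropy, i.e., when the KB is still unsettled.

import math
from collections import Counter

def predicate_entropy(facts) -> float:
    # Shannon entropy (bits) of the predicate-name distribution
    counts = Counter(fact[0] for fact in facts)
    total = sum(counts.values())
    if not counts:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def next_sleep_interval(facts, base_minutes: float = 30.0) -> float:
    # Higher entropy => less settled KB => sleep sooner
    return base_minutes / (1.0 + predicate_entropy(facts))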
Real-World Example: Family Relationships
from dreamlog.pythonic import dreamlog
from dreamlog.kb_dreamer import KnowledgeBaseDreamer
# Create KB with LLM integration
kb = dreamlog(llm_provider="openai")
# Add minimal facts
kb.parse("""
(parent john mary)
(parent mary alice)
(parent john bob)
(parent bob charlie)
(parent jane mary)
(parent jane bob)
""")
# Query undefined predicates - LLM fills in knowledge
print("Siblings (LLM-generated):")
for result in kb.query("sibling", "X", "Y"):
print(f" {result.bindings['X']} and {result.bindings['Y']}")
print("\nUncles (LLM-generated):")
for result in kb.query("uncle", "X", "Y"):
print(f" {result.bindings['X']} is uncle of {result.bindings['Y']}")
# Now run sleep cycle to compress
dreamer = KnowledgeBaseDreamer(kb.provider)
session = dreamer.dream(kb, dream_cycles=3, verify=True)
print(f"\nCompression achieved: {session.compression_ratio:.1%}")
print(f"New predicates discovered: {session.new_predicates}")
# Example discovered predicate
# (ancestor X Y 1) :- (parent X Y)
# (ancestor X Z N) :- (parent X Y), (ancestor Y Z M), (succ M N)
# (uncle X Y) :- (sibling X Z), (parent Z Y), (male X)
Comparison with Related Systems
| System | Approach | Continuous Learning | LLM Integration | Compression |
|---|---|---|---|---|
| DreamLog | Wake-sleep cycles | ✅ | ✅ | ✅ |
| Prolog | Traditional logic | ❌ | ❌ | ❌ |
| Datalog | Bottom-up evaluation | ❌ | ❌ | ❌ |
| ILP (FOIL, Progol) | Batch learning | ❌ | ❌ | Limited |
| Neural Theorem Provers | End-to-end neural | ✅ | ✅ | ❌ |
| ∂ILP | Differentiable ILP | Limited | ❌ | ❌ |
| DreamCoder | Program synthesis | ✅ | ❌ | ✅ |
DreamLog’s niche: Wake-sleep architecture + LLM integration + compression-based learning + continuous improvement.
Design Philosophy
🧠 Biological Inspiration: Memory consolidation during sleep isn't a metaphor; it's an architectural principle
📐 Mathematical Rigor: Grounded in algorithmic information theory (Solomonoff, Kolmogorov)
🔄 Continuous Learning: Knowledge evolves through use, not just batch training
🎯 Compositional Abstraction: Discover primitives that compose naturally
🛡️ Safe by Default: Verification ensures compression preserves behavior
🧩 Interpretable: Every rule and fact is inspectable, unlike black-box neural systems
Future Directions
1. Probabilistic Logic Programming
Extend to handle uncertainty explicitly:
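One shape this could take (hypothetical syntax in the style of probabilistic logic languages such as ProbLog; not part of DreamLog's current API):

# Hypothetical: facts and rules carry probabilities
kb.fact("parent", "john", "mary", probability=0.95)
kb.rule("grandparent", ["X", "Z"], probability=0.9) \
    .when("parent", ["X", "Y"]) \
    .and_("parent", ["Y", "Z"])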
2. Multi-Modal Knowledge
Incorporate visual and auditory predicates:
kb.fact("visual_features", "cat_image_1", [fur_texture, whiskers, pointed_ears])
kb.query("animal_type", "X") # Uses visual features
3. Adversarial Dreaming
Use GANs to generate challenging test cases during sleep:
# Generate edge cases that stress existing rules
adversarial_cases = dream_adversarially(kb)
4. Federated Learning
Multiple DreamLog instances share compressed knowledge:
# Node 1 learns about animals
# Node 2 learns about vehicles
# Exchange compressed abstractions
federated_kb = merge_knowledge_bases([node1, node2])
Philosophical Implications
DreamLog raises fundamental questions:
Is All Learning Compression?
If intelligence is compression of experience into principles, then sleep isn’t waste—it’s when learning actually happens.
Can Symbols Emerge from Neural Substrates?
If compression discovers symbols (like ancestor from parent chains), do symbols emerge naturally from statistical patterns?
The Role of Sleep in Cognition
Does biological sleep serve a similar compression function? DreamLog suggests sleep isn’t just cleanup—it’s creative reorganization.
Quick Start
# Install from PyPI
pip install dreamlog
# Or from source
git clone https://github.com/queelius/dreamlog.git
cd dreamlog
pip install -e .
Basic usage:
from dreamlog.pythonic import dreamlog
# Create KB
kb = dreamlog(llm_provider="openai")
# Add facts using S-expressions
kb.parse("""
(parent john mary)
(parent mary alice)
""")
# Add rules
kb.parse("""
(grandparent X Z) :- (parent X Y), (parent Y Z)
""")
# Query
for result in kb.query("grandparent", "john", "X"):
print(result.bindings['X']) # alice
With sleep cycles:
from dreamlog.kb_dreamer import KnowledgeBaseDreamer
dreamer = KnowledgeBaseDreamer(kb.provider)
session = dreamer.dream(kb, dream_cycles=3, verify=True)
print(f"Compression: {session.compression_ratio:.1%}")
Resources
- Repository: github.com/queelius/dreamlog
- Academic Analysis: DREAMLOG_ACADEMIC_ANALYSIS.md
- Persistent Learning: PERSISTENT_LEARNING.md
- Examples: see the examples/ directory
- Documentation: see the docs/ directory
License
MIT
DreamLog: Where reasoning systems sleep, perchance to dream—and wake up smarter. Compression as learning, sleep as optimization, and LLMs as generative priors for a continuously improving logic programming system.