# aptus

*Latin: fitted, adapted*
REST API and Python client for remote LLM fine-tuning. Run the server on your GPU machine and submit training jobs from anywhere.
```
┌─────────────────────┐              ┌─────────────────────────────────┐
│     Your Laptop     │     HTTP     │           GPU Machine           │
│                     │   ───────►   │                                 │
│    aptus client     │              │   aptus-server                  │
│  (Python package)   │   ◄───────   │    ├── Unsloth backend          │
│                     │              │    ├── SQLite job queue         │
│                     │              │    └── Adapter storage          │
└─────────────────────┘              └─────────────────────────────────┘
```
## Features
- REST API for fine-tuning jobs (create, monitor, cancel)
- Dataset upload endpoint for training data
- Background queue with progress tracking
- Unsloth backend for efficient QLoRA/LoRA training
- Mock backend for testing without GPU
- SQLite persistence for job state
- Lightweight client with no server dependencies
## Installation

### Client Only

```bash
pip install aptus
```

### Server (on GPU machine)

```bash
pip install aptus[server]

# Install Unsloth separately (complex CUDA dependencies)
pip install unsloth
```
## Quick Start

### Server

```bash
aptus-server --port 8420

# Or with mock backend for testing
aptus-server --backend mock --port 8420
```
### Client

```python
from aptus import Client

client = Client("http://gpu-machine:8420")

# Upload dataset
dataset_id = client.upload_dataset("training_data.jsonl")

# Create fine-tuning job
job = client.create_job(
    base_model="unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    dataset=dataset_id,
    hyperparams={"epochs": 3, "lora_rank": 32},
)

# Wait for completion
job = client.wait_for_job(job.id)
print(f"Adapter saved to: {job.adapter_path}")
```
### curl

```bash
# Upload dataset
curl -X POST http://localhost:8420/v1/datasets/upload \
  -F "file=@data.jsonl"

# Create job
curl -X POST http://localhost:8420/v1/jobs \
  -H "Content-Type: application/json" \
  -d '{"base_model": "unsloth/Llama-3.2-3B-Instruct-bnb-4bit", "dataset": "<id>"}'

# Check status
curl http://localhost:8420/v1/jobs/<job_id>
```
## API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | `/v1/health` | Server health and GPU status |
| POST | `/v1/datasets/upload` | Upload training dataset |
| GET | `/v1/datasets/{id}` | Get dataset info |
| POST | `/v1/jobs` | Create fine-tuning job |
| GET | `/v1/jobs` | List all jobs |
| GET | `/v1/jobs/{id}` | Get job status |
| DELETE | `/v1/jobs/{id}` | Cancel job |
| GET | `/v1/jobs/{id}/logs` | Stream training logs |
| GET | `/v1/jobs/{id}/adapter` | Download LoRA adapter |
| GET | `/v1/models` | List recommended models |
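The Python client wraps the common calls; for the rest, plain HTTP works. As a sketch, downloading a finished adapter with `requests` (the output filename and archive format are assumptions; the server defines what `/v1/jobs/{id}/adapter` actually returns):

```python
import requests

job_id = "..."  # id of a finished job

# Stream the trained adapter to disk. The file format (zip vs tar)
# is server-defined -- "adapter.zip" here is just an assumed name.
resp = requests.get(
    f"http://gpu-machine:8420/v1/jobs/{job_id}/adapter", stream=True
)
resp.raise_for_status()
with open("adapter.zip", "wb") as f:
    for chunk in resp.iter_content(chunk_size=8192):
        f.write(chunk)
```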
## Dataset Format

JSONL with instruction/output pairs:

```jsonl
{"instruction": "What is 2+2?", "output": "4"}
{"instruction": "Write a haiku", "output": "Code flows like water..."}
```

Or chat format:

```jsonl
{"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]}
```
## Configuration

```bash
# Force mock backend (for testing)
aptus-server --backend mock

# Custom port
aptus-server --port 9000

# Environment variable
APTUS_BACKEND=mock aptus-server
```
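Whichever backend you pick, `/v1/health` (see the endpoint table) is the quickest way to confirm the server came up; the response fields beyond a successful status code are server-defined:

```python
import requests

# Hit the health endpoint; prints whatever health/GPU info the server reports.
resp = requests.get("http://localhost:8420/v1/health")
resp.raise_for_status()
print(resp.json())
```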
## Storage

Data is stored in `~/.aptus/`:

- `aptus.db` - SQLite database for job state
- `datasets/{id}/` - Uploaded training files
- `adapters/{id}/` - Trained LoRA adapters
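Given that layout, inspecting artifacts directly on the GPU machine is straightforward; for example, listing trained adapters (assuming the default `~/.aptus/` root):

```python
from pathlib import Path

# List trained adapters under the default storage root.
adapters = Path.home() / ".aptus" / "adapters"
for p in sorted(adapters.iterdir()):
    print(p.name)
```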
## GPU Requirements

Optimized for RTX 3060 (12GB VRAM):

- Default method: QLoRA (4-bit quantization)
- Max model size: 7B parameters
- Recommended models:
  - `unsloth/Llama-3.2-3B-Instruct-bnb-4bit`
  - `unsloth/Mistral-7B-v0.3-bnb-4bit`
  - `unsloth/Phi-3-mini-4k-instruct-bnb-4bit`
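At the 7B ceiling, the 4-bit builds listed above are the safe choice for 12GB; the job spec is the same as in Quick Start apart from the model id (hyperparameters beyond `epochs` and `lora_rank` aren't documented here, so this sketch sticks to those):

```python
# Same create_job call as Quick Start, swapping in a 7B 4-bit model
# from the recommended list above.
job = client.create_job(
    base_model="unsloth/Mistral-7B-v0.3-bnb-4bit",
    dataset=dataset_id,
    hyperparams={"epochs": 3, "lora_rank": 32},
)
```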
## License
MIT