Project Overview
ClinicLLM is a domain-specific medical language model built by running a full post-training pipeline on open clinical data, then served with multiple hot-swappable LoRA adapters behind a single base model.
Objective
Run synthetic-data generation, QLoRA SFT (Stage 1), and DPO preference alignment (Stage 2) on a clinical corpus, then deploy a multi-adapter serving endpoint.
Scope
- Synthetic instruction-data generation for clinical Q&A.
- QLoRA SFT (Stage 1) with a 4-bit NF4 base model.
- DPO preference alignment (Stage 2) over one epoch.
- Multi-adapter production serving with adapter routing by endpoint.
Datasets
- Clinical patient Q&A pairs for SFT.
- Clinical-reasoning QA pairs.
- Medical instruction corpora (e.g., Medical Meadow style).
- Base model: an 8B-class instruct model.
Stack
- Synthetic-data tooling (
distilabel-style). - Deduplication and chat formatting with loss masking.
transformers+ TRL (SFTTrainer,DPOTrainer).- Unsloth for VRAM reduction.
peftfor LoRA adapter management.bitsandbytesfor 4-bit NF4 quantization.
Evaluation
- Text-similarity metrics (ROUGE-L, BERTScore).
- LLM-as-Judge on medical accuracy, safety, and tone.
- Compare base vs. SFT vs. SFT+DPO.
Deliverables
- Cleaned SFT dataset and DPO preference pairs.
- Two trained adapters (Stage 1 + Stage 2).
- Evaluation report comparing all model variants.
- Live multi-adapter vLLM endpoint behind FastAPI on AWS.
Prerequisites
Modules 03–07 (fine-tuning lifecycle, datasets, adapters, alignment, evaluation/serving).