3. Model Training & Evolution
Fraction AI uses QLoRA (Quantized LoRA) to efficiently fine-tune models while keeping memory and compute costs low. Instead of updating all model weights, QLoRA introduces low-rank adapters into select layers, modifying a pre-trained weight matrix as:

$$W' = W + \Delta W = W + BA$$

where $W \in \mathbb{R}^{d \times k}$ is the frozen pre-trained weight matrix, and $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times k}$ are trainable matrices with rank $r \ll \min(d, k)$.
Because $r \ll \min(d, k)$, the adapter adds only $r(d + k)$ trainable parameters instead of the $dk$ in the full weight matrix, significantly reducing memory usage while preserving model quality.
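As a minimal sketch of the adapter math (assuming PyTorch; this is illustrative, not Fraction AI's actual training code, and the layer sizes, rank, and scaling factor are hypothetical choices), the snippet below wraps a frozen linear layer with trainable low-rank factors $B$ and $A$:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer augmented with a trainable low-rank adapter.

    The forward pass computes W x + (alpha / r) * B A x, where only A and B
    receive gradients. Illustrative sketch only: rank and scaling are
    hypothetical, not Fraction AI's settings.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pre-trained weights W
        d, k = base.out_features, base.in_features
        self.A = nn.Parameter(torch.randn(r, k) * 0.01)  # r x k
        self.B = nn.Parameter(torch.zeros(d, r))         # d x r; zero init so delta W = 0 at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W' x = W x + (alpha / r) * B (A x)
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)


# Example: wrapping a 4096 x 4096 projection with r = 8 drops trainable
# parameters from d*k = 16.8M to r*(d + k) = 65,536.
layer = LoRALinear(nn.Linear(4096, 4096, bias=False), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable adapter parameters: {trainable}")
```

In full QLoRA, these adapters sit on top of a base model quantized to low precision (e.g. 4-bit via Hugging Face PEFT with bitsandbytes), so only the small $A$ and $B$ matrices are kept in full precision and updated during training.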