3. Model Training & Evolution

Fraction AI uses QLoRA (Quantized Low-Rank Adaptation) to fine-tune models efficiently while keeping memory and compute costs low. Instead of updating all model weights, QLoRA keeps the pre-trained weights frozen in quantized form and introduces low-rank adapters into select layers, modifying a pre-trained weight matrix $W$ as:

$$W' = W + AB$$

where:

  • $A \in \mathbb{R}^{d \times r}$ and $B \in \mathbb{R}^{r \times d}$ are trainable matrices with rank $r$.

  • The low rank $r \ll d$ means only a small number of parameters are trained, significantly reducing memory usage while preserving model quality. For example, with $d = 4096$ and $r = 16$, the adapters add $2dr \approx 131\text{K}$ trainable parameters, versus the roughly $16.8\text{M}$ entries of the full $d \times d$ matrix.
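
As an illustration, the sketch below shows how such a setup is commonly wired together with the Hugging Face `transformers`, `bitsandbytes`, and `peft` libraries: the base model is loaded in 4-bit quantized form and frozen, then rank-$r$ adapter matrices are attached to selected layers. The model name, target modules, and hyperparameters here are placeholder assumptions, not Fraction AI's actual configuration.

```python
# Minimal QLoRA-style setup sketch (assumed stack: transformers + peft + bitsandbytes).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model with 4-bit NF4 quantization; its weights stay frozen.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model, not Fraction AI's choice
    quantization_config=bnb_config,
)
base_model = prepare_model_for_kbit_training(base_model)

# Attach rank-r adapters A (d x r) and B (r x d) to selected layers,
# so each adapted weight effectively computes W' = W + AB.
lora_config = LoraConfig(
    r=16,                                 # rank r of A and B
    lora_alpha=32,                        # scaling applied to the AB update
    target_modules=["q_proj", "v_proj"],  # which layers receive adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)

# Only the adapter parameters are trainable; typically well under 1% of the total.
model.print_trainable_parameters()
```

The resulting `model` can then be passed to a standard training loop or trainer; during fine-tuning, gradients flow only through the small $A$ and $B$ matrices while the quantized base weights remain untouched.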
