QLoRA: Quantized Low-Rank Adaptation paper explained
Aug 7, 2023
Continuing my fine-tuning journey (the first article covers LoRA), let's get into the QLoRA paper, which was released in May 2023.
It is an upgrade to LoRA, and it achieves the following objectives (a short configuration sketch follows the list):
- 4-bit NormalFloat (NF4): A new data type for normally distributed weights in a neural network. It is an improvement over standard quantization.
- Double quantization: the quantization constants are themselves quantized to save memory.
- Paged optimizers to manage memory spikes.
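In practice, all three of these show up as switches in the Hugging Face transformers + bitsandbytes integration. Here is a minimal sketch (my own, not from the paper; the model id is just a placeholder) of loading a base model with the QLoRA settings turned on:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# QLoRA-style quantization settings
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store the frozen base weights in 4 bits
    bnb_4bit_quant_type="nf4",              # use the 4-bit NormalFloat data type
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for the matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-7b-model",               # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)
```

The LoRA adapters are then attached on top of this quantized base model (for example with the peft library), exactly as in the previous article.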
QUANTIZATION AND NORMAL-FLOAT
- Quantization is the discretization of a continuous function. We discretize a continuous distribution into buckets of equal size, and each element that falls inside a bucket is assigned the midpoint of that bucket's range (or is simply rounded off).
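As a toy example (mine, not the paper's), equal-width bucket quantization of a few values looks like this:

```python
import numpy as np

def bucket_quantize(x: np.ndarray, n_buckets: int = 16):
    """Discretize x into equal-width buckets and map each value to its bucket's midpoint."""
    lo, hi = x.min(), x.max()
    width = (hi - lo) / n_buckets
    # index of the bucket each element falls into (clip so x.max() lands in the last bucket)
    idx = np.clip(((x - lo) / width).astype(int), 0, n_buckets - 1)
    # reconstruct each element as the midpoint of its bucket
    x_hat = lo + (idx + 0.5) * width
    return idx, x_hat

x = np.random.randn(8).astype(np.float32)
idx, x_hat = bucket_quantize(x)
print(x)      # original values
print(idx)    # bucket indices (the quantized representation)
print(x_hat)  # dequantized approximation
```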
- We also have block-wise quantization, where we split the input into blocks and quantize each block independently, with the target range determined by the number of bits. Suppose we want to quantize a continuous function into INT8, which has a range of 0 to 255 (unsigned…
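To make the block-wise idea concrete, here is a minimal sketch (again my own, not the paper's code) of block-wise absmax quantization into the signed INT8 range [-127, 127]. Each block keeps its own absmax constant; these per-block constants are exactly the quantization constants that QLoRA's double quantization compresses a second time:

```python
import numpy as np

def blockwise_absmax_quantize(x: np.ndarray, block_size: int = 64):
    """Quantize x into signed INT8 one block at a time,
    keeping a separate absmax constant per block."""
    assert x.size % block_size == 0
    blocks = x.reshape(-1, block_size)
    absmax = np.abs(blocks).max(axis=1, keepdims=True)      # one constant per block
    q = np.round(127.0 * blocks / absmax).astype(np.int8)   # values now in [-127, 127]
    return q, absmax

def blockwise_dequantize(q: np.ndarray, absmax: np.ndarray) -> np.ndarray:
    """Recover an approximation of the original values."""
    return (q.astype(np.float32) / 127.0) * absmax

x = np.random.randn(256).astype(np.float32)
q, absmax = blockwise_absmax_quantize(x)
x_hat = blockwise_dequantize(q, absmax).reshape(-1)
print("max abs error:", np.abs(x - x_hat).max())
```

The block size of 64 used here matches what the paper uses for its 4-bit weights; smaller blocks mean less quantization error but more constants to store.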