QLoRA: Quantized Low-Rank Adaptation paper explained
Aug 7, 2023
Continuing my fine-tuning journey (the first article covers LoRA), let's get into the QLoRA paper, which was released in May 2023.
It is an upgrade to LoRA, and it achieves the following objectives (a short configuration sketch follows the list):
- 4-bit NormalFloat (NF4): A new data type for normally distributed weights in a neural network. It is an improvement over standard quantization.
- Double quantization: the quantization constants are themselves quantized to save memory.
- Paged optimizers to manage memory spikes.
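In practice, all three of these show up as switches in the Hugging Face transformers + bitsandbytes integration. Here is a minimal sketch (my own, not from the paper; the model id is just a placeholder) of loading a base model with the QLoRA settings turned on:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# QLoRA-style quantization settings
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store the frozen base weights in 4 bits
    bnb_4bit_quant_type="nf4",              # use the 4-bit NormalFloat data type
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for the matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-7b-model",               # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)
```

The LoRA adapters are then attached on top of this quantized base model (for example with the peft library), exactly as in the previous article.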
QUANTIZATION AND NORMAL-FLOAT
- Quantization is the discretization of a continuous function. We discretize a continuous distribution into buckets of equal size, and each element that falls inside a bucket is assigned the midpoint of that bucket's range (or is simply rounded off).
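As a toy example (mine, not the paper's), equal-width bucket quantization of a few values looks like this:

```python
import numpy as np

def bucket_quantize(x: np.ndarray, n_buckets: int = 16):
    """Discretize x into equal-width buckets and map each value to its bucket's midpoint."""
    lo, hi = x.min(), x.max()
    width = (hi - lo) / n_buckets
    # index of the bucket each element falls into (clip so x.max() lands in the last bucket)
    idx = np.clip(((x - lo) / width).astype(int), 0, n_buckets - 1)
    # reconstruct each element as the midpoint of its bucket
    x_hat = lo + (idx + 0.5) * width
    return idx, x_hat

x = np.random.randn(8).astype(np.float32)
idx, x_hat = bucket_quantize(x)
print(x)      # original values
print(idx)    # bucket indices (the quantized representation)
print(x_hat)  # dequantized approximation
```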
- We also have block-wise quantization, where we split the input into blocks and quantize each block independently, with the target range determined by the number of bits. Suppose we want to quantize a continuous function into INT8, which has a range of 0 to 255 (unsigned…
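To make the block-wise idea concrete, here is a minimal sketch (again my own, not the paper's code) of block-wise absmax quantization into the signed INT8 range [-127, 127]. Each block keeps its own absmax constant; these per-block constants are exactly the quantization constants that QLoRA's double quantization compresses a second time:

```python
import numpy as np

def blockwise_absmax_quantize(x: np.ndarray, block_size: int = 64):
    """Quantize x into signed INT8 one block at a time,
    keeping a separate absmax constant per block."""
    assert x.size % block_size == 0
    blocks = x.reshape(-1, block_size)
    absmax = np.abs(blocks).max(axis=1, keepdims=True)      # one constant per block
    q = np.round(127.0 * blocks / absmax).astype(np.int8)   # values now in [-127, 127]
    return q, absmax

def blockwise_dequantize(q: np.ndarray, absmax: np.ndarray) -> np.ndarray:
    """Recover an approximation of the original values."""
    return (q.astype(np.float32) / 127.0) * absmax

x = np.random.randn(256).astype(np.float32)
q, absmax = blockwise_absmax_quantize(x)
x_hat = blockwise_dequantize(q, absmax).reshape(-1)
print("max abs error:", np.abs(x - x_hat).max())
```

The block size of 64 used here matches what the paper uses for its 4-bit weights; smaller blocks mean less quantization error but more constants to store.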