QLoRA

QLoRA is a parameter-efficient fine-tuning method for large language models, introduced by Dettmers et al. in 2023. It keeps the base model frozen in 4-bit quantized form and trains only small low-rank adapters (LoRA) on top of it. Storing the base weights in 4 bits instead of 16 cuts weight memory by about 75%, which makes it possible to fine-tune a 65B-parameter model on a single 48 GB GPU.
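
The idea described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a toy absmax quantizer with uniform integer levels (the paper uses its own 4-bit NormalFloat data type), one quantization block per weight row, and tiny illustrative dimensions. All function and variable names here are hypothetical.

```python
import random

def weight_memory_gb(n_params, bits_per_param):
    """Memory to store n_params weights at a given precision, in GB (1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

def quantize_4bit(block):
    """Toy absmax quantization: map a block of floats to integer codes in [-7, 7].

    Assumption: uniform levels, not the NF4 data type used in the paper.
    """
    absmax = max(abs(w) for w in block)
    scale = absmax / 7 if absmax > 0 else 1.0
    return [round(w / scale) for w in block], scale

def dequantize(codes, scale):
    """Recover approximate float weights from 4-bit codes and the block scale."""
    return [c * scale for c in codes]

def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

random.seed(0)
d_in, d_out, rank = 8, 4, 2  # illustrative sizes only

# Frozen base weights, stored as 4-bit codes plus one scale per row.
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
quantized = [quantize_4bit(row) for row in W]

# Trainable LoRA adapters: A is (rank x d_in), B is (d_out x rank).
# B starts at zero, so the adapter is a no-op before training.
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]

def forward(x):
    """y = dequant(W_4bit) @ x + B @ (A @ x); only A and B would receive gradients."""
    W_hat = [dequantize(codes, scale) for codes, scale in quantized]
    base = matvec(W_hat, x)
    update = matvec(B, matvec(A, x))
    return [b + u for b, u in zip(base, update)]

# Weight storage for a 65B-parameter model at 16-bit vs 4-bit precision.
fp16_gb = weight_memory_gb(65e9, 16)
int4_gb = weight_memory_gb(65e9, 4)
print(fp16_gb, int4_gb, 1 - int4_gb / fp16_gb)  # 130.0 32.5 0.75
```

The figures printed at the end cover weight storage only. Activations, adapter gradients, optimizer state, and the per-block scales add to the total, so they should be read as a lower bound rather than a full memory budget.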

Sources

  1. Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023). QLoRA: Efficient finetuning of quantized LLMs. arXiv preprint arXiv:2305.14314.

ScholarGate. QLoRA (Efficient Finetuning of Quantized LLMs). Retrieved 2026-06-04 from https://scholargate.app/en/deep-learning/qlora