Augmenting LLMs: Fine-Tuning or RAG?

The trade-offs between fine-tuning and RAG.

Over the last couple of weeks, we have covered vector databases, LLMs, and fine-tuning LLMs with LoRA. We also implemented LoRA from scratch, learned about RAG and its considerations, and much more.

If you are new here (or wish to recall), you can revisit the previous articles in this series.

There’s one thing that’s yet to be addressed in this series of articles.

To recap, there are broadly two popular ways to augment LLMs with additional data:

  • RAG

  • Fine-tuning using LoRA/QLoRA

Both approaches have their pros and cons and suit different applications.

The question is: when should you use RAG, and when should you use fine-tuning?

To continue this LLM series, I’m excited to bring you a special guest post by Damien Benveniste. He is the author of The AiEdge newsletter and was a Machine Learning Tech Lead at Meta.

Subscribe to Damien's The AiEdge newsletter for more. You can also follow him on LinkedIn and Twitter.

In today’s machine learning deep dive, he provides a detailed discussion of RAG vs. fine-tuning: Augmenting LLMs: Fine-Tuning or RAG?

More specifically, he covers:

  • The trade-offs between RAG and fine-tuning

  • System design for RAG and fine-tuning pipelines

  • The cost of owning the model vs. using a third-party host

  • Issues with RAG and fine-tuning

I personally learned a lot from this one, and I am sure you will too.

Have a good day!

Avi
