
AWS Retrieval Augmented Generation (RAG)


Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) is a prompting technique that supplies domain-relevant data as context to produce responses based on that data and the prompt.
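
To make "supplies domain-relevant data as context" concrete, here is a minimal sketch of how retrieved text and a question might be combined into a single prompt. The function name `build_augmented_prompt`, the prompt wording, and the sample passages are all made up for illustration; real applications use whatever prompt layout suits their model.

```python
def build_augmented_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user's question into one prompt."""
    # Number each passage so the individual pieces of context stay distinguishable.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    # Hypothetical sample data, standing in for documents retrieved from a corpus.
    passages = [
        "The support desk is open Monday to Friday, 9:00 to 17:00.",
        "Refunds are processed within 5 business days.",
    ]
    print(build_augmented_prompt("How long do refunds take?", passages))
```

The foundation model itself is untouched here: only the text sent to it changes.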

RAG serves a similar purpose to fine-tuning: both adapt a foundation model's responses to a specific domain.

However, instead of fine-tuning a foundation model with a small set of labeled examples, you can use RAG to retrieve a small set of relevant documents from a large corpus and provide them as context for answering questions.

RAG doesn't change the weights of the foundation model, whereas fine-tuning changes model weights.


[Illustration: Retrieval Augmented Generation (RAG)]

RAG can be more cost-efficient than regular fine-tuning because it avoids the cost of training a model on new data.

RAG also addresses the challenge of frequent data changes because it retrieves updated and relevant information instead of relying on potentially outdated sets of data.

In RAG, the external data can come from multiple data sources, such as a document repository, databases, or APIs.

Before using RAG with LLMs, you must prepare the knowledge base and keep it updated.
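
As a rough sketch of what preparing and updating a knowledge base can involve, the example below splits documents into chunks, encodes each chunk, and replaces a document's chunks when the document changes. The names `KnowledgeBase`, `upsert`, `chunk_text`, and `embed` are hypothetical, and the word-count `embed` function is only a stand-in for a real embedding model. Chunking is a common practice, not something this page prescribes.

```python
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy encoding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


class KnowledgeBase:
    """In-memory store mapping each document id to its (chunk, vector) pairs."""

    def __init__(self) -> None:
        self.entries: dict[str, list[tuple[str, Counter]]] = {}

    def upsert(self, doc_id: str, text: str, max_words: int = 40) -> None:
        # Overwriting the entry drops chunks left over from an older version.
        self.entries[doc_id] = [(c, embed(c)) for c in chunk_text(text, max_words)]

    def chunks(self) -> list[str]:
        """Return every stored chunk, in insertion order of the documents."""
        return [c for pairs in self.entries.values() for c, _ in pairs]


if __name__ == "__main__":
    kb = KnowledgeBase()
    # Hypothetical document; the second call simulates a later update.
    kb.upsert("pricing", "The basic plan costs 10 dollars per month.")
    kb.upsert("pricing", "The basic plan costs 12 dollars per month.")
    print(kb.chunks())
```

Because the update is a plain overwrite, later retrievals see only the current text, with no model retraining involved.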

The following diagram shows the conceptual flow of using RAG with LLMs once the knowledge base has been prepared. The flow has four steps, described below the diagram.

[Illustration: Retrieval Augmented Generation (RAG) flow]

User: Encode the input text using a language model such as GPT-J or Amazon Titan Embeddings.

Knowledge base: Retrieve relevant examples from the knowledge base that match the input. These examples are encoded in the same way as the input.

FM: Provide the enhanced prompt, containing the question and the retrieved context, to the foundation model to generate a response.

Answer: The generated response is conditioned on both the input and the retrieved examples, so it can incorporate information from multiple relevant examples.
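
The four steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not an AWS implementation: `encode` uses word counts instead of a real embedding model, cosine similarity is one common way to match encoded text, and `fake_foundation_model` is a placeholder where a real model call would go. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Callable


def encode(text: str) -> Counter:
    """Step 1 (User): toy encoding by word counts, standing in for an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Step 2 (Knowledge base): keep the k entries most similar to the query."""
    q = encode(query)
    # Entries are encoded with the same function as the query.
    ranked = sorted(knowledge_base, key=lambda doc: cosine(q, encode(doc)), reverse=True)
    return ranked[:k]


def enhance_prompt(query: str, examples: list[str]) -> str:
    """Step 3 (FM input): combine the question with the retrieved context."""
    context = "\n".join(f"- {e}" for e in examples)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def fake_foundation_model(prompt: str) -> str:
    """Placeholder for a real foundation model call (assumption, not a real API)."""
    return f"(model output for a {len(prompt)}-character prompt)"


def answer(query: str, knowledge_base: list[str], k: int = 2,
           model: Callable[[str], str] = fake_foundation_model) -> str:
    """Step 4 (Answer): the response depends on the query and the retrieved examples."""
    return model(enhance_prompt(query, retrieve(query, knowledge_base, k)))


if __name__ == "__main__":
    kb = [
        "RAG retrieves documents from a knowledge base.",
        "Fine-tuning changes model weights.",
        "Bananas are yellow.",
    ]
    print(retrieve("Does fine-tuning change the weights?", kb, k=1))
    print(answer("Does fine-tuning change the weights?", kb, k=1))
```

Passing a different `model` callable is where a real foundation model client would be plugged in. The retrieval steps stay the same.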



Copyright 1999-2026 by Refsnes Data. All Rights Reserved. W3Schools is Powered by W3.CSS.