
AWS Foundation Model Types


Model Types

Foundation models (FMs) can be grouped into several categories.

Two of the most frequently used models are text-to-text models and text-to-image models.

In this lesson, you learn more about each of these types of models.


Text-to-text models


Text-to-text models are large language models (LLMs) pretrained on vast quantities of textual data to process human language.

These large foundation models can summarize text, extract information, respond to questions, create content (such as blogs or product descriptions), and more.
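
As a rough illustration of a text-to-text task, the following sketch asks a model to summarize a passage through the Amazon Bedrock Converse API using boto3. This is a minimal sketch, not part of the lesson: the region, prompt, and model ID are placeholder assumptions, and Bedrock access must already be enabled in your AWS account.

# Minimal sketch: summarization with a text-to-text model via Amazon Bedrock.
# The model ID and region below are examples only.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

prompt = "Summarize in one sentence: Foundation models are large, pretrained models that can be adapted to many tasks."

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    messages=[{"role": "user", "content": [{"text": prompt}]}],
)

print(response["output"]["message"]["content"][0]["text"])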


Natural Language Processing (NLP)


NLP is a machine learning technology that gives machines the ability to interpret and manipulate human language.

NLP does this by analyzing the content, intent, and sentiment of a message and responding to human communication.

Typically, NLP implementation begins by gathering and preparing unstructured text or speech data from different sources and processing the data.

It uses techniques such as tokenization, stemming, lemmatization, stop word removal, part-of-speech tagging, named entity recognition, speech recognition, sentiment analysis, and so on.

However, modern LLMs don't require these intermediate preprocessing steps.
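
To see what a few of these classic preprocessing steps look like in practice, the following sketch runs tokenization, stop word removal, stemming, lemmatization, and part-of-speech tagging with the NLTK library. The sample sentence is illustrative, and the resource names passed to nltk.download can vary between NLTK versions.

# A small sketch of classic NLP preprocessing steps with NLTK.
# Modern LLMs skip most of these steps; this only illustrates the traditional pipeline.
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# Download required resources once (names may differ slightly by NLTK version).
for pkg in ("punkt", "stopwords", "wordnet", "averaged_perceptron_tagger"):
    nltk.download(pkg, quiet=True)

text = "The runners were running quickly through the crowded streets."

tokens = nltk.word_tokenize(text)                          # tokenization
tokens = [t for t in tokens if t.isalpha()]                # keep words only
stops = set(stopwords.words("english"))
filtered = [t for t in tokens if t.lower() not in stops]   # stop word removal

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
print([stemmer.stem(t) for t in filtered])                 # stemming
print([lemmatizer.lemmatize(t) for t in filtered])         # lemmatization
print(nltk.pos_tag(filtered))                              # part-of-speech tagging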


Recurrent neural network (RNN)


RNNs use a memory mechanism to store and apply data from previous inputs.

This mechanism makes RNNs effective for sequential data and tasks, such as natural language processing, speech recognition, or machine translation.

However, RNNs also have limitations.

They're slow and complex to train, and because they process input sequentially, their training can't be parallelized.
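
A minimal sketch with PyTorch (input and hidden sizes are illustrative) shows the memory mechanism: the hidden state is carried from one time step to the next, which is exactly the sequential dependency that prevents training parallelization.

# Minimal sketch of a recurrent network processing a sequence step by step.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=16, hidden_size=32, batch_first=True)

batch = torch.randn(4, 10, 16)   # 4 sequences, 10 time steps, 16 features each
output, hidden = rnn(batch)      # the hidden state is updated one step at a time

print(output.shape)  # torch.Size([4, 10, 32]) - one hidden vector per time step
print(hidden.shape)  # torch.Size([1, 4, 32])  - final hidden state per sequence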


Transformer


A transformer is a deep-learning architecture that has an encoder component that converts the input text into embeddings.

It also has a decoder component that consumes the embeddings to emit some output text.

Unlike RNNs, transformers are highly parallelizable: instead of processing words one at a time during the learning cycle, they process the entire input at once.

As a result, transformers take significantly less time to train, although they require more computing power.

The transformer architecture was the key to the development of LLMs.

These days, most LLMs contain a decoder component.
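
As a hedged illustration of a decoder-only transformer, the following sketch generates text with GPT-2 through the Hugging Face transformers library. The model choice and generation settings are assumptions for demonstration only; production LLMs are far larger, but they belong to the same architectural family.

# Minimal sketch: text generation with a small decoder-only transformer (GPT-2).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

result = generator("Foundation models are", max_new_tokens=20, num_return_sequences=1)
print(result[0]["generated_text"])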


Text-to-image models


Text-to-image models take natural language input and produce a high-quality image that matches the input text description.

Some examples of text-to-image models are DALL-E 2 from OpenAI, Imagen from the Google Research Brain Team, Stable Diffusion from Stability AI, and Midjourney.
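
A minimal sketch, assuming the Hugging Face diffusers library, an example Stable Diffusion checkpoint, and a CUDA GPU, shows how such a model turns a prompt into an image.

# Minimal sketch: text-to-image generation with Stable Diffusion via diffusers.
# The checkpoint name and GPU assumption are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

image = pipe("a watercolor painting of a lighthouse at sunset").images[0]
image.save("lighthouse.png")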

To learn more about how text-to-image models work, review the diffusion architecture described below.


Diffusion architecture


Diffusion is a deep learning architecture that learns through a two-step process.

The first step is called forward diffusion.

Using forward diffusion, the system gradually adds small amounts of noise to an input image until only noise remains.

A U-Net model tracks and predicts the noise level at each step.

In the subsequent reverse diffusion step, the noisy image is gradually denoised until a new image is generated.

During training, the model is also fed the text description, which is encoded and combined with the image vector.
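
The following toy sketch illustrates only the forward diffusion idea: a small, fixed amount of Gaussian noise is mixed into an image tensor over many steps until almost nothing but noise remains. The step count and noise amount are arbitrary assumptions, and no U-Net or learned noise schedule is involved.

# Toy sketch of forward diffusion: repeatedly mix noise into an image tensor.
import math
import torch

image = torch.rand(3, 64, 64)   # pretend input image, values in [0, 1]
num_steps = 1000
beta = 0.02                     # fixed per-step noise amount (toy schedule)

x = image
for t in range(num_steps):
    noise = torch.randn_like(x)
    # Each step keeps most of the signal and adds a little noise.
    x = math.sqrt(1.0 - beta) * x + math.sqrt(beta) * noise

print(x.mean().item(), x.std().item())  # close to pure Gaussian noise after many steps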


Large language models

Large language models are a subset of foundation models.

LLMs are trained on trillions of words across many natural language tasks.

LLMs can understand, learn, and generate text that’s nearly indistinguishable from text produced by humans.

LLMs can also engage in interactive conversations, answer questions, summarize dialogues and documents, and provide recommendations.

Because of their sheer size and the accelerated hardware they run on, LLMs can process vast amounts of textual data.

LLMs have a wide range of capabilities, such as creative writing for marketing, summarizing legal documents, preparing market research for financial teams, simulating clinical trials for healthcare, and writing code for software development.


Neural network layers

Transformer models effectively process natural language because they use neural networks to understand the nuances of human language.

These neural networks are computing systems modeled after the human brain.

Multiple layers of neural networks in a single LLM work together to process your input and generate output.


Embedding layer

The embedding layer converts your input text to vector representations called embeddings.

This layer captures complex relationships between the embeddings, so the model can understand the context of your input text.
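
A minimal sketch with PyTorch's nn.Embedding (the vocabulary size and embedding dimension are illustrative) shows how token IDs become dense vectors.

# Minimal sketch of an embedding layer: token IDs are mapped to dense vectors.
import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=10_000, embedding_dim=256)

token_ids = torch.tensor([[12, 345, 9, 781]])   # one sequence of 4 token IDs
vectors = embedding(token_ids)

print(vectors.shape)  # torch.Size([1, 4, 256]) - one 256-dim vector per token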


Feedforward layer

The feedforward layer contains several connected layers that transform the embeddings into richer, weighted representations.

This layer continues to contextualize the language and helps the model better understand the intent of the input text.
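
A minimal sketch of such a feedforward block in PyTorch, with illustrative dimensions and a GELU nonlinearity chosen as an assumption, shows two linear layers applied independently to each token's embedding.

# Minimal sketch of a position-wise feedforward block inside a transformer layer.
import torch
import torch.nn as nn

d_model, d_ff = 256, 1024
feedforward = nn.Sequential(
    nn.Linear(d_model, d_ff),   # expand each token's embedding
    nn.GELU(),                  # nonlinearity (choice is illustrative)
    nn.Linear(d_ff, d_model),   # project back to the model dimension
)

tokens = torch.randn(1, 4, d_model)   # batch of 1 sequence with 4 tokens
out = feedforward(tokens)
print(out.shape)                      # torch.Size([1, 4, 256])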


Attention mechanism

The attention mechanism lets the model focus on the most relevant parts of the input text.

This mechanism, a central part of the transformer architecture, helps the model produce more accurate output.
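
A minimal sketch of scaled dot-product attention, the standard formulation of this mechanism, with illustrative tensor shapes: each token scores every other token, and the scores become weights over the value vectors.

# Minimal sketch of scaled dot-product attention.
import math
import torch

def scaled_dot_product_attention(q, k, v):
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # similarity scores
    weights = torch.softmax(scores, dim=-1)                   # attention weights
    return weights @ v                                        # weighted sum of values

q = k = v = torch.randn(1, 4, 64)   # 1 sequence, 4 tokens, 64-dim vectors
out = scaled_dot_product_attention(q, k, v)
print(out.shape)                    # torch.Size([1, 4, 64])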


LLM use cases

LLMs have diverse applications:

  • Automate customer service with chatbots and virtual assistants
  • Draft articles, blogs, and marketing materials
  • Generate, review, and explain code for developers
  • Summarize patient records and clinical documentation
  • Translate documents and real-time conversations
  • Analyze and summarize reports, legal documents, and financial statements
  • Support creative writing like poetry and storytelling
  • Enable personalized recommendations in e-commerce, education, and media

The following four categories highlight key areas where LLMs deliver value:


1. Improve customer experiences

  • Chatbots and virtual assistants - Automate responses, provide instant answers and support.
  • Call analytics - Extract insights from contact center calls to boost loyalty.
  • Agent assist - AI tools support human agents in problem solving and decision-making.

2. Boost employee productivity

  • Conversational search - Quickly find and summarize information through natural language.
  • Code generation - Accelerate development with code suggestions.
  • Automated report generation - Generate financial reports and projections automatically.

3. Enhance creativity and content creation

  • Marketing - Create blog posts, social media updates, email newsletters.
  • Sales - Generate personalized emails and sales scripts.
  • Product development - Generate design prototypes and optimize based on feedback.
  • Media and entertainment - Create scripts and dialogues for movies, TV, games.
  • News generation - Generate articles and summaries from raw data.

4. Accelerate process optimization

  • Document processing - Extract and summarize data from documents.
  • Fraud detection - Learn fraud patterns to train robust detection systems.
  • Supply chain optimization - Evaluate scenarios to improve logistics and reduce costs.
