Mitigating Bias
AI models trained on biased data will likely reproduce those biases. Bias can appear in prompt engineering in two ways:
- Biased prompts: If prompts are built on assumptions (e.g., assuming all software developers are men), the AI produces biased results
- Biased models: Even with neutral prompts, AI models may produce biased results due to training data bias
Insufficient training data can also create bias. Content about underrepresented groups tends to receive low confidence scores, so toxicity filters and ranking algorithms often deprioritize it, excluding those groups even further. This creates a self-reinforcing cycle:
- Groups are unevenly represented in the real world
- As a result, there is a lack of sufficient training data for some groups
- Models inherently favor the groups they have more data about
- Bias carries over from the data into the model
- The model is deployed into applications
- Applications reinforce the bias the model was trained on, feeding back into the uneven representation that started the cycle
Bias mitigation techniques
Three techniques help mitigate bias in foundation models (FMs):
- Update the prompt: Explicit guidance reduces inadvertent bias at scale
- Enhance the dataset: Provide different pronouns and add diverse examples
- Use training techniques: Apply fair loss functions, red teaming, RLHF, and more
Update the prompt
Provide explicit guidance to reduce bias at scale. Text-to-image models often reproduce skin color and gender stereotypes. For example, the prompt "An image of a florist" may produce an image that reflects gender and race stereotypes.
You can use a few methods to mitigate bias in a model's output:
TIED Framework
The Text-to-Image Disambiguation (TIED) framework avoids ambiguity in prompts by asking clarifying questions:
Initial prompt: "An image of a florist"
Model asks a clarifying question: "Is the florist female or male? Young or old? What skin color?"
Disambiguated prompt: "An image of a young female florist with dark skin color"
Disambiguated prompts are more likely to mitigate bias in model output by being explicit about characteristics rather than letting the model assume defaults.
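This flow can be scripted. The sketch below assumes a hypothetical invoke_llm helper that sends a prompt to whatever text model you use (it is not part of any AWS SDK), and the prompt wording is illustrative:

```python
def invoke_llm(prompt: str) -> str:
    """Hypothetical helper: send a prompt to your text model and return its reply."""
    raise NotImplementedError("Connect this to your model provider.")

def disambiguate(initial_prompt: str) -> str:
    """Ask one clarifying question, collect the answer, and rewrite the prompt explicitly."""
    question = invoke_llm(
        "Ask one clarifying question about any ambiguous person attributes "
        f"(gender, age, skin color) in this image prompt: {initial_prompt}"
    )
    answer = input(f"{question}\nYour answer: ")  # e.g., "young, female, dark skin color"
    return invoke_llm(
        f"Rewrite the image prompt '{initial_prompt}' so that it explicitly states: {answer}"
    )

# disambiguate("An image of a florist") might return
# "An image of a young female florist with dark skin color".
```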
Text-to-image Ambiguity Benchmark (TAB)
TAB provides a schema in the prompt to ask clarifying questions with various options:
| Sentence | Options | Questions to ask |
|---|---|---|
| An image of a florist | The florist is a female; the florist is a male; the florist has dark skin color; the florist has light skin color; the florist is young; the florist is old | Is the florist a female? Is the florist a male? Does the florist have dark skin color? Does the florist have light skin color? Is the florist young? Is the florist old? |
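One way to carry such a schema in code is as a plain Python structure that pairs each ambiguous attribute with its options and the clarifying question to ask. The field names below are illustrative, not part of a published TAB format:

```python
# Illustrative TAB-style schema for the prompt "An image of a florist".
tab_schema = {
    "sentence": "An image of a florist",
    "attributes": [
        {"options": ["the florist is a female", "the florist is a male"],
         "question": "Is the florist a female or a male?"},
        {"options": ["the florist has dark skin color", "the florist has light skin color"],
         "question": "Does the florist have dark or light skin color?"},
        {"options": ["the florist is young", "the florist is old"],
         "question": "Is the florist young or old?"},
    ],
}

def clarifying_questions(schema: dict) -> list[str]:
    """Collect the clarifying questions to include alongside the prompt."""
    return [attribute["question"] for attribute in schema["attributes"]]

print(clarifying_questions(tab_schema))
```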
Clarify using few-shot learning
You can also have the model generate clarifying questions itself by using few-shot learning. Give the model a context along with example clarifying questions (a sketch of the prompt follows the example):
Question: Is the cat in the basket?
Context: The girl observes the boy standing next to the fireplace.
Question: Is the girl standing next to the fireplace?
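A sketch of how that few-shot prompt could be assembled in Python. The example pair and the new context are illustrative; in practice you would include several pairs:

```python
# Example (context, clarifying question) pairs shown to the model.
EXAMPLES = [
    ("The girl observes the boy standing next to the fireplace.",
     "Is the girl standing next to the fireplace?"),
]

def build_clarifying_prompt(new_context: str) -> str:
    """Assemble a few-shot prompt that asks the model for the next clarifying question."""
    shots = "\n\n".join(f"Context: {context}\nQuestion: {question}"
                        for context, question in EXAMPLES)
    return (
        "Write a clarifying question for the final context.\n\n"
        f"{shots}\n\nContext: {new_context}\nQuestion:"
    )

print(build_clarifying_prompt("A florist hands a bouquet to a customer."))
```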
Enhance the dataset
Mitigate bias by enhancing training datasets with different pronouns and diverse examples.
For models trained on text, use counterfactual data augmentation, which expands the training set with modified copies of existing examples (a sketch follows the table):
| Before | After |
|---|---|
| After a close reading, Dr. John Stiles was convinced. He diagnosed the disease quickly. | After a close reading, Dr. Akua Mansa was convinced. She diagnosed the disease quickly. |
| CEO and founder Richard Roe closed his last funding round with a goal of tripling the business. | CEO and founder Sofía Martínez closed her last funding round with a goal of tripling the business. |
| Nurse Mary Major cleaned up the patient's living quarters, then she took out the dirty dishes. | Nurse Mateo Jackson cleaned up the patient's living quarters, then he took out the dirty dishes. |
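A minimal sketch of this kind of augmentation, assuming a hand-written pronoun swap table. Real pipelines also swap names and titles and handle grammar more carefully; note that "her" is ambiguous (his/him), so this sketch always maps it to "his":

```python
import re

# Minimal pronoun swap table for building counterfactual copies of training sentences.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "him": "her", "himself": "herself", "herself": "himself"}

def counterfactual(text: str) -> str:
    """Return a copy of the text with gendered pronouns swapped."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = SWAPS[word.lower()]
        return replacement.capitalize() if word[0].isupper() else replacement

    pattern = r"\b(" + "|".join(SWAPS) + r")\b"
    return re.sub(pattern, swap, text, flags=re.IGNORECASE)

original = "After a close reading, the doctor was convinced. He diagnosed the disease quickly."
training_examples = [original, counterfactual(original)]  # keep both versions in the dataset
```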
For models trained on images, counterfactual data augmentation involves three steps (sketched after this list):
- Detect: Use image classification to detect people, objects, and backgrounds; compute summary statistics to detect imbalances
- Segment: Use segmentation to generate pixel maps of objects to replace
- Augment: Use image-to-image techniques to update images and equalize distributions
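A high-level sketch of those three steps. The detect_people, segment_person, and inpaint_person functions are hypothetical stand-ins for whatever classification, segmentation, and image-to-image models you actually use; they are not real library calls:

```python
from collections import Counter

# Hypothetical model wrappers -- replace with your own models.
def detect_people(image):
    """Return detected people, each as a dict with demographic attribute labels."""
    raise NotImplementedError

def segment_person(image, person):
    """Return a pixel mask covering the detected person."""
    raise NotImplementedError

def inpaint_person(image, mask, attribute):
    """Regenerate the masked region so the person shows the given attribute."""
    raise NotImplementedError

def rebalance_images(images):
    """Counterfactual augmentation for an image dataset: detect, segment, augment."""
    # 1. Detect: find people and compute summary statistics over their attributes.
    detections = [detect_people(image) for image in images]
    counts = Counter(attribute
                     for people in detections
                     for person in people
                     for attribute in person["attributes"])
    underrepresented = [a for a, n in counts.items() if n < max(counts.values()) / 2]

    augmented = []
    for image, people in zip(images, detections):
        for person in people:
            # 2. Segment: generate a pixel map of the person to replace.
            mask = segment_person(image, person)
            # 3. Augment: use image-to-image generation to equalize the distribution.
            for attribute in underrepresented:
                augmented.append(inpaint_person(image, mask, attribute))
    return images + augmented
```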
Use training techniques
Two techniques at the training level help mitigate bias:
Equalized odds to measure fairness
Equalized odds is a fairness criterion that equalizes the errors a model makes when predicting categorical outcomes for different groups.
- Model error rates: False Negative Rate (FNR) and False Positive Rate (FPR)
- Goal: match the True Positive Rate (TPR) and FPR across groups; because TPR = 1 − FNR, this is the same as matching both error rates (see the sketch below)
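A small numpy sketch that measures how far a classifier is from equalized odds by comparing TPR and FPR between two groups. The variable names and example arrays are illustrative:

```python
import numpy as np

def tpr_fpr(y_true, y_pred):
    """True positive rate (1 - FNR) and false positive rate for binary labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tpr = np.mean(y_pred[y_true == 1] == 1)
    fpr = np.mean(y_pred[y_true == 0] == 1)
    return tpr, fpr

def equalized_odds_gap(y_true, y_pred, group):
    """Largest TPR or FPR difference between two groups; 0 means equalized odds holds."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = [tpr_fpr(y_true[group == g], y_pred[group == g]) for g in np.unique(group)]
    (tpr_a, fpr_a), (tpr_b, fpr_b) = rates
    return max(abs(tpr_a - tpr_b), abs(fpr_a - fpr_b))

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]   # e.g., a protected attribute with two values
print(equalized_odds_gap(y_true, y_pred, group))
```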
Using fairness criteria as model objectives
You can optimize model training for performance as the sole objective, or optimize a combined objective that also includes the following (see the sketch after this list):
- Fairness
- Energy efficiency
- Inference time
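A conceptual sketch of a combined objective, where a weighted fairness penalty is added to the usual task loss. The penalty here is a simple gap in mean predicted scores between two groups, and the weight, penalty choice, and example data are illustrative; energy efficiency or inference time could be added as further weighted terms in the same way:

```python
import numpy as np

def task_loss(y_true, p_pred):
    """Standard performance objective: binary cross-entropy."""
    p = np.clip(p_pred, 1e-7, 1 - 1e-7)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def fairness_penalty(y_true, p_pred, group):
    """Gap in mean predicted score for positive examples between two groups
    (a soft proxy for the TPR gap used by equalized odds)."""
    positives = y_true == 1
    return abs(p_pred[positives & (group == 0)].mean()
               - p_pred[positives & (group == 1)].mean())

def combined_loss(y_true, p_pred, group, fairness_weight=1.0):
    """Performance objective plus a weighted fairness term."""
    return task_loss(y_true, p_pred) + fairness_weight * fairness_penalty(y_true, p_pred, group)

y_true = np.array([1, 0, 1, 1, 0, 1])
p_pred = np.array([0.9, 0.2, 0.4, 0.8, 0.3, 0.6])   # model scores
group  = np.array([0, 0, 0, 1, 1, 1])               # protected attribute
print(combined_loss(y_true, p_pred, group, fairness_weight=0.5))
```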