Measuring Agreeableness Bias in Multimodal Models

This repository contains the code and resources for replicating the experiments and analysis from our paper "Measuring Agreeableness Bias in Multimodal Models".

Table of Contents

  1. Introduction
  2. Repository Structure
  3. Setup
  4. Generating Experiments
  5. Evaluating Models
  6. Analyzing Results
  7. Models Tested
  8. Benchmarks
  9. Results
  10. Contributing
  11. License

Introduction

This project investigates the phenomenon of "agreeableness bias" in multimodal language models - the tendency of these models to disproportionately favor visually presented information, even when it contradicts their prior knowledge. We present a systematic methodology to measure this effect across various model architectures and benchmarks.

Repository Structure

  • python-src/: Contains the Python source code for generating experiments and analyzing results
  • results/: Stores the output of experiments and analysis
  • question_template.html: HTML template for rendering questions
  • plots/: Contains plots of probability deltas for GPT-4o-mini and LLaVA-1.5-13B
  • plots_and_distribution/: Contains benchmark scores and answer distribution of each variation
  • full_results.csv: Comprehensive results in CSV format

Setup

To set up the project environment:

  1. Ensure you have Python 3.10 installed on your system.

  2. Clone this repository:

git clone https://github.com/jasonlim131/looksRdeceiving.git
cd looksRdeceiving
  3. Create a virtual environment, e.g. "agreeable_venv":

python3.10 -m venv agreeable_venv

  4. Activate the environment in your terminal:

source /path/to/agreeable_venv/bin/activate

  5. Install the dependencies:

pip3.10 install -r requirements.txt

If you keep getting dependency errors, try:

  • pip installing the missing packages individually
  • uninstalling and reinstalling the offending package, pinned to the version listed in requirements.txt ('{package_name}==VERSION'), as shown below
  • deleting the virtual environment and creating a new one, making sure the Python version is correct
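For example, to reinstall a single pinned package (the package name and version below are placeholders; use the values from requirements.txt):

pip3.10 uninstall {package_name}
pip3.10 install '{package_name}==VERSION'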

Generating Experiments

To generate the experimental prompts:

  1. For vMMLU:
python python-src/generate_vmmlu.py 

This renders each prompt to the default path output_directory/vmmlu_{variation}_rendered/question{i}.png.

  2. For vSocialIQa:
python python-src/generate_vsocialiqa.py

This will create rendered prompts in:

  • output_directory/vmmlu_centered_{variation}_rendered
  • output_directory/vmmlu_{variation}_rendered
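Each generator script turns a multiple-choice question into an image via question_template.html. The following is only a rough sketch of that general idea, not the repository's actual template or rendering code; the inline template and the html2image dependency are assumptions for illustration:

# Illustrative sketch only: shows the general idea of rendering a
# multiple-choice question to a PNG. Requires the html2image package
# (pip install html2image) and a local Chrome/Chromium install.
from html2image import Html2Image

TEMPLATE = """
<html><body style="font-family: sans-serif; width: 600px;">
  <p>{question}</p>
  <ol type="A">
    <li>{a}</li><li>{b}</li><li>{c}</li><li>{d}</li>
  </ol>
</body></html>
"""

# Fill the template with one question and render it to a PNG
html = TEMPLATE.format(
    question="What is the capital of France?",
    a="Paris", b="London", c="Rome", d="Berlin",
)

hti = Html2Image(output_path="output_directory")
hti.screenshot(html_str=html, save_as="question0.png")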

Evaluating Models

To evaluate a model on a given benchmark, run the matching evaluation script:

python3.10 evaluate_{model}_{task_format}.py
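For example, for the GPT-4 family the vMMLU and vSocialIQa evaluations are run with the evaluate_gpt4 and evaluate_gpt4_social scripts mentioned under Models Tested (exact filenames may differ; check python-src/):

python3.10 evaluate_gpt4.py
python3.10 evaluate_gpt4_social.py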

Analyzing Results

To compute the bias metrics from the evaluation outputs, run:

python3.10 calculate_bias_{task_format}.py
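For example, assuming the bias scripts follow the same task-format naming as the evaluation scripts (the exact filename is an assumption; check python-src/):

python3.10 calculate_bias_vmmlu.py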

Models Tested

We evaluated agreeableness bias in the following models:

  • GPT-4o-mini (any other GPT model can be used with evaluate_gpt4 and evaluate_gpt4_social; just change the value of the MODEL variable, e.g. MODEL = "gpt-4o")
  • Claude 3 Haiku (evaluate_claude works with any of the Claude models, if you have enough credits)
  • Gemini 1.5 Flash
  • LLaVA-1.5-13B (vMMLU only)

These models were chosen for their balance of performance (70-80% accuracy on the full MMLU benchmark) and computational efficiency.

Benchmarks

  1. Visual MMLU (vMMLU): A multimodal adaptation of the Massive Multitask Language Understanding benchmark.
  2. Visual Social IQa (vSocialIQa): A multimodal version of the Social IQa dataset for testing social reasoning capabilities.

Contributing

We encourage further research in the following areas:

  • Model Expansion: Extend our benchmarks to additional architectures such as Flamingo, OFA, MiniGPT-4, and BLIP, and apply our analysis methodology to them to broaden our understanding of agreeableness bias across architectures.
  • Mechanistic Analysis: Conduct in-depth investigations of open-source models that exhibit visual bias, beginning with embedding projections to understand how visual and textual representations overlap.

We welcome contributions to this project! Please see our Contributing Guidelines for more information.

License

This project is licensed under the MIT License - see the LICENSE.txt file for details.
