This repository contains my research and implementations exploring various aspects of AI alignment. Below is an overview of its current contents.
- Description: My attempt at implementing the AI Safety Debate introduced in this paper.
- So far, I have experimented with GPT-4o and smaller open-source models; only GPT-4o has produced a working debate.
- I am currently building a web app that lets users try out the safety debate.
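
For orientation, here is a minimal sketch of the general shape of a two-sided debate loop. It is not the code in this repository: `run_debate`, `stub_debater`, and `stub_judge` are hypothetical names, and the stubs stand in for the model calls (for example to GPT-4o) that a real debate would make.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces, assumed for illustration only.
# A debater maps (question, assigned answer, transcript so far) to an argument.
Debater = Callable[[str, str, List[str]], str]
# A judge maps (question, full transcript) to the index (0 or 1) of the winner.
Judge = Callable[[str, List[str]], int]


def run_debate(
    question: str,
    answers: List[str],
    debaters: List[Debater],
    judge: Judge,
    rounds: int = 2,
) -> Tuple[int, List[str]]:
    """Alternate arguments between two debaters, then ask the judge for a verdict."""
    transcript: List[str] = []
    for r in range(rounds):
        for i, debater in enumerate(debaters):
            argument = debater(question, answers[i], transcript)
            transcript.append(f"Debater {i} (round {r + 1}): {argument}")
    winner = judge(question, transcript)
    return winner, transcript


def stub_debater(question: str, answer: str, transcript: List[str]) -> str:
    # Placeholder: a real debater would prompt a language model here.
    return f"I claim the answer is {answer!r}."


def stub_judge(question: str, transcript: List[str]) -> int:
    # Placeholder rule: the side with more total text wins, ties go to debater 0.
    totals = [0, 0]
    for line in transcript:
        speaker = int(line.split()[1])
        totals[speaker] += len(line)
    return 0 if totals[0] >= totals[1] else 1


winner, transcript = run_debate(
    "What is 2 + 2?", ["4", "5"], [stub_debater, stub_debater], stub_judge
)
for line in transcript:
    print(line)
print("Judge picked debater", winner)
```

Keeping the debaters and the judge as plain callables makes it easy to swap a stub for a model-backed function without touching the loop.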
- Description: My experiments with machine unlearning.
- So far, I have built a basic CNN that classifies images from the CIFAR10 dataset.
- My goal is to experiment with this base model. I hope to build a tool that visualizes which neural network connections "light up" for a given input.
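
As a rough illustration of what "lighting up" could mean, the sketch below records per-layer activations in a tiny fully connected network and lists the units that fire for an input. It is a toy under stated assumptions, not this project's CNN: the weights are made up, ReLU is applied to every layer for simplicity, and `forward_with_activations` and `active_units` are hypothetical helper names.

```python
from typing import List, Tuple

# A layer is (weights, biases); weights[j] is the weight row feeding unit j.
Layer = Tuple[List[List[float]], List[float]]


def relu(x: float) -> float:
    return max(0.0, x)


def forward_with_activations(layers: List[Layer], inputs: List[float]) -> List[List[float]]:
    """Run a forward pass and keep the post-ReLU output of every layer."""
    activations = []
    current = list(inputs)
    for weights, biases in layers:
        current = [
            relu(sum(w * x for w, x in zip(row, current)) + b)
            for row, b in zip(weights, biases)
        ]
        activations.append(current)
    return activations


def active_units(activations: List[List[float]], threshold: float = 0.0) -> List[List[int]]:
    """For each layer, return the indices of units whose activation exceeds threshold."""
    return [[j for j, a in enumerate(layer) if a > threshold] for layer in activations]


# Made-up weights for a 2-2-1 network, chosen only to keep the example small.
toy_layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-0.5]),
]
acts = forward_with_activations(toy_layers, [2.0, 1.0])
print(acts)
print(active_units(acts))
```

A visualization tool could take the per-layer lists returned here and map them to colors or line widths on a drawing of the network.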
- Clone this repository:
git clone https://github.com/ratch/alignment.git
- Navigate to the desired project directory:
cd <experiment>
- Follow the setup instructions in the respective project folder.
Feel free to reach out with questions or feedback!