MindGPT is a conversational system where users can ask mental health-oriented questions and receive answers that summarise content from two leading mental health websites: Mind and NHS Mental Health.
It's not a digital counsellor or therapist, and the output from the system should not be treated as such. MindGPT is purely focused on acting as a gateway to vital information sources, summarising two authoritative perspectives and providing pointers to the original content.
In building this, we've drawn on our expertise in MLOps and prior experience in fine-tuning open-source LLMs for various tasks (see here for an example of one tuned to summarise legal text). If you're interested in how MindGPT works under the hood and what technologies and data we've used, then take a look here.
Mental health problems are something that everyone struggles with at various points in their lives, and finding the right type of help and information can be hard, even a blocker. Mind, one of the main mental health charities in the UK, puts this best: when you're living with a mental health problem, or supporting someone who is, having access to the right information is vital.
MindGPT sets out to increase ease of access to this information.
This project is in active development at Fuzzy Labs and you can follow along!
The repository for this project is one way to monitor progress - we're always raising pull requests to move the project forward. Another is to follow our blog, where we'll post periodic updates as we tick off sections of the project. If you're new to the project, the best place to start is our project introduction.
Before running the data scraping pipeline, some additional setup is required: a storage container on Azure to store the data. We use DVC for data versioning and data management. The data versioning documentation guides you through setting up a storage container on Azure and configuring DVC.
In this pipeline, there are two steps:
- Scrape data from Mind and NHS Mental Health websites
- Store and version the scraped data in a storage container on Azure using DVC
Now that you're all set up, let's run the data scraping pipeline:

```bash
python run.py --scrape
```
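To give a flavour of what this step does under the hood, here's a minimal, hypothetical sketch of scraping a single page. The URL, libraries, and parsing logic are illustrative assumptions; the pipeline's actual scraping code lives in the repository:

```python
# Illustrative sketch only: the real pipeline's URLs, selectors, and output
# format are defined in the repository, not here.
import requests
from bs4 import BeautifulSoup

url = "https://www.nhs.uk/mental-health/"  # example page, not the full crawl
html = requests.get(url, timeout=30).text

soup = BeautifulSoup(html, "html.parser")
paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]

print(f"Scraped {len(paragraphs)} paragraphs")
```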
Now that we have data scraped, we're ready to prepare that data for the model. We've created a separate pipeline for this, where we clean, validate, and version the data.
We run the data preparation pipeline using the following command:

```bash
python run.py --prepare
```
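As a rough illustration of what cleaning and validation can look like (the actual rules and schema are defined in the pipeline; the column name below is an assumption):

```python
# Toy example of cleaning and validating scraped text; the pipeline's real
# rules and schema live in the repository.
import pandas as pd

df = pd.DataFrame({"text": ["  Anxiety is a feeling of unease.  ", "", "<p>Depression</p>"]})

# Clean: strip residual HTML tags and surrounding whitespace.
df["text"] = df["text"].str.replace(r"<[^>]+>", "", regex=True).str.strip()

# Validate: drop rows with no usable text.
df = df[df["text"].str.len() > 0].reset_index(drop=True)

print(df)
```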
To run the embedding pipeline, Azure Kubernetes Service (AKS) needs to be provisioned. We use AKS to run the Chroma (vector database) service.
The `matcha` tool can help you provision these resources. Install the `matcha-ml` library and provision resources using the `matcha provision` command:

```bash
pip install matcha-ml
matcha provision
```
After the provisioning completes, we will have the following resources:
- Kubernetes cluster on Azure
- Seldon Core installed on this cluster
- Istio ingress installed on this cluster
Next, we apply Kubernetes manifests to deploy the Chroma server on AKS using the following command:

```bash
kubectl apply -f infrastructure/chroma_server_k8s
```
Port-forward the Chroma server service to localhost using the following command. This will ensure we can access the server from localhost:

```bash
kubectl port-forward service/chroma-service 8000:8000
```
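To sanity-check the port-forward, you can ping the server from Python. This is a quick check, not part of the pipeline, and assumes a recent `chromadb` client:

```python
# Minimal connectivity check against the port-forwarded Chroma server.
# Assumes a recent `chromadb` client (pip install chromadb).
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
print(client.heartbeat())  # returns a timestamp if the server is reachable
```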
To run the monitoring service on Kubernetes, `matcha provision` must be run beforehand. We will need to build and push the metric service application image to ACR, where it will be used by the Kubernetes deployment. Before that, we need to set two bash variables, one for the ACR registry URI and another for the ACR registry name. We will use the `matcha get` command to do this:

```bash
acr_registry_uri=$(matcha get container-registry registry-url --output json | sed -n 's/.*"registry-url": "\(.*\)".*/\1/p')
acr_registry_name=$(matcha get container-registry registry-name --output json | sed -n 's/.*"registry-name": "\(.*\)".*/\1/p')
```

Now we're ready to log in to ACR, then build and push the image:

```bash
az acr login --name $acr_registry_name
docker build -t $acr_registry_uri/monitoring:latest -f monitoring/metric_service/Dockerfile .
docker push $acr_registry_uri/monitoring:latest
```
Line number 39 in monitoring-deployment.yaml should be updated to match the Docker image name we've just pushed to ACR; it will need to be in the following format: `<name-of-acr-registry>.azurecr.io/monitoring`.
Next, we apply the Kubernetes manifest to deploy the metric service and the metric database on AKS:

```bash
kubectl apply -f infrastructure/monitoring
```
Once `kubectl` has finished applying the manifest, we should verify that the monitoring service is running. Running the commands below will give you an IP address for the service, which we can then `curl` for a response:

```bash
kubectl get pods # Check whether the monitoring pod is running
# Expected output (note the name of the pod will differ)
NAME                                  READY   STATUS    RESTARTS   AGE
monitoring-service-588f644c49-ldjhf   2/2     Running   0          3d1h

kubectl get svc monitoring-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```
We should be able to `curl` the external IP returned by the above command on port 5000:

```bash
curl <external-ip>:5000
# The response should be:
Hello world from the metric service.
```
Now we know that everything is up and running, we need to use port-forwarding so the embedding pipeline (which is running locally) can communicate with the service hosted on Kubernetes:

```bash
kubectl port-forward service/monitoring-service 5000:5000
```
In the data embedding pipeline, we take the validated dataset from the data preparation pipeline and use the Chroma vector database to store embeddings of the text data. This pipeline uses both the Mind and NHS data.

Finally, in a separate terminal, we can run the data embedding pipeline:

```bash
python run.py --embed
```

Note: this pipeline might take between 5 and 10 minutes to run.
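Conceptually, the embedding step boils down to something like the sketch below; the collection name is an assumption, and the pipeline supplies its own embedding logic:

```python
# Illustrative sketch of storing text embeddings in Chroma. The collection
# name is assumed, and Chroma's default embedding function stands in for
# whatever the pipeline actually uses.
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_or_create_collection("mental_health_docs")  # assumed name

collection.add(
    documents=["An example passage from the validated dataset."],
    ids=["doc-0"],
)

print(collection.count())
```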
To deploy a pre-trained LLM, we first need a Kubernetes cluster with Seldon Core. The `matcha` tool can help you provision the required resources; see the section above on how to set this up.

Apply the prepared Kubernetes manifest to deploy the model:

```bash
kubectl apply -f infrastructure/llm_k8s/seldon-deployment.yaml
```
This will create a Seldon deployment, which consists of:
- A pod that loads the pipeline and the model for inference
- A service and ingress routing for accessing it from the outside
You can get the ingress IP with matcha:

```bash
matcha get model-deployer base-url
```

The full URL to query the model is:

```
http://<INGRESS_IP>/seldon/matcha-seldon-workloads/llm/v2/models/transformer/infer
```
The expected payload structure is as follows:

```json
{
    "inputs": [
        {
            "name": "array_inputs",
            "shape": [-1],
            "datatype": "string",
            "data": "Some prompt text"
        }
    ]
}
```
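Putting the URL and payload together, a query from Python might look like this sketch; the ingress IP is a placeholder to fill in from matcha's output:

```python
# Sketch of querying the deployed model via Seldon's V2 inference endpoint.
# Replace INGRESS_IP with the output of `matcha get model-deployer base-url`.
import requests

INGRESS_IP = "<INGRESS_IP>"  # placeholder
url = f"http://{INGRESS_IP}/seldon/matcha-seldon-workloads/llm/v2/models/transformer/infer"

payload = {
    "inputs": [
        {
            "name": "array_inputs",
            "shape": [-1],
            "datatype": "string",
            "data": "Some prompt text",
        }
    ]
}

response = requests.post(url, json=payload, timeout=60)
response.raise_for_status()
print(response.json())
```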
To run the monitoring service on your local machine, we'll utilise docker-compose. This will initialise two services: the metric service interface, which listens for POST and GET requests, and the metric database service.

To run docker-compose:

```bash
docker-compose -f monitoring/docker-compose.yml up
```
Once the two containers have started, we can `curl` our metric service from the outside:

```bash
curl localhost:5000/
# This should return a default message saying "Hello world from the metric service."

curl -X POST localhost:5000/readability -H "Content-Type: application/json" -d '{"response": "test_response"}'
# This should compute a readability score and insert the score into the "Readability" relation. We should also expect the following response message:
{"message":"Readability data has been successfully inserted.","score":36.62,"status_code":200}

curl -X POST localhost:5000/embedding_drift -H "Content-Type: application/json" -d '{"reference_dataset": "1.1", "current_dataset": "1.2", "distance": 0.1, "drifted": true}'
# This should insert the embedding drift data into our "EmbeddingDrift" relation and return a success message. If a required field such as "reference_dataset" were missing, we'd instead see a validation error:
{"message":"Validation error: 'reference_dataset is not found in the data dictionary.'","status_code":400}

# We can also query our database with:
curl localhost:5000/query_readability
```
We've created a notebook which accesses the monitoring service, fetches the metrics, and creates some simple plots showing the change over time.
This is a starting point for accessing the metrics, and we're planning to introduce a hosted dashboard version of these plots at some point in the future.
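As a rough sketch of what the notebook does (the response schema here is a guess; check the notebook for the actual field names):

```python
# Assumed sketch: fetch readability metrics and plot them over time. The
# "time_stamp" and "score" field names are guesses at the response schema.
import requests
import matplotlib.pyplot as plt

records = requests.get("http://localhost:5000/query_readability", timeout=10).json()

times = [r["time_stamp"] for r in records]
scores = [r["score"] for r in records]

plt.plot(times, scores, marker="o")
plt.xlabel("Timestamp")
plt.ylabel("Readability score")
plt.title("Readability over time")
plt.show()
```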
To deploy the Streamlit application on AKS, we first need to build a Docker image and then push it to ACR.

Note: Run the following commands from the root of the project. You can verify that you're in the right place with `pwd`:

```bash
pwd
# Expected output, e.g.:
/home/username/MindGPT
```
We build and push the Streamlit application image to ACR, where it will be used by the Kubernetes deployment. Before that, we need to set two bash variables, one for the ACR registry URI and another for the ACR registry name. We will use the `matcha get` command to do this:

```bash
acr_registry_uri=$(matcha get container-registry registry-url --output json | sed -n 's/.*"registry-url": "\(.*\)".*/\1/p')
acr_registry_name=$(matcha get container-registry registry-name --output json | sed -n 's/.*"registry-name": "\(.*\)".*/\1/p')
```

Now we're ready to log in to ACR, then build and push the image:

```bash
az acr login --name $acr_registry_name
docker build -t $acr_registry_uri/mindgpt:latest -f app/Dockerfile .
docker push $acr_registry_uri/mindgpt:latest
```
Line number 19 in streamlit-deployment.yaml should be updated to match the Docker image name we've just pushed to ACR; it will need to be in the following format: `<name-of-acr-registry>.azurecr.io/mindgpt`.
Next, we apply the Kubernetes manifest to deploy the Streamlit application on AKS:

```bash
kubectl apply -f infrastructure/streamlit_k8s
```
Finally, we verify the Streamlit application. The command below should return an IP address for the Streamlit application:

```bash
kubectl get service streamlit-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```

If you visit that IP address in a browser, you should be able to interact with the deployed Streamlit application.
This project wouldn't be possible without the exceptional content on both the Mind and NHS Mental Health websites.
This project is released under the Apache License. See LICENSE.