Merge pull request #7 from ruivieira/odhv2
Add ODHv2 guide
ruivieira authored Nov 3, 2023
2 parents 920b3e4 + c267fbf commit ac5a093
Showing 12 changed files with 114 additions and 137 deletions.
Binary file added Writerside/images/odh_V2.png
4 changes: 4 additions & 0 deletions Writerside/redirection-rules.xml
@@ -22,4 +22,8 @@
<description>Created after removal of "Metrics API" from TrustyAI</description>
<accepts>Metrics-API.html</accepts>
</rule>
<rule id="3ab6c883">
<description>Created after removal of "Starter" from TrustyAI</description>
<accepts>Starter.html</accepts>
</rule>
</rules>
5 changes: 3 additions & 2 deletions Writerside/t.tree
@@ -21,19 +21,20 @@
<toc-element topic="TrustyAI-operator.md">
</toc-element>
<toc-element topic="TrustyAI-service.md">
</toc-element>
</toc-element>
<toc-element topic="How-to.md">
<toc-element topic="Install-on-Open-Data-Hub.md"/>
<toc-element topic="Installing-on-Kubernetes.md"/>
<toc-element topic="How-to-schedule-a-metric.md"/>
</toc-element>
<toc-element topic="Tutorial.md">
<toc-element topic="Bias-Monitoring-via-TrustyAI-in-ODH.md"/>
</toc-element>
<toc-element topic="Reference.md">
<toc-element topic="TrustyAI-service-API.md">
<toc-element topic="REST-API-reference.md"/>
<toc-element topic="Configuration.md"/>
</toc-element>
</toc-element>
</instance-profile>
8 changes: 4 additions & 4 deletions Writerside/topics/Bias-Monitoring-via-TrustyAI-in-ODH.md
@@ -26,16 +26,16 @@ use the following information about the applicant to make their prediction:
* Length of Employment (in days)

What we want to verify is that neither of our models is biased over the gender field of `Is Male-Identifying?`. To do this,
we will monitor our models with the [Statistical Parity Difference](Statistical-Parity-Difference.md) (SPD) metric, which measures the difference between how often
male-identifying and non-male-identifying applicants are given favorable predictions (_i.e._, they are predicted
to pay back their loans). Ideally, the SPD value would be 0, indicating that both groups have an equal likelihood of getting a good outcome. However, an SPD value between -0.1 and 0.1 is also indicative of reasonable fairness,
indicating that the two groups' rates of getting good outcomes only vary by ±10%.
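As a worked example with purely hypothetical counts (say 70 of 100 male-identifying and 63 of 100 non-male-identifying applicants are predicted to repay), SPD is the favorable-prediction rate of the unprivileged group minus that of the privileged group:

```shell
# Hypothetical counts: SPD = rate(non-male-identifying) - rate(male-identifying).
awk 'BEGIN {
  spd = (63 / 100) - (70 / 100)
  printf "SPD = %.2f -> %s\n", spd, ((spd >= -0.1 && spd <= 0.1) ? "reasonably fair" : "potentially biased")
}'
```

Here the result, -0.07, falls inside the ±0.1 band described above, so these hypothetical predictions would count as reasonably fair.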


## Setup
Follow the instructions within the [installation section](Install-on-Open-Data-Hub.md).
Afterwards, you should have an ODH installation, a [TrustyAI operator](TrustyAI-operator.md), and a `model-namespace` project containing
an instance of the [TrustyAI service](TrustyAI-service.md).

## Deploy Models
1) Navigate to the `model-namespace` created in the setup section: `oc project model-namespace`
46 changes: 46 additions & 0 deletions Writerside/topics/Configuration.md
@@ -0,0 +1,46 @@
# Configuration

## Data sources

### Metrics data

Storage backend adapters implement the `Storage` interface, which is responsible
for reading the data from a specific storage type (flat file on PVC, S3, database, _etc_.)
and returning the inputs and outputs as a `ByteBuffer`.
From there, the service converts the `ByteBuffer` into a TrustyAI `Dataframe` to be used
in the metrics calculations.

The type of data source is specified with the environment variable `SERVICE_STORAGE_FORMAT`.

The supported data sources are:

| Type | Storage property |
|-------------------------------------------|------------------|
| MinIO | `MINIO` |
| Kubernetes Persistent Volume Claims (PVC) | `PVC` |
| Memory | `MEMORY` |

The data can be batched into the latest `n` observations by using the configuration key
`SERVICE_BATCH_SIZE=n`. This behaves like an `n`-sized tail and is optional.
If not specified, the entire dataset is used.
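As an illustrative sketch, the two keys above can be set as plain environment variables for a local run (the values here are assumptions, not defaults):

```shell
# Select the PVC storage backend and tail the metrics data to the
# latest 5000 observations (SERVICE_BATCH_SIZE is optional).
export SERVICE_STORAGE_FORMAT=PVC
export SERVICE_BATCH_SIZE=5000
echo "storage=${SERVICE_STORAGE_FORMAT} batch=${SERVICE_BATCH_SIZE}"
```

Omitting `SERVICE_BATCH_SIZE` makes the service use the entire dataset.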

## Caching

The configuration variables include:

| Environment variable | Values | Default | Purpose |
|-------------------------|----------------|---------|---------------------------------------------------------------------------|
| `QUARKUS_CACHE_ENABLED` | `true`/`false` | `true` | Enables data fetching and metric calculation caching. Enabled by default. |


## Kubernetes and OpenShift Deployment

To deploy in Kubernetes or OpenShift, the connection information can be passed into the manifest using the `ConfigMap`.
For more information, see the [Kubernetes](Installing-on-Kubernetes.md) and [OpenShift](Install-on-Open-Data-Hub.md) installation guides.

<seealso style="links">
<category ref="related">
<a href="Install-on-Open-Data-Hub.md">Installing on Open Data Hub</a>
<a href="Installing-on-Kubernetes.md">Installing on Kubernetes</a>
</category>
</seealso>
2 changes: 1 addition & 1 deletion Writerside/topics/How-to-schedule-a-metric.md
@@ -1,4 +1,4 @@
# Schedule a metric

A How-to article is an action-oriented type of document.
It explains how to perform a specific task or solve a problem, and usually contains a sequence of steps.
61 changes: 48 additions & 13 deletions Writerside/topics/Install-on-Open-Data-Hub.md
@@ -22,7 +22,7 @@ blank cluster, you will be left with:
<p>Make sure you are <code>oc login</code>'d to your OpenShift cluster</p>
</step>
<step>
<p>Create two projects, <code>$ODH</code> and <code>$PROJECT</code>.</p>
<p>These names are arbitrary, but I'll be using them throughout the rest of this demo</p>
<code-block lang="shell">
oc create project $ODH
@@ -37,7 +37,7 @@ blank cluster, you will be left with:
</step>
</procedure>

To enable ODH's monitoring stack, <code>user-workload-monitoring</code> must be configured.

<procedure title="Enable User-Workload-Monitoring" id="enable-user-workload-monitoring">
<step>
@@ -59,11 +59,11 @@ your cluster management UI (for example, on console.redhat.com)

<procedure title="Install ODH Operator" id="install-odh-operator">
<step>
<p>From the OpenShift Console, navigate to <ui-path>Operators | OperatorHub</ui-path>, and search for <ui-path>Open Data Hub</ui-path></p>
<img src="odh_operator_install.png" alt="ODH in OperatorHub" border-effect="line"/>
</step>
<step>
<p>Click on <ui-path>Open Data Hub Operator</ui-path></p>
<list>
<li>If the "Show community Operator" warning opens, hit "Continue"</li>
<li>Hit "Install"</li>
@@ -81,6 +81,48 @@ your cluster management UI (for example, on console.redhat.com)
</step>
</procedure>

## ODH v2

<note>
<p>Since ODH 2.3.0, TrustyAI is included as an ODH component.</p>
<p>For versions prior to 2.3.0, use the ODH v1 method.</p>
</note>

If the provided ODH version in your cluster's OperatorHub is version 2.3.0+, use the following steps:

### Install ODH (ODH v2.x)

<procedure title="Install ODH v2" id="install-odh-v2">
<step>
<p>Navigate to your <code>opendatahub</code> project</p>
</step>
<step>
<p>From "Installed Operators", select "Open Data Hub Operator"</p>
</step>
<step>
<p>Navigate to the "Data Science Cluster" tab and hit "Create DataScienceCluster"</p>
</step>
<step>
<p>In the YAML view, make sure <code>trustyai</code> is set to <code>Managed</code></p>
<img src="odh_V2.png" alt="DataScienceCluster YAML view with the trustyai component set to Managed" border-effect="line"/>
</step>
<step>
<p>Hit the "Create" button</p>
</step>
<step>
<p>Within the "Pods" menu, you should begin to see various ODH components being created, including the <code>trustyai-service-operator-controller-manager-xxx</code></p>
</step>
</procedure>

### Install a TrustyAI service

<procedure title="Install a TrustyAI service" id="install-trustyai-service">
<step>Navigate to your <code>model-namespace</code> project: <code>oc project model-namespace</code></step>
<step>Run <code>oc apply -f resources/trustyai_crd.yaml</code>. This will install the TrustyAI Service
into your <code>model-namespace</code> project, which will then provide TrustyAI features to all subsequent models deployed into that project, such as explainability, fairness monitoring, and data drift monitoring.
</step>
</procedure>
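In shell terms, the procedure above amounts to something like the following (a sketch: it assumes you are logged into a live cluster and uses the `resources/trustyai_crd.yaml` path from this guide):

```shell
# Switch to the model namespace and create the TrustyAI service custom resource.
oc project model-namespace
oc apply -f resources/trustyai_crd.yaml
# A trustyai-service pod should appear alongside any deployed models.
oc get pods -n model-namespace
```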

## ODH v1

<note>
@@ -92,10 +134,9 @@ your cluster management UI (for example, on console.redhat.com)
<p>Navigate to your <code>$ODH</code> project</p>
</step>
<step>
<p>Go to <ui-path>Installed Operators | Open Data Hub Operator | KfDef</ui-path></p>
</step>
<step>
<list>
<li>Hit "Create KfDef"</li>
<li>Hit "Create" without making any changes to the default configuration</li>
@@ -126,10 +167,4 @@ your cluster management UI (for example, on console.redhat.com)
into your <code>$PROJECT</code> project, which will then provide TrustyAI features to all subsequent models deployed into
that project, such as explainability, fairness monitoring, and data drift monitoring</p>
</step>
</procedure>
3 changes: 3 additions & 0 deletions Writerside/topics/REST-API-reference.md
@@ -0,0 +1,3 @@
# REST API reference

<api-doc openapi-path="./../specs/openapi.yml"></api-doc>
78 changes: 0 additions & 78 deletions Writerside/topics/Starter.md

This file was deleted.

2 changes: 2 additions & 0 deletions Writerside/topics/Statistical-Parity-Difference.md
@@ -0,0 +1,2 @@
# Statistical Parity Difference

3 changes: 2 additions & 1 deletion Writerside/topics/TrustyAI-service-API.md
@@ -1,3 +1,4 @@
# TrustyAI service

* Reference for the [REST API](REST-API-reference.md)
* Service [configuration](Configuration.md)
39 changes: 1 addition & 38 deletions Writerside/topics/TrustyAI-service.md
@@ -700,47 +700,10 @@ curl -X POST --location "http://localhost:8080/q/info" \
}'
```

# Data sources

## Metrics data

Storage backend adapters implement the `Storage` interface, which is responsible
for reading the data from a specific storage type (flat file on PVC, S3, database, _etc_.)
and returning the inputs and outputs as a `ByteBuffer`.
From there, the service converts the `ByteBuffer` into a TrustyAI `Dataframe` to be used
in the metrics calculations.

The type of data source is specified with the environment variable `SERVICE_STORAGE_FORMAT`.

The supported data sources are:

| Type | Storage property |
|-------------------------------------------|------------------|
| MinIO | `MINIO` |
| Kubernetes Persistent Volume Claims (PVC) | `PVC` |
| Memory | `MEMORY` |

The data can be batched into the latest `n` observations by using the configuration key
`SERVICE_BATCH_SIZE=n`. This behaves like an `n`-sized tail and is optional.
If not specified, the entire dataset is used.

# Deployment

## OpenShift

To deploy in Kubernetes or OpenShift, the connection information
can be passed into the manifest using the `ConfigMap` defined [here](manifests/opendatahub/base/trustyai-configmap.yaml).

The main manifest is available [here](manifests/opendatahub/default/trustyai-deployment.yaml).

The configuration variables include:

| Environment variable | Values | Default | Purpose |
|-------------------------|----------------|---------|---------------------------------------------------------------------------|
| `QUARKUS_CACHE_ENABLED` | `true`/`false` | `true` | Enables data fetching and metric calculation caching. Enabled by default. |

<seealso style="links">
<category ref="api">
<a href="TrustyAI-service-API.md">TrustyAI service REST API</a>
<a href="Configuration.md">TrustyAI service configuration</a>
</category>
</seealso>
