diff --git a/docs/index.md b/docs/index.md index d77a969..e0ef8ea 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,375 +1,121 @@ # LUME-model -LUME-model holds data structures used in the LUME modeling toolset. Variables and models built using LUME-model will be compatible with other tools. LUME-model uses [pydantic](https://pydantic-docs.helpmanual.io/) models to enforce typed attributes upon instantiation. +LUME-model holds data structures used in the LUME modeling toolset. Variables and models built using LUME-model will be compatible with other tools. LUME-model uses [Pydantic](https://pydantic-docs.helpmanual.io/) models to enforce typed attributes upon assignment. -## Requirements - -* Python >= 3.7 -* pydantic -* numpy - -## Install +## Installing LUME-model LUME-model can be installed with conda using the command: - -``` $ conda install lume-model -c conda-forge ``` - -## Developer - -A development environment may be created using the packaged `dev-environment.yml` file. - -``` -conda env create -f dev-environment.yml +```shell +conda install lume-model -c conda-forge ``` ## Variables -The lume-model variables are intended to enforce requirements for input and output variables by variable type. Current variable implementations are scalar (float) or image (numpy array) type. +The lume-model variables are intended to enforce requirements for input and output variables by variable type. For now, only scalar variables (floats) are supported. 
-Example of minimal implementation of scalar input and output variables: +Minimal example of scalar input and output variables: ```python from lume_model.variables import ScalarInputVariable, ScalarOutputVariable -input_variable = ScalarInputVariable(name="test_input", default=0.1, value_range=[1, 2]) -output_variable = ScalarOutputVariable(name="test_output") -``` - -Example of minimal implementation of image input and output variables: - -```python -from lume_model.variables import ImageInputVariable, ImageOutputVariable -import numpy as np - -input_variable = ImageInputVariable( - name="test_input", - default= np.array([[1, 2,], [3, 4]]), - value_range=[1, 10], - axis_labels=["count_1", "count_2"], - x_min=0, - y_min=0, - x_max=5, - y_max=5, -) - -output_variable = ImageOutputVariable( - name="test_output", - axis_labels=["count_1", "count_2"], +input_variable = ScalarInputVariable( + name="example_input", + default=0.1, + value_range=[0.0, 1.0], ) +output_variable = ScalarOutputVariable(name="example_output") ``` -All input variables may be made into constants by passing the `is_constant=True` keyword argument. Value assingments on these constant variables will raise an error message. - ## Models -LUME-model model classes are intended to guide user development while allowing for flexibility and customizability. The base class `lume_model.models.BaseModel` is used to enforce LUME tool compatable classes for the execution of trained models. For this case, model loading and execution should be organized into class methods. +The lume-model base class `lume_model.base.LUMEBaseModel` is intended to guide user development while allowing for flexibility and customizability. It is used to enforce LUME tool compatible classes for the execution of trained models. -Model Requirements: +Requirements for model classes: -* input_variables, output_variables: lume-model input and output variables are required for use with lume-epics tools. 
The user can optionally define these as class attributes or design the subclass so that these are passed during initialization . Names of all variables must be unique in order to be served using the EPICS tools. A utility function for saving these variables, which also enforces the uniqueness constraint, is provided (lume_model.utils.save_variables). -* evaluate: The evaluate method is called by the serving model. Subclasses must implement the method, accepting a list of input variables and returning a list of the model's output variables with value attributes updated based on model execution. +* input_variables: A list defining the input variables for the model. Variable names must be unique. Required for use with lume-epics tools. +* output_variables: A list defining the output variables for the model. Variable names must be unique. Required for use with lume-epics tools. +* evaluate: The evaluate method is called by the serving model. Subclasses must implement this method, accepting and returning a dictionary. 
-Example model implementation: +Example model implementation and instantiation: ```python -from lume_model.models import BaseModel - -class ExampleModel(BaseModel): - input_variables = { - "input1": ScalarInputVariable(name="input1", default=1, range=[0.0, 5.0]), - "input2": ScalarInputVariable(name="input2", default=2, range=[0.0, 5.0]), - } +from lume_model.base import LUMEBaseModel +from lume_model.variables import ScalarInputVariable, ScalarOutputVariable - output_variables = { - "output1": ScalarOutputVariable(name="output1"), - "output2": ScalarOutputVariable(name="output2"), - } - def evaluate(self, input_variables): +class ExampleModel(LUMEBaseModel): + def evaluate(self, input_dict): + output_dict = { + "output1": input_dict[self.input_variables[0].name] ** 2, + "output2": input_dict[self.input_variables[1].name] ** 2, + } + return output_dict - self.input_variables = input_variables - self.output_variables["output1"].value = ( - self.input_variables["input1"].value * 2 - ) - self.output_variables["output2"].value = ( - self.input_variables["input2"].value * 2 - ) +input_variables = [ + ScalarInputVariable(name="input1", default=0.1, value_range=[0.0, 1.0]), + ScalarInputVariable(name="input2", default=0.2, value_range=[0.0, 1.0]), +] +output_variables = [ + ScalarOutputVariable(name="output1"), + ScalarOutputVariable(name="output2"), +] - # return inputs * 2 - return self.output_variables.values() +m = ExampleModel(input_variables=input_variables, output_variables=output_variables) ``` -Variables can be loaded from a yaml file formatted as below: - -`my_variables.yml` +Models and variables can be saved and loaded from YAML files, e.g. 
`m.dump("example_model.yml")` writes the following to file ```yaml +model_class: ExampleModel input_variables: input1: - name: input1 - type: scalar - default: 1 - range: [0, 256] - + variable_type: scalar + default: 0.1 + is_constant: false + value_range: [0.0, 1.0] input2: - name: input2 - type: scalar - default: 2.0 - range: [0, 256] - + variable_type: scalar + default: 0.2 + is_constant: false + value_range: [0.0, 1.0] output_variables: - output1: - name: output1 - type: image - x_label: "value1" - y_label: "value2" - axis_units: ["mm", "mm"] - x_min: 0 - x_max: 10 - y_min: 0 - y_max: 10 - - output2: - name: output2 - type: scalar - - output3: - name: output3 - type: scalar -``` - -And subsequently loaded using: - -```python -from lume_model.utils import variables_from_yaml - -with open("my_variables.yml", "r") as f: - input_variables, output_variables = variables_from_yaml(f) -``` - -## Configuration files - -Models and variables may be constructed using a yaml configuration file. The configuration file consists of three sections: - -* model (optional, can alternatively pass a custom model class into the `model_from_yaml` method) -* input_variables -* output_variables - -The model section is used for the initialization of model classes. The `model_class` entry is used to specify the model class to initialize. The `model_from_yaml` method will attempt to import the specified class. Additional model-specific requirements may be provided. These requirements will be checked before model construction. Model keyword arguments may be passed via the config file or with the function kwarg `model_kwargs`. All models are assumed to accept `input_variables` and `output_variables` as keyword arguments. - -The below example outlines the specification for a model compatible with the `lume-model` keras/tensorflow toolkit. 
- -```yaml -model: - model_class: lume_model.keras.KerasModel - requirements: - tensorflow: 2.3.1 - args: - model_file: examples/files/iris_model.h5 - output_format: - type: softmax - -``` - -Variables are constructed the minimal data requirements for inputs/outputs. - -An example ScalarInputVariable: - -```yaml -input_variables: - SepalLength: - name: SepalLength - type: scalar - default: 4.3 - lower: 4.3 - upper: 7.9 - + output1: {variable_type: scalar} + output2: {variable_type: scalar} ``` -For image variables, default values must point to files associated with a default numpy array representation. The file import will be relative to PYTHONPATH. - -An example ImageInputVariable: - -```yaml -input_variables: - InputImage: - name: test - type: image - default: examples/files/example_input_image.npy - range: [0, 100] - x_min: 0 - x_max: 10 - y_min: 0 - y_max: 10 - axis_labels: ["x", "y"] - x_min_variable: xmin_pv - y_min_variable: ymin_pv - x_max_variable: xmax_pv - y_max_variable: ymax_pv - -``` - -## Keras/tensorflow toolkit - -At present, only the tensorflow v2 backend is supported for this toolkit. - -The `KerasModel` packaged in the toolkit will be compatible with models saved using the `keras.save_model()` method. - -### Development requirements - -* The model must be trained using the custom scaling layers provided in `lume_model.keras.layers` OR using preprocessing layers packaged with Keras OR the custom layers must be defined during build and made accessible during loading by the user. Custom layers are not supported out-of-the box by this toolkit. - -* The keras model must use named input layers such that the model will accept a dictionary input OR the `KerasModel` must be subclassed and the `format_input` and `format_output` member functions must be overwritten with proper formatting of model input from a dictionary mapping input variable names to values and proper output parsing into a dictionary, respectively. 
This will require use of the Keras functional API for model construction. - -An example of a model built using the functional API is given below: +and can be loaded by simply passing the file to the model constructor: ```python -from tensorflow import keras -import tensorflow as tf - -sepal_length_input = keras.Input(shape=(1,), name="SepalLength") -sepal_width_input = keras.Input(shape=(1,), name="SepalWidth") -petal_length_input = keras.Input(shape=(1,), name="PetalLength") -petal_width_input = keras.Input(shape=(1,), name="PetalWidth") -inputs = [sepal_length_input, sepal_width_input, petal_length_input, petal_width_input] -merged = keras.layers.concatenate(inputs) -dense1 = Dense(8, activation='relu')(merged) -output = Dense(3, activation='softmax', name="Species")(dense1) - -# Compile model -model = keras.Model(inputs=inputs, outputs=[output]) -optimizer = tf.keras.optimizers.Adam() -model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy']) - -``` - -Models built in this way will accept inputs in dictionary form mapping variable name to a numpy array of values. +from lume_model.base import LUMEBaseModel -### Configuration file -The KerasModel can be instantiated using the utility function `lume_model.utils.model_from_yaml` method. +class ExampleModel(LUMEBaseModel): + def evaluate(self, input_dict): + output_dict = { + "output1": input_dict[self.input_variables[0].name] ** 2, + "output2": input_dict[self.input_variables[1].name] ** 2, + } + return output_dict -KerasModel can be specified in the `model_class` of the model configuration. -```yaml -model: - model_class: lume_model.keras.KerasModel +m = ExampleModel("example_model.yml") ``` -Custom parsing will require a custom model class. - -## PyTorch Toolkit - -In the same way as the KerasModel, the PyTorchModel can also be loaded using the `lume_model.utils.model_from_yaml` method, specifying `PyTorchModel` in the `model_class` of the configuration file. 
- -```yaml -model: - kwargs: - model_file: tests/test_files/california_regression/california_regression.pt - model_class: lume_model.torch.PyTorchModel - model_info: tests/test_files/california_regression/model_info.json - output_format: - type: tensor - requirements: - torch: 1.12 -``` - -In addition to the model_class, we also specify the path to the pytorch model (saved using `torch.save()`) and additional information about the model through the `model_info.json` file such as the order of the feature names and outputs of the model: +## Developer -```json -{ - "train_input_mins": [ - 0.4999000132083893, - ... - -124.3499984741211 - ], - "train_input_maxs": [ - 15.000100135803223, - ... - -114.30999755859375 - ], - "model_in_list": [ - "MedInc", - ... - "Longitude" - ], - "model_out_list": [ - "MedHouseVal" - ], - "loc_in": { - "MedInc": 0, - ... - "Longitude": 7 - }, - "loc_out": { - "MedHouseVal": 0 - } -} +Clone this repository: +```shell +git clone https://github.com/slaclab/lume-model.git ``` -The `output_format` specification indicates which form the outputs of the model's `evaluate()` function should take, which may vary depending on the application. PyTorchModels working with the [LUME-EPICS](https://github.com/slaclab/lume-epics) service will require an `OutputVariable` type, while [Xopt](https://github.com/ChristopherMayes/Xopt) requires either a dictionary of float values or tensors as output. - -It is important to note that currently the **transformers are not loaded** into the model when using the `model_from_yaml` method. 
These need to be created separately and added either: - -* to the model's `kwargs` before instantiating - -```python -import torch -import json -from lume_model.torch import PyTorchModel - -# load the model class and kwargs -with open(f"california_variables.yml","r") as f: - yaml_model, yaml_kwargs = model_from_yaml(f, load_model=False) - -# construct the transformers -with open("normalization.json", "r") as f: - normalizations = json.load(f) - -input_transformer = AffineInputTransform( - len(normalizations["x_mean"]), - coefficient=torch.tensor(normalizations["x_scale"]), - offset=torch.tensor(normalizations["x_mean"]), -) -output_transformer = AffineInputTransform( - len(normalizations["y_mean"]), - coefficient=torch.tensor(normalizations["y_scale"]), - offset=torch.tensor(normalizations["y_mean"]), -) - -model_kwargs["input_transformers"] = [input_transformer] -model_kwargs["output_transformers"] = [output_transformer] - -model = PyTorchModel(**model_kwargs) +Create an environment lume-model-dev with all the dependencies: +```shell +conda env create -f dev-environment.yml ``` -* using the setters for the transformer attributes in the model. - -```python -# load the model -with open("california_variables.yml", "r") as f: - model = model_from_yaml(f, load_model=True) - -# construct the transformers -with open("normalization.json", "r") as f: - normalizations = json.load(f) - -input_transformer = AffineInputTransform( - len(normalizations["x_mean"]), - coefficient=torch.tensor(normalizations["x_scale"]), - offset=torch.tensor(normalizations["x_mean"]), -) -output_transformer = AffineInputTransform( - len(normalizations["y_mean"]), - coefficient=torch.tensor(normalizations["y_scale"]), - offset=torch.tensor(normalizations["y_mean"]), -) - -# use the model's setter to add the transformers. Here we use a tuple -# to tell the setter where in the list the transformer should be inserted. 
-# In this case because we only have one, we add them at the beginning -# of the lists. -model.input_transformers = (input_transformer, 0) -model.output_transformers = (output_transformer, 0) +Install as editable: +```shell +conda activate lume-model-dev +pip install --no-dependencies -e . ``` diff --git a/docs/models.md b/docs/models.md index 26f0f94..c66eb0a 100644 --- a/docs/models.md +++ b/docs/models.md @@ -1,8 +1,21 @@ # Models -::: lume_model.models - selection: +::: lume_model.base + options: members: - - BaseModel - rendering: - show_root_heading: false + - LUMEBaseModel + +::: lume_model.models.torch_model + options: + members: + - TorchModel + +::: lume_model.models.torch_module + options: + members: + - TorchModule + +::: lume_model.models.keras_model + options: + members: + - KerasModel diff --git a/docs/utils.md b/docs/utils.md index cd1f300..c34c852 100644 --- a/docs/utils.md +++ b/docs/utils.md @@ -1,5 +1,8 @@ # Utilities ::: lume_model.utils - rendering: - show_root_heading: false + options: + members: + - variables_as_yaml + - variables_from_dict + - variables_from_yaml diff --git a/docs/variables.md b/docs/variables.md index 4eebf41..03920a2 100644 --- a/docs/variables.md +++ b/docs/variables.md @@ -1,14 +1,11 @@ # Variables ::: lume_model.variables - selection: + options: members: + - Variable + - ScalarVariable + - InputVariable + - OutputVariable - ScalarInputVariable - ScalarOutputVariable - - ImageInputVariable - - ImageOutputVariable - - ArrayInputVariable - - ArrayOutputVariable - - TableVariable - rendering: - show_source: true diff --git a/examples/IrisTraining.ipynb b/examples/IrisTraining.ipynb deleted file mode 100644 index 2f69436..0000000 --- a/examples/IrisTraining.ipynb +++ /dev/null @@ -1,198 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# iris example" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from 
sklearn.datasets import load_iris\n", - "from tensorflow import keras\n", - "import tensorflow as tf\n", - "\n", - "from tensorflow.keras.models import Sequential\n", - "from tensorflow.keras.layers import Dense, Flatten\n", - "from tensorflow.keras.utils import to_categorical\n", - "from sklearn.preprocessing import LabelEncoder\n", - "import pandas as pd\n", - "iris = load_iris()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "iris[\"data\"][0].shape" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "data = pd.DataFrame(iris.data, columns=iris.feature_names)\n", - "data.columns = [\"SepalLength\", \"SepalWidth\", \"PetalLength\", \"PetalWidth\"]\n", - "\n", - "data[\"Species\"] = iris.target\n", - "data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_dataset = data.sample(frac=0.8,random_state=0)\n", - "test_dataset = data.drop(train_dataset.index)\n", - "train_labels = train_dataset.pop('Species')\n", - "test_labels = test_dataset.pop('Species')\n", - "train_dataset.keys()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "scrolled": true - }, - "outputs": [], - "source": [ - "# encode class values as integers\n", - "encoder = LabelEncoder()\n", - "encoder.fit(train_labels)\n", - "encoded_Y = encoder.transform(train_labels)\n", - "\n", - "# convert integers to dummy variables (i.e. 
one hot encoded)\n", - "dummy_y = to_categorical(encoded_Y)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "scrolled": true - }, - "outputs": [], - "source": [ - " # define model\n", - "def build_model():\n", - " # create model\n", - " sepal_length_input = keras.Input(shape=(1,), name=\"SepalLength\")\n", - " sepal_width_input = keras.Input(shape=(1,), name=\"SepalWidth\")\n", - " petal_length_input = keras.Input(shape=(1,), name=\"PetalLength\")\n", - " petal_width_input = keras.Input(shape=(1,), name=\"PetalWidth\")\n", - " inputs = [sepal_length_input, sepal_width_input, petal_length_input, petal_width_input]\n", - " merged = keras.layers.concatenate(inputs)\n", - " dense1 = Dense(8, activation='relu')(merged)\n", - " output = Dense(3, activation='softmax', name=\"Species\")(dense1)\n", - "\n", - " # Compile model\n", - " model = keras.Model(inputs=inputs, outputs=[output])\n", - " optimizer = tf.keras.optimizers.Adam()\n", - " model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])\n", - " return model\n", - "\n", - "model = build_model()\n", - "keras.utils.plot_model(model, \"my_first_model_with_shape_info.png\", show_shapes=True)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_stats = train_dataset.describe()\n", - "train_stats = train_stats.transpose()\n", - "train_stats" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_x = train_dataset.to_dict(\"series\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=15)\n", - "\n", - "history = model.fit(train_x, dummy_y, epochs=1000,\n", - " validation_split = 0.2, verbose=1, callbacks=[early_stop])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - 
"metadata": {}, - "outputs": [], - "source": [ - "model.save(\"files/iris_model.h5\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "model.input_names" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "model.output_names" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.9" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/examples/__init__.py b/examples/__init__.py deleted file mode 100644 index e69de29..0000000 diff --git a/examples/custom_model.ipynb b/examples/custom_model.ipynb new file mode 100644 index 0000000..a5258e9 --- /dev/null +++ b/examples/custom_model.ipynb @@ -0,0 +1,131 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "235c92cd-cc05-42b8-a516-1185eeac5f0c", + "metadata": {}, + "source": [ + "# Creating a Custom LUME-model\n", + "Custom models that are compatible with LUME tools can be created by inheriting from the `LUMEBaseModel`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "56725817-2b21-4bea-98b0-151dea959f77", + "metadata": {}, + "outputs": [], + "source": [ + "from lume_model.base import LUMEBaseModel\n", + "from lume_model.variables import ScalarInputVariable, ScalarOutputVariable" + ] + }, + { + "cell_type": "markdown", + "id": "79c62b18-7dc1-44ca-b578-4dea5cc4a4b4", + "metadata": {}, + "source": [ + "## Model Definition\n", + "The minimum requirement for creating a custom LUME-model is to implement the abstract `evaluate` method inherited from `LUMEBaseModel`. Here, we simply return the squared input." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "f96d9863-269c-49d8-9671-cc73a783bcbc", + "metadata": {}, + "outputs": [], + "source": [ + "class ExampleModel(LUMEBaseModel):\n", + " def evaluate(self, input_dict):\n", + " output_dict = {\n", + " \"output1\": input_dict[self.input_variables[0].name] ** 2,\n", + " \"output2\": input_dict[self.input_variables[1].name] ** 2,\n", + " }\n", + " return output_dict" + ] + }, + { + "cell_type": "markdown", + "id": "868fff4d-1f46-48e2-8bd0-c9d831df79e6", + "metadata": {}, + "source": [ + "## Model Instantiation and Execution\n", + "Instantiation requires specification of the input and output variables of the model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "97946e64-062d-47d4-8d0c-d7e02a335a56", + "metadata": {}, + "outputs": [], + "source": [ + "input_variables = [\n", + " ScalarInputVariable(name=\"input1\", default=0.1, value_range=[0.0, 1.0]),\n", + " ScalarInputVariable(name=\"input2\", default=0.2, value_range=[0.0, 1.0]),\n", + "]\n", + "output_variables = [\n", + " ScalarOutputVariable(name=\"output1\"),\n", + " ScalarOutputVariable(name=\"output2\"),\n", + "]\n", + "\n", + "m = ExampleModel(input_variables=input_variables, output_variables=output_variables)" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "50aae4be-0d6e-456f-83e8-3a84d6d78f84", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'output1': 0.09, 'output2': 0.36}" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "input_dict = {\n", + " \"input1\": 0.3,\n", + " \"input2\": 0.6,\n", + "}\n", + "m.evaluate(input_dict)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6a547f3c-1706-4b32-bab6-9687627f6a78", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python [conda env:lume-model-dev]", + "language": "python", + "name": "conda-env-lume-model-dev-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.18" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/examples/files/example_input_image.npy b/examples/files/example_input_image.npy deleted file mode 100644 index 9924b13..0000000 Binary files a/examples/files/example_input_image.npy and /dev/null differ diff --git a/examples/files/iris_config.yml b/examples/files/iris_config.yml deleted file mode 100644 index 7f0ce6f..0000000 --- 
a/examples/files/iris_config.yml +++ /dev/null @@ -1,39 +0,0 @@ - -model: - model_class: lume_model.keras.KerasModel - requirements: - tensorflow: 2.3.1 - kwargs: - model_file: examples/files/iris_model.h5 - output_format: - type: softmax - -input_variables: - SepalLength: - name: SepalLength - type: scalar - default: 4.3 - range: [4.3, 7.9] - - SepalWidth: - name: SepalWidth - type: scalar - default: 2.0 - range: [2.0, 6.9] - - PetalLength: - name: PetalLength - type: scalar - default: 1.0 - range: [1.0, 6.9] - - PetalWidth: - name: PetalWidth - type: scalar - default: 0.1 - range: [0.1, 2.5] - -output_variables: - Species: - name: Species - type: scalar diff --git a/examples/files/iris_model.h5 b/examples/files/iris_model.h5 deleted file mode 100644 index 692593d..0000000 Binary files a/examples/files/iris_model.h5 and /dev/null differ diff --git a/examples/iris_model.py b/examples/iris_model.py deleted file mode 100644 index b2f731c..0000000 --- a/examples/iris_model.py +++ /dev/null @@ -1,9 +0,0 @@ -""" -Adaptation of tensorflow tutorial: https://www.tensorflow.org/tutorials/estimator/premade -""" -from lume_model.utils import model_from_yaml - -with open("examples/files/iris_config.yml", "r") as f: - model = model_from_yaml(f) - -model.random_evaluate() diff --git a/examples/keras_model.ipynb b/examples/keras_model.ipynb new file mode 100644 index 0000000..c9afcf7 --- /dev/null +++ b/examples/keras_model.ipynb @@ -0,0 +1,220 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "235c92cd-cc05-42b8-a516-1185eeac5f0c", + "metadata": {}, + "source": [ + "# Creating a KerasModel\n", + "Base models built in Keras are already supported by LUME-model. We demonstrate how to create and execute a `KerasModel` below." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "56725817-2b21-4bea-98b0-151dea959f77", + "metadata": {}, + "outputs": [], + "source": [ + "import keras\n", + "import numpy as np\n", + "\n", + "from lume_model.models import KerasModel\n", + "from lume_model.variables import ScalarInputVariable, ScalarOutputVariable" + ] + }, + { + "cell_type": "markdown", + "id": "79c62b18-7dc1-44ca-b578-4dea5cc4a4b4", + "metadata": {}, + "source": [ + "## Building a Model from Scratch\n", + "Instantiation of a `KerasModel` requires specification of the base model (`keras.Model` with named inputs) and in-/output variables." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "f96d9863-269c-49d8-9671-cc73a783bcbc", + "metadata": {}, + "outputs": [], + "source": [ + "# exemplary model definition\n", + "inputs = [keras.Input(name=\"input1\", shape=(1,)), keras.Input(name=\"input2\", shape=(1,))]\n", + "outputs = keras.layers.Dense(1, activation=keras.activations.relu)(keras.layers.concatenate(inputs))\n", + "base_model = keras.Model(inputs=inputs, outputs=outputs)\n", + "\n", + "# variable specification\n", + "input_variables = [\n", + " ScalarInputVariable(name=inputs[0].name, default=0.1, value_range=[0.0, 1.0]),\n", + " ScalarInputVariable(name=inputs[1].name, default=0.2, value_range=[0.0, 1.0]),\n", + "]\n", + "output_variables = [\n", + " ScalarOutputVariable(name=\"output\"),\n", + "]\n", + "\n", + "# creation of KerasModel\n", + "example_model = KerasModel(\n", + " model=base_model,\n", + " input_variables=input_variables,\n", + " output_variables=output_variables,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "d22e1cdd-0ea7-4a75-a2ed-47e6a77dac85", + "metadata": {}, + "source": [ + "## Loading a Model from File\n", + "An already created model can be saved to a YAML file by calling the `dump` method. The model can then be loaded by simply passing the file to the constructor." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "b32234ad-adcb-4431-940b-e5377cfa4e4a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "model_class: KerasModel\n", + "input_variables:\n", + " SepalLength:\n", + " variable_type: scalar\n", + " default: 4.3\n", + " is_constant: false\n", + " value_range: [4.3, 7.9]\n", + " SepalWidth:\n", + " variable_type: scalar\n", + " default: 2.0\n", + " is_constant: false\n", + " value_range: [2.0, 6.9]\n", + " PetalLength:\n", + " variable_type: scalar\n", + " default: 1.0\n", + " is_constant: false\n", + " value_range: [1.0, 6.9]\n", + " PetalWidth:\n", + " variable_type: scalar\n", + " default: 0.1\n", + " is_constant: false\n", + " value_range: [0.1, 2.5]\n", + "output_variables:\n", + " Species: {variable_type: scalar}\n", + "model: model.keras\n", + "output_format: array\n", + "output_transforms: [softmax]\n", + "\n" + ] + } + ], + "source": [ + "keras_model = KerasModel(\"../tests/test_files/iris_classification/keras_model.yml\")\n", + "print(keras_model.yaml())" + ] + }, + { + "cell_type": "markdown", + "id": "868fff4d-1f46-48e2-8bd0-c9d831df79e6", + "metadata": {}, + "source": [ + "## Model Execution\n", + "Calling the `evaluate` method allows for model execution on dictionary input." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "97946e64-062d-47d4-8d0c-d7e02a335a56", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'SepalLength': array([7.40696632]),\n", + " 'SepalWidth': array([6.5843979]),\n", + " 'PetalLength': array([1.06113014]),\n", + " 'PetalWidth': array([1.31041352])}" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# generate exemplary input\n", + "input_dict = keras_model.random_input(n_samples=1)\n", + "input_dict" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "50aae4be-0d6e-456f-83e8-3a84d6d78f84", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "1/1 [==============================] - 0s 45ms/step\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2023-11-09 11:53:54.522723: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz\n" + ] + }, + { + "data": { + "text/plain": [ + "{'Species': array(0)}" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# execute KerasModel\n", + "keras_model.evaluate(input_dict)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "55d12bdc-ed38-401d-8bf8-bea92f4456bc", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python [conda env:lume-model-dev]", + "language": "python", + "name": "conda-env-lume-model-dev-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.18" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/examples/torch_model.ipynb b/examples/torch_model.ipynb new file mode 100644 index 
0000000..1de9bdd --- /dev/null +++ b/examples/torch_model.ipynb @@ -0,0 +1,266 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "235c92cd-cc05-42b8-a516-1185eeac5f0c", + "metadata": {}, + "source": [ + "# Creating a TorchModel\n", + "Base models built in PyTorch are already supported by LUME-model. We demonstrate how to create and execute a `TorchModel` below." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "56725817-2b21-4bea-98b0-151dea959f77", + "metadata": {}, + "outputs": [], + "source": [ + "import torch\n", + "\n", + "from lume_model.models import TorchModel, TorchModule\n", + "from lume_model.variables import ScalarInputVariable, ScalarOutputVariable" + ] + }, + { + "cell_type": "markdown", + "id": "79c62b18-7dc1-44ca-b578-4dea5cc4a4b4", + "metadata": {}, + "source": [ + "## Building a Model from Scratch\n", + "Instantiation of a `TorchModel` requires specification of the base model (`torch.nn.Module`) and in-/output variables." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "f96d9863-269c-49d8-9671-cc73a783bcbc", + "metadata": {}, + "outputs": [], + "source": [ + "# exemplary model definition\n", + "base_model = torch.nn.Sequential(\n", + " torch.nn.Linear(2, 1),\n", + ")\n", + "\n", + "# variable specification\n", + "input_variables = [\n", + " ScalarInputVariable(name=\"input1\", default=0.1, value_range=[0.0, 1.0]),\n", + " ScalarInputVariable(name=\"input2\", default=0.2, value_range=[0.0, 1.0]),\n", + "]\n", + "output_variables = [\n", + " ScalarOutputVariable(name=\"output\"),\n", + "]\n", + "\n", + "# creation of TorchModel\n", + "example_model = TorchModel(\n", + " model=base_model,\n", + " input_variables=input_variables,\n", + " output_variables=output_variables,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "d22e1cdd-0ea7-4a75-a2ed-47e6a77dac85", + "metadata": {}, + "source": [ + "## Loading a Model from File\n", + "An already created model can be saved to a YAML file by calling the 
`dump` method. The model can then be loaded by simply passing the file to the constructor." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "b32234ad-adcb-4431-940b-e5377cfa4e4a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "model_class: TorchModel\n", + "input_variables:\n", + " MedInc:\n", + " variable_type: scalar\n", + " default: 3.7857346534729004\n", + " is_constant: false\n", + " value_range: [0.4999000132083893, 15.000100135803223]\n", + " HouseAge:\n", + " variable_type: scalar\n", + " default: 29.282135009765625\n", + " is_constant: false\n", + " value_range: [1.0, 52.0]\n", + " AveRooms:\n", + " variable_type: scalar\n", + " default: 5.4074907302856445\n", + " is_constant: false\n", + " value_range: [0.8461538553237915, 141.90908813476562]\n", + " AveBedrms:\n", + " variable_type: scalar\n", + " default: 1.1071722507476807\n", + " is_constant: false\n", + " value_range: [0.375, 34.06666564941406]\n", + " Population:\n", + " variable_type: scalar\n", + " default: 1437.0687255859375\n", + " is_constant: false\n", + " value_range: [3.0, 28566.0]\n", + " AveOccup:\n", + " variable_type: scalar\n", + " default: 3.035413980484009\n", + " is_constant: false\n", + " value_range: [0.692307710647583, 599.7142944335938]\n", + " Latitude:\n", + " variable_type: scalar\n", + " default: 35.28323745727539\n", + " is_constant: false\n", + " value_range: [32.65999984741211, 41.95000076293945]\n", + " Longitude:\n", + " variable_type: scalar\n", + " default: -119.11573028564453\n", + " is_constant: false\n", + " value_range: [-124.3499984741211, -114.30999755859375]\n", + "output_variables:\n", + " MedHouseVal: {variable_type: scalar}\n", + "model: model.pt\n", + "input_transformers: [input_transformers_0.pt]\n", + "output_transformers: [output_transformers_0.pt]\n", + "output_format: tensor\n", + "device: cpu\n", + "fixed_model: true\n", + "\n" + ] + } + ], + "source": [ + "torch_model = 
TorchModel(\"../tests/test_files/california_regression/torch_model.yml\")\n", + "print(torch_model.yaml())" + ] + }, + { + "cell_type": "markdown", + "id": "868fff4d-1f46-48e2-8bd0-c9d831df79e6", + "metadata": {}, + "source": [ + "## Model Execution and TorchModule\n", + "Calling the `evaluate` method allows for model execution on dictionary input. Additionally, instances of `TorchModel` can also be wrapped in a `TorchModule` which is a subclass of `torch.nn.Module`. This allows for seamless integration with `PyTorch` based packages like [BoTorch](https://botorch.org/) and [Xopt](https://christophermayes.github.io/Xopt/)." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "97946e64-062d-47d4-8d0c-d7e02a335a56", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'MedInc': tensor([11.2651]),\n", + " 'HouseAge': tensor([44.3406]),\n", + " 'AveRooms': tensor([130.5891]),\n", + " 'AveBedrms': tensor([19.3163]),\n", + " 'Population': tensor([11930.1680]),\n", + " 'AveOccup': tensor([212.5965]),\n", + " 'Latitude': tensor([37.1786]),\n", + " 'Longitude': tensor([-114.7374])}" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# generate exemplary input\n", + "input_dict = torch_model.random_input(n_samples=1)\n", + "input_dict" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "50aae4be-0d6e-456f-83e8-3a84d6d78f84", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'MedHouseVal': tensor(-2.4484, dtype=torch.float64)}" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# execute TorchModel\n", + "torch_model.evaluate(input_dict)" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "59e132a2-6e6d-41a6-9912-c26d151d4821", + "metadata": {}, + "outputs": [], + "source": [ + "# wrap in TorchModule\n", + "torch_module = TorchModule(model=torch_model)" + ] + }, + { + 
"cell_type": "code", + "execution_count": 7, + "id": "eb18b86d-8371-441c-a4c2-8d6e124a57d5", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "tensor(-2.4484, dtype=torch.float64)" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# execute TorchModule\n", + "input_tensor = torch.tensor([input_dict[k] for k in torch_module.input_order]).unsqueeze(0)\n", + "torch_module(input_tensor)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "55d12bdc-ed38-401d-8bf8-bea92f4456bc", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python [conda env:lume-model-dev]", + "language": "python", + "name": "conda-env-lume-model-dev-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.18" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/lume_model/base.py b/lume_model/base.py index 0aa2027..3f3338a 100644 --- a/lume_model/base.py +++ b/lume_model/base.py @@ -1,11 +1,12 @@ import os import json -import yaml import logging from abc import ABC, abstractmethod -from typing import Any, Callable, Union, TextIO +from typing import Any, Callable, Union from types import FunctionType, MethodType +from io import TextIOWrapper +import yaml import numpy as np from pydantic import BaseModel, ConfigDict, field_validator, SerializeAsAny @@ -19,6 +20,7 @@ serialize_variables, deserialize_variables, variables_from_dict, + replace_relative_paths, ) logger = logging.getLogger(__name__) @@ -55,14 +57,15 @@ def process_torch_module( Filename under which the torch module is (or would be) saved. 
""" torch = try_import_module("torch") - prefixes = [ele for ele in [file_prefix, base_key] if not ele == ""] - if not prefixes: - module_name = "{}.pt".format(key) - else: - module_name = "{}.pt".format("_".join((*prefixes, key))) + filepath_prefix, filename_prefix = os.path.split(file_prefix) + prefixes = [ele for ele in [filename_prefix, base_key] if not ele == ""] + filename = "{}.pt".format(key) + if prefixes: + filename = "_".join((*prefixes, filename)) + filepath = os.path.join(filepath_prefix, filename) if save_modules: - torch.save(module, module_name) - return module_name + torch.save(module, filepath) + return filename def process_keras_model( @@ -135,7 +138,6 @@ def recursive_serialize( for _type, func in JSON_ENCODERS.items(): if isinstance(value, _type): v[key] = func(value) - # check to make sure object has been serialized, if not use a generic serializer try: json.dumps(v[key]) @@ -199,24 +201,36 @@ def json_loads(v): return v -def parse_config(config: Union[dict, str]) -> dict: +def parse_config( + config: Union[dict, str, TextIOWrapper, os.PathLike], + model_fields: dict = None, +) -> dict: """Parses model configuration and returns keyword arguments for model constructor. Args: - config: Model configuration as dictionary, YAML or JSON formatted string or file path. + config: Model configuration as dictionary, YAML or JSON formatted string, file or file path. + model_fields: Fields expected by the model (required for replacing relative paths). Returns: Configuration as keyword arguments for model constructor. 
""" - if isinstance(config, str): - if os.path.exists(config): + config_file = None + if isinstance(config, dict): + d = config + else: + if isinstance(config, TextIOWrapper): + yaml_str = config.read() + config_file = os.path.abspath(config.name) + elif isinstance(config, (str, os.PathLike)) and os.path.exists(config): with open(config) as f: yaml_str = f.read() + config_file = os.path.abspath(config) else: yaml_str = config d = recursive_deserialize(yaml.safe_load(yaml_str)) - else: - d = config + if config_file is not None: + config_dir = os.path.dirname(os.path.realpath(config_file)) + d = replace_relative_paths(d, model_fields, config_dir) return model_kwargs_from_dict(d) @@ -233,7 +247,7 @@ def model_kwargs_from_dict(config: dict) -> dict: if all(key in config.keys() for key in ["input_variables", "output_variables"]): config["input_variables"], config["output_variables"] = variables_from_dict( config) - _ = config.pop("model_class", None) + config.pop("model_class", None) return config @@ -297,7 +311,7 @@ def __init__(self, *args, **kwargs): if len(args) == 1: if len(kwargs) > 0: raise ValueError("Cannot specify YAML string and keyword arguments for LUMEBaseModel init.") - super().__init__(**parse_config(args[0])) + super().__init__(**parse_config(args[0], self.model_fields)) elif len(args) > 1: raise ValueError( "Arguments to LUMEBaseModel must be either a single YAML string " @@ -334,7 +348,6 @@ def json(self, **kwargs) -> str: result = self.to_json(**kwargs) config = json.loads(result) config = {"model_class": self.__class__.__name__} | config - return json.dumps(config) def yaml( @@ -377,7 +390,7 @@ def dump( base_key: Base key for serialization. save_models: Determines whether models are saved to file. 
""" - file_prefix = os.path.splitext(file)[0] + file_prefix = os.path.splitext(os.path.abspath(file))[0] with open(file, "w") as f: f.write( self.yaml( @@ -390,13 +403,10 @@ def dump( @classmethod def from_file(cls, filename: str): if not os.path.exists(filename): - raise OSError(f"file {filename} is not found") - + raise OSError(f"File {filename} is not found.") with open(filename, "r") as file: return cls.from_yaml(file) @classmethod - def from_yaml(cls, yaml_obj: [str, TextIO]): - return cls.model_validate(yaml.safe_load(yaml_obj)) - - + def from_yaml(cls, yaml_obj: [str, TextIOWrapper]): + return cls.model_validate(parse_config(yaml_obj, cls.model_fields)) diff --git a/lume_model/models/__init__.py b/lume_model/models/__init__.py index cce0457..4e2664e 100644 --- a/lume_model/models/__init__.py +++ b/lume_model/models/__init__.py @@ -1,6 +1,6 @@ import os import yaml -from typing import Any, Union +from typing import Union registered_models = [] @@ -35,19 +35,6 @@ def get_model(name: str): return model_lookup[name] -def model_from_dict(config: dict[str, Any]): - """Creates LUME model from the given configuration dictionary. - - Args: - config: Model configuration dictionary. - - Returns: - Created LUME model. - """ - model_class = get_model(config["model_class"]) - return model_class(config) - - def model_from_yaml(yaml_str: Union[str, os.PathLike]): """Creates LUME model from the given YAML formatted string or file path. 
@@ -59,6 +46,8 @@ def model_from_yaml(yaml_str: Union[str, os.PathLike]): """ if os.path.exists(yaml_str): with open(yaml_str) as f: - yaml_str = f.read() - config = yaml.safe_load(yaml_str) - return model_from_dict(config) + config = yaml.safe_load(f.read()) + else: + config = yaml.safe_load(yaml_str) + model_class = get_model(config["model_class"]) + return model_class(yaml_str) diff --git a/lume_model/models/keras_model.py b/lume_model/models/keras_model.py index 6492b25..f46f375 100644 --- a/lume_model/models/keras_model.py +++ b/lume_model/models/keras_model.py @@ -51,7 +51,7 @@ def validate_keras_model(cls, v): if os.path.exists(v): v = keras.models.load_model(v) else: - raise ValueError(f"Path {v} does not exist!") + raise OSError(f"File {v} is not found.") return v @field_validator("output_format") diff --git a/lume_model/models/torch_model.py b/lume_model/models/torch_model.py index cc65b83..9dabf46 100644 --- a/lume_model/models/torch_model.py +++ b/lume_model/models/torch_model.py @@ -12,8 +12,6 @@ InputVariable, OutputVariable, ScalarInputVariable, - # ScalarOutputVariable, - # ImageOutputVariable, ) logger = logging.getLogger(__name__) @@ -32,9 +30,9 @@ class TorchModel(LUMEBaseModel): input_transformers: List of transformer objects to apply to input before passing to model. output_transformers: List of transformer objects to apply to output of model. output_format: Determines format of outputs: "tensor", "variable" or "raw". + device: Device on which the model will be evaluated. Defaults to "cpu". fixed_model: If true, the model and transformers are put in evaluation mode and all gradient computation is deactivated. - device: Device on which the model will be evaluated. Defaults to "cpu". 
""" model: torch.nn.Module input_transformers: list[ReversibleInputTransform] = [] @@ -83,21 +81,22 @@ def validate_torch_model(cls, v): if os.path.exists(v): v = torch.load(v) else: - raise ValueError(f"Path {v} does not exist!") + raise OSError(f"File {v} is not found.") return v @field_validator("input_transformers", "output_transformers", mode="before") def validate_botorch_transformers(cls, v): if not isinstance(v, list): raise ValueError("Transformers must be passed as list.") - else: - loaded_transformers = [] - for t in v: - if isinstance(t, (str, os.PathLike)): - if os.path.exists(t): - t = torch.load(t) - loaded_transformers.append(t) - v = loaded_transformers + loaded_transformers = [] + for t in v: + if isinstance(t, (str, os.PathLike)): + if os.path.exists(t): + t = torch.load(t) + else: + raise OSError(f"File {t} is not found.") + loaded_transformers.append(t) + v = loaded_transformers return v @field_validator("output_format") diff --git a/lume_model/models/torch_module.py b/lume_model/models/torch_module.py index aea71e4..507f9ef 100644 --- a/lume_model/models/torch_module.py +++ b/lume_model/models/torch_module.py @@ -42,7 +42,8 @@ def __init__( if len(args) == 1: if not all(v is None for v in [model, input_order, output_order]): raise ValueError("Cannot specify YAML string and keyword arguments for TorchModule init.") - kwargs = parse_config(args[0]) + model_fields = {f"model.{k}": v for k, v in TorchModel.model_fields.items()} + kwargs = parse_config(args[0], model_fields) kwargs["model"] = TorchModel(kwargs["model"]) self.__init__(**kwargs) elif len(args) > 1: diff --git a/lume_model/utils.py b/lume_model/utils.py index 2ec2a87..4d7f34a 100644 --- a/lume_model/utils.py +++ b/lume_model/utils.py @@ -2,7 +2,7 @@ import sys import yaml import importlib -from typing import Union +from typing import Union, get_origin, get_args from lume_model.variables import ( InputVariable, @@ -154,3 +154,71 @@ def variables_from_yaml(yaml_obj: Union[str, 
os.PathLike]) -> tuple[list[InputVa yaml_str = yaml_obj config = deserialize_variables(yaml.safe_load(yaml_str)) return variables_from_dict(config) + + +def get_valid_path( + path: Union[str, os.PathLike], + directory: Union[str, os.PathLike] = "", +) -> Union[str, os.PathLike]: + """Validates path exists either as relative or absolute path and returns the first valid option. + + Args: + path: Path to validate. + directory: Directory against which relative paths are checked. + + Returns: + The first valid path option as an absolute path. + """ + relative_path = os.path.join(directory, path) + if os.path.exists(relative_path): + return os.path.abspath(relative_path) + elif os.path.exists(path): + return os.path.abspath(path) + else: + raise OSError(f"File {path} is not found.") + + +def replace_relative_paths( + d: dict, + model_fields: dict = None, + directory: Union[str, os.PathLike] = "", +) -> dict: + """Replaces dictionary entries with absolute paths where the model field annotation is not string or path-like. + + Args: + d: Dictionary to process. + model_fields: Model fields dictionary used to check expected type. + directory: Directory against which relative paths are checked. + + Returns: + Dictionary with replaced paths. 
+ """ + if model_fields is None: + model_fields = {} + for k, v in d.items(): + if isinstance(v, (str, os.PathLike)): + if k in model_fields.keys(): + field_types = [model_fields[k].annotation] + if get_origin(model_fields[k].annotation) is Union: + field_types = list(get_args(model_fields[k].annotation)) + if all([t not in field_types for t in [str, os.PathLike]]): + d[k] = get_valid_path(v, directory) + elif isinstance(v, list): + if k in model_fields.keys(): + field_types = [] + for i, field_type in enumerate(get_args(model_fields[k].annotation)): + if get_origin(field_type) is Union: + field_types.extend(list(get_args(field_type))) + else: + field_types.append(field_type) + for i, ele in enumerate(v): + if (isinstance(ele, (str, os.PathLike)) and + all([t not in field_types for t in [str, os.PathLike]])): + v[i] = get_valid_path(ele, directory) + elif isinstance(v, dict): + model_subfields = { + ".".join(key.split(".")[1:]): value + for key, value in model_fields.items() if key.startswith(f"{k}.") + } + d[k] = replace_relative_paths(v, model_subfields, directory) + return d diff --git a/lume_model/variables.py b/lume_model/variables.py index 2814b81..129190e 100644 --- a/lume_model/variables.py +++ b/lume_model/variables.py @@ -1,218 +1,44 @@ """ -This module contains definitions of lume-model variables for use with lume tools. +This module contains definitions of LUME-model variables for use with lume tools. The variables are divided into input and outputs, each with different minimal requirements. Initiating any variable without the minimum requirements will result in an error. -Two types of variables are currently defined: Scalar and Image. Scalar variables hold -float type values. Image variables hold numpy array representations of images. +For now, only scalar variables (floats) are supported. 
""" - -import numpy as np import logging -from typing import Any, List, Union, Optional, Generic, TypeVar, Literal -from pydantic import BaseModel, Field, validator, ConfigDict +from typing import Optional, Generic, TypeVar +from pydantic import BaseModel, Field logger = logging.getLogger(__name__) -class NumpyNDArray(np.ndarray): - """ - Custom type validator for numpy ndarray. - """ - - @classmethod - def __get_validators__(cls): - yield cls.validate - - @classmethod - def validate(cls, v: Any) -> np.ndarray: - # validate data... - - if isinstance(v, list): - # conver to array, keep order - v = np.ndarray(v, order="K") - - if not isinstance(v, np.ndarray): - logger.exception("A numpy array is required for the value") - raise TypeError("Numpy array required") - return v - - class Config: - json_encoders = { - np.ndarray: lambda v: v.tolist(), # may lose some precision - } - - -class Image(np.ndarray): - """ - Custom type validator for image array. - - """ - - @classmethod - def __get_validators__(cls): - yield cls.validate - - @classmethod - def validate(cls, v: Any) -> np.ndarray: - # validate data... - if not isinstance(v, np.ndarray): - logger.exception("Image variable value must be a numpy array") - raise TypeError("Numpy array required") - - if (not v.ndim == 2 and not v.ndim == 3) or (v.ndim == 3 and v.shape[2] != 3): - logger.exception("Array must have dim=2 or dim=3 to instantiate image") - raise ValueError( - f"Image array must have dim=2 or dim=3. Provided array has {v.ndim} dimensions" - ) - - return v - - -class NDVariableBase: - """ - Holds properties associated with numpy array variables. 
- - Attributes: - shape (tuple): Shape of the numpy n-dimensional array - """ - - @property - def shape(self) -> tuple: - if self.default is not None: - return self.default.shape - else: - return None - - # define generic value type Value = TypeVar("Value") class Variable(BaseModel, Generic[Value]): """ - Minimum requirements for a Variable + Minimum requirements for a variable. Attributes: - name (str): Name of the variable. - - value (Optional[Value]): Value assigned to the variable - - precision (Optional[int]): Precision to use for the value - + name: Name of the variable. + value: Value assigned to the variable. + precision: Precision to use for the value. """ - name: str value: Optional[Value] = None precision: Optional[int] = None -class InputVariable(Variable, Generic[Value]): - """ - Base class for input variables. - - Attributes: - name (str): Name of the variable. - - default (Value): Default value assigned to the variable - - precision (Optional[int]): Precision to use for the value - - value (Optional[Value]): Value assigned to variable - - value_range (list): Acceptable range for value - - """ - - default: Value # required default - is_constant: bool = Field(False) - - -class OutputVariable(Variable, Generic[Value]): - """ - Base class for output variables. Value and range assignment are optional. - - Attributes: - name (str): Name of the variable. - - default (Optional[Value]): Default value assigned to the variable. - - precision (Optional[int]): Precision to use for the value. - - value (Optional[Value]): Value assigned to variable - - value_range (Optional[list]): Acceptable range for value - - """ - - default: Optional[Value] = None - value_range: Optional[list] = Field(None, alias="range") - - -class ImageVariable(BaseModel, NDVariableBase): - """ - Base class used for constructing an image variable. - - Attributes: - variable_type (str): Indicates image variable. - - axis_labels (List[str]): Labels to use for rendering axes. 
- - axis_units (Optional[List[str]]): Units to use for rendering axes labels. - - x_min_variable (Optional[str]): Scalar variable associated with image minimum x. - - x_max_variable (Optional[str]): Scalar variable associated with image maximum x. - - y_min_variable (Optional[str]): Scalar variable associated with image minimum y. - - y_max_variable (Optional[str]): Scalar variable associated with image maximum y. - """ - - variable_type: str = "image" - axis_labels: List[str] - axis_units: Optional[List[str]] - x_min_variable: Optional[str] - x_max_variable: Optional[str] - y_min_variable: Optional[str] - y_max_variable: Optional[str] - - -class ArrayVariable(BaseModel, NDVariableBase): - """ - Base class used for constructing an array variable. Array variables can capture - strings by passing `variable_type="string"` during initialization. Otherwise, the - value will default to an array of floats. - - Attributes: - variable_type (str): Indicates array variable. - - dim_labels (Optional[List[str]]): Labels to use for rendering axes. - - units (Optional[List[str]]): Units to use for rendering axes labels. - - value_type (Literal["float", "string"]): Type of value held by array. - - """ - - variable_type: str = "array" - units: Optional[List[str]] # required for some output displays - dim_labels: Optional[List[str]] - value_type: Literal["float", "string"] = "float" - - class ScalarVariable(BaseModel): """ Base class used for constructing a scalar variable. Attributes: - variable_type (tuple): Indicates scalar variable. - units (Optional[str]): Units associated with scalar value. - parent_variable (Optional[str]): Variable for which this is an attribute. - value_range (list): Acceptable range for value - + variable_type: Indicates scalar variable. + units: Units associated with scalar value. + parent_variable: Variable for which this is an attribute. 
""" - variable_type: str = "scalar" units: Optional[str] = None # required for some output displays parent_variable: str = ( @@ -220,289 +46,415 @@ class ScalarVariable(BaseModel): ) -class ImageInputVariable(InputVariable[Image], ImageVariable): +class InputVariable(Variable, Generic[Value]): """ - Variable used for representing an image input. Image variable values must be two or - three dimensional arrays (grayscale, color, respectively). Initialization requires - name, axis_labels, default, x_min, x_max, y_min, y_max. + Base class for input variables. Attributes: - - name (str): Name of the variable. - default (Value): Default value assigned to the variable. - precision (Optional[int]): Precision to use for the value. - value (Optional[Value]): Value assigned to variable - value_range (list): Acceptable range for value - variable_type (str): Indicates image variable. - axis_labels (List[str]): Labels to use for rendering axes. - axis_units (Optional[List[str]]): Units to use for rendering axes labels. - x_min (float): Minimum x value of image. - x_max (float): Maximum x value of image. - y_min (float): Minimum y value of image. - y_max (float): Maximum y value of image. - x_min_variable (Optional[str]): Scalar variable associated with image minimum x. - x_max_variable (Optional[str]): Scalar variable associated with image maximum x. - y_min_variable (Optional[str]): Scalar variable associated with image minimum y. - y_max_variable (Optional[str]): Scalar variable associated with image maximum y. - - - Example: - ``` - variable = ImageInputVariable( - name="test", - default=np.array([[1,4], [5,2]]), - value_range=[1, 10], - axis_labels=["count_1", "count_2"], - x_min=0, - y_min=0, - x_max=5, - y_max=5, - ) - ``` - + default: Default value assigned to the variable. + is_constant: Indicates whether the variable is constant. 
""" - - x_min: float - x_max: float - y_min: float - y_max: float + default: Value # required default + is_constant: bool = Field(False) -class ImageOutputVariable(OutputVariable[Image], ImageVariable): +class OutputVariable(Variable, Generic[Value]): """ - Variable used for representing an image output. Image variable values must be two or - three dimensional arrays (grayscale, color, respectively). Initialization requires - name and axis_labels. + Base class for output variables. Value and range assignment are optional. Attributes: - name (str): Name of the variable. - default (Optional[Value]): Default value assigned to the variable. - precision (Optional[int]): Precision to use for the value. - value (Optional[Value]): Value assigned to variable - value_range (Optional[list]): Acceptable range for value - variable_type (str): Indicates image variable. - axis_labels (List[str]): Labels to use for rendering axes. - axis_units (Optional[List[str]]): Units to use for rendering axes labels. - x_min (Optional[float]): Minimum x value of image. - x_max (Optional[float]): Maximum x value of image. - y_min (Optional[float]): Minimum y value of image. - y_max (Optional[float]): Maximum y value of image. - x_min_variable (Optional[str]): Scalar variable associated with image minimum x. - x_max_variable (Optional[str]): Scalar variable associated with image maximum x. - y_min_variable (Optional[str]): Scalar variable associated with image minimum y. - y_max_variable (Optional[str]): Scalar variable associated with image maximum y. - - Example: - ``` - variable = ImageOutputVariable( - name="test", - default=np.array([[2 , 1], [1, 4]]), - axis_labels=["count_1", "count_2"], - ) - - ``` - - + default: Default value assigned to the variable. + value_range: Acceptable range for value. 
""" - - x_min: Optional[float] = None - x_max: Optional[float] = None - y_min: Optional[float] = None - y_max: Optional[float] = None + default: Optional[Value] = None + value_range: Optional[list] = Field(None, alias="range") class ScalarInputVariable(InputVariable[float], ScalarVariable): """ - Variable used for representing an scalar input. Scalar variables hold float values. + Variable used for representing a scalar input. Scalar variables hold float values. Initialization requires name, default, and value_range. Attributes: - name (str): Name of the variable. - default (Value): Default value assigned to the variable - precision (Optional[int]): Precision to use for the value - value (Optional[Value]): Value assigned to variable - value_range (list): Acceptable range for value - variable_type (str): Indicates scalar variable. - units (Optional[str]): Units associated with scalar value. - parent_variable (Optional[str]): Variable for which this is an attribute. + value_range: Acceptable range for value. Example: ``` - variable = ScalarInputVariable(name="test", default=0.1, value_range=[1, 2]) - + variable = ScalarInputVariable(name="example_input", default=0.1, value_range=[0.0, 1.0]) ``` """ - value_range: list[float] class ScalarOutputVariable(OutputVariable[float], ScalarVariable): """ - Variable used for representing an scalar output. Scalar variables hold float values. + Variable used for representing a scalar output. Scalar variables hold float values. Initialization requires name. - Attributes: - name (str): Name of the variable. - default (Optional[Value]): Default value assigned to the variable. - precision (Optional[int]): Precision to use for the value. - value (Optional[Value]): Value assigned to variable. - value_range (Optional[list]): Acceptable range for value. - variable_type (str): Indicates scalar variable. - units (Optional[str]): Units associated with scalar value. - parent_variable (Optional[str]): Variable for which this is an attribute. 
- Example: ``` - variable = ScalarOutputVariable(name="test", default=0.1, value_range=[1, 2]) + variable = ScalarOutputVariable(name="example_output") ``` - - """ - pass - - -class ArrayInputVariable(InputVariable[NumpyNDArray], ArrayVariable): - """ - Variable used for representing an array input. Array variables can capture - strings by passing `variable_type="string"` during initialization. Otherwise, the - value will default to an array of floats. - - Attributes: - name (str): Name of the variable. - default (np.ndarray): Default value assigned to the variable. - precision (Optional[int]): Precision to use for the value. - value (Optional[Value]): Value assigned to variable - value_range (Optional[list]): Acceptable range for value - variable_type (str): Indicates array variable. - dim_labels (List[str]): Labels to use for dimensions - dim_units (Optional[List[str]]): Units to use for dimensions. - - """ - - pass - - -class ArrayOutputVariable(OutputVariable[NumpyNDArray], ArrayVariable): - """ - Variable used for representing an array output. Array variables can capture - strings by passing `variable_type="string"` during initialization. Otherwise, the - value will default to an array of floats. - - Attributes: - name (str): Name of the variable. - - default (Optional[np.ndarray]): Default value assigned to the variable. - - precision (Optional[int]): Precision to use for the value. - - value (Optional[Value]): Value assigned to variable - - value_range (Optional[list]): Acceptable range for value - - variable_type (str): Indicates array variable. - - dim_labels (List[str]): Labels to use for dimensions - - dim_units (Optional[List[str]]): Units to use for dimensions. """ - pass -class TableVariable(BaseModel): - """Table variables are used for creating tabular representations of data. Table variables should only be used for client tools. - - Attributes: - table_rows (Optional[List[str]]): List of rows to assign to array data. 
- table_data (dict): Dictionary representation of columns and rows. - rows (list): List of rows. - columns (list): List of columns. - """ - - table_rows: Optional[List[str]] = None - table_data: dict - - @property - def columns(self) -> tuple: - if self.table_data is not None: - return list(self.table_data.keys()) - else: - return None - - @validator("table_rows") - def validate_rows(cls, v): - if isinstance(v, list): - for val in v: - if not isinstance(val, str): - raise TypeError("Rows must be defined as strings") - - else: - raise TypeError("Rows must be passed as list") - - return v - - @validator("table_data") - def table_data_formatted(cls, v, values) -> dict: - passed_rows = values.get("table_rows", None) - # validate data... - if not isinstance(v, dict): - logger.exception( - "Must provide dictionary representation of table structure, outer level columns, inner level rows." - ) - raise TypeError("Dictionary required") - - # check that rows are represented in structure - for val in v.values(): - if not isinstance(val, (dict, ArrayVariable)): - logger.exception( - "Rows are not represented in structure. Structure should map column title to either dictionary of row names and values or array variables." - ) - raise TypeError( - "Rows are not represented in structure. Structure should map column title to either dictionary of row names and values or array variables." - ) - - if isinstance(val, ArrayVariable): - if passed_rows is None: - logger.exception("Must pass table_rows when using array variables.") - raise TypeError("Must pass table_rows when using array variables.") - - # shape must match length of passed rows - elif val.shape[0] != len(passed_rows): - raise TypeError( - "Array first dimension must match passed rows length." 
- ) - - # check row structures to make sure properly formatted - for val in v.values(): - - # check row dictionary - if isinstance(val, dict): - if val.get("variable_type", None) is None: - for row_val in val.values(): - if not isinstance(row_val, (dict, ScalarVariable, float)): - logger.exception( - "Row dictionary must map row names to ScalarVariables or float." - ) - raise TypeError( - "Row dictionary must map row names to ScalarVariables or float." - ) - - # check that row keys align - if isinstance(row_val, dict) and passed_rows is not None: - row_rep = row_val.keys() - for row in row_rep: - if row not in passed_rows: - raise TypeError( - f"Row {row} not in row list passed during construction." - ) - return v - - @property - def rows(self) -> tuple: - if self.table_rows is not None: - return self.table_rows - else: - struct_rows = [] - for col, row_item in self.table_data.items(): - if isinstance(row_item, dict): - struct_rows += list(row_item.keys()) - return list(set(struct_rows)) +# class NumpyNDArray(np.ndarray): +# """ +# Custom type validator for numpy ndarray. +# """ +# +# @classmethod +# def __get_validators__(cls): +# yield cls.validate +# +# @classmethod +# def validate(cls, v: Any) -> np.ndarray: +# # validate data... +# +# if isinstance(v, list): +# # conver to array, keep order +# v = np.ndarray(v, order="K") +# +# if not isinstance(v, np.ndarray): +# logger.exception("A numpy array is required for the value") +# raise TypeError("Numpy array required") +# return v +# +# class Config: +# json_encoders = { +# np.ndarray: lambda v: v.tolist(), # may lose some precision +# } + + +# class Image(np.ndarray): +# """ +# Custom type validator for image array. +# +# """ +# +# @classmethod +# def __get_validators__(cls): +# yield cls.validate +# +# @classmethod +# def validate(cls, v: Any) -> np.ndarray: +# # validate data... 
+# if not isinstance(v, np.ndarray): +# logger.exception("Image variable value must be a numpy array") +# raise TypeError("Numpy array required") +# +# if (not v.ndim == 2 and not v.ndim == 3) or (v.ndim == 3 and v.shape[2] != 3): +# logger.exception("Array must have dim=2 or dim=3 to instantiate image") +# raise ValueError( +# f"Image array must have dim=2 or dim=3. Provided array has {v.ndim} dimensions" +# ) +# +# return v + + +# class NDVariableBase: +# """ +# Holds properties associated with numpy array variables. +# +# Attributes: +# shape (tuple): Shape of the numpy n-dimensional array +# """ +# +# @property +# def shape(self) -> tuple: +# if self.default is not None: +# return self.default.shape +# else: +# return None + + +# class ImageVariable(BaseModel, NDVariableBase): +# """ +# Base class used for constructing an image variable. +# +# Attributes: +# variable_type (str): Indicates image variable. +# +# axis_labels (List[str]): Labels to use for rendering axes. +# +# axis_units (Optional[List[str]]): Units to use for rendering axes labels. +# +# x_min_variable (Optional[str]): Scalar variable associated with image minimum x. +# +# x_max_variable (Optional[str]): Scalar variable associated with image maximum x. +# +# y_min_variable (Optional[str]): Scalar variable associated with image minimum y. +# +# y_max_variable (Optional[str]): Scalar variable associated with image maximum y. +# """ +# +# variable_type: str = "image" +# axis_labels: List[str] +# axis_units: Optional[List[str]] +# x_min_variable: Optional[str] +# x_max_variable: Optional[str] +# y_min_variable: Optional[str] +# y_max_variable: Optional[str] + + +# class ArrayVariable(BaseModel, NDVariableBase): +# """ +# Base class used for constructing an array variable. Array variables can capture +# strings by passing `variable_type="string"` during initialization. Otherwise, the +# value will default to an array of floats. +# +# Attributes: +# variable_type (str): Indicates array variable. 
+# +# dim_labels (Optional[List[str]]): Labels to use for rendering axes. +# +# units (Optional[List[str]]): Units to use for rendering axes labels. +# +# value_type (Literal["float", "string"]): Type of value held by array. +# +# """ +# +# variable_type: str = "array" +# units: Optional[List[str]] # required for some output displays +# dim_labels: Optional[List[str]] +# value_type: Literal["float", "string"] = "float" + + +# class ImageInputVariable(InputVariable[Image], ImageVariable): +# """ +# Variable used for representing an image input. Image variable values must be two or +# three dimensional arrays (grayscale, color, respectively). Initialization requires +# name, axis_labels, default, x_min, x_max, y_min, y_max. +# +# Attributes: +# +# name (str): Name of the variable. +# default (Value): Default value assigned to the variable. +# precision (Optional[int]): Precision to use for the value. +# value (Optional[Value]): Value assigned to variable +# value_range (list): Acceptable range for value +# variable_type (str): Indicates image variable. +# axis_labels (List[str]): Labels to use for rendering axes. +# axis_units (Optional[List[str]]): Units to use for rendering axes labels. +# x_min (float): Minimum x value of image. +# x_max (float): Maximum x value of image. +# y_min (float): Minimum y value of image. +# y_max (float): Maximum y value of image. +# x_min_variable (Optional[str]): Scalar variable associated with image minimum x. +# x_max_variable (Optional[str]): Scalar variable associated with image maximum x. +# y_min_variable (Optional[str]): Scalar variable associated with image minimum y. +# y_max_variable (Optional[str]): Scalar variable associated with image maximum y. 
+# +# +# Example: +# ``` +# variable = ImageInputVariable( +# name="test", +# default=np.array([[1,4], [5,2]]), +# value_range=[1, 10], +# axis_labels=["count_1", "count_2"], +# x_min=0, +# y_min=0, +# x_max=5, +# y_max=5, +# ) +# ``` +# +# """ +# +# x_min: float +# x_max: float +# y_min: float +# y_max: float + + +# class ImageOutputVariable(OutputVariable[Image], ImageVariable): +# """ +# Variable used for representing an image output. Image variable values must be two or +# three dimensional arrays (grayscale, color, respectively). Initialization requires +# name and axis_labels. +# +# Attributes: +# name (str): Name of the variable. +# default (Optional[Value]): Default value assigned to the variable. +# precision (Optional[int]): Precision to use for the value. +# value (Optional[Value]): Value assigned to variable +# value_range (Optional[list]): Acceptable range for value +# variable_type (str): Indicates image variable. +# axis_labels (List[str]): Labels to use for rendering axes. +# axis_units (Optional[List[str]]): Units to use for rendering axes labels. +# x_min (Optional[float]): Minimum x value of image. +# x_max (Optional[float]): Maximum x value of image. +# y_min (Optional[float]): Minimum y value of image. +# y_max (Optional[float]): Maximum y value of image. +# x_min_variable (Optional[str]): Scalar variable associated with image minimum x. +# x_max_variable (Optional[str]): Scalar variable associated with image maximum x. +# y_min_variable (Optional[str]): Scalar variable associated with image minimum y. +# y_max_variable (Optional[str]): Scalar variable associated with image maximum y. 
+# +# Example: +# ``` +# variable = ImageOutputVariable( +# name="test", +# default=np.array([[2 , 1], [1, 4]]), +# axis_labels=["count_1", "count_2"], +# ) +# +# ``` +# +# +# """ +# +# x_min: Optional[float] = None +# x_max: Optional[float] = None +# y_min: Optional[float] = None +# y_max: Optional[float] = None + + +# class ArrayInputVariable(InputVariable[NumpyNDArray], ArrayVariable): +# """ +# Variable used for representing an array input. Array variables can capture +# strings by passing `variable_type="string"` during initialization. Otherwise, the +# value will default to an array of floats. +# +# Attributes: +# name (str): Name of the variable. +# default (np.ndarray): Default value assigned to the variable. +# precision (Optional[int]): Precision to use for the value. +# value (Optional[Value]): Value assigned to variable +# value_range (Optional[list]): Acceptable range for value +# variable_type (str): Indicates array variable. +# dim_labels (List[str]): Labels to use for dimensions +# dim_units (Optional[List[str]]): Units to use for dimensions. +# +# """ +# +# pass + + +# class ArrayOutputVariable(OutputVariable[NumpyNDArray], ArrayVariable): +# """ +# Variable used for representing an array output. Array variables can capture +# strings by passing `variable_type="string"` during initialization. Otherwise, the +# value will default to an array of floats. +# +# Attributes: +# name (str): Name of the variable. +# +# default (Optional[np.ndarray]): Default value assigned to the variable. +# +# precision (Optional[int]): Precision to use for the value. +# +# value (Optional[Value]): Value assigned to variable +# +# value_range (Optional[list]): Acceptable range for value +# +# variable_type (str): Indicates array variable. +# +# dim_labels (List[str]): Labels to use for dimensions +# +# dim_units (Optional[List[str]]): Units to use for dimensions. 
+# """ +# +# pass + + +# class TableVariable(BaseModel): +# """Table variables are used for creating tabular representations of data. Table variables should only be used for client tools. +# +# Attributes: +# table_rows (Optional[List[str]]): List of rows to assign to array data. +# table_data (dict): Dictionary representation of columns and rows. +# rows (list): List of rows. +# columns (list): List of columns. +# """ +# +# table_rows: Optional[List[str]] = None +# table_data: dict +# +# @property +# def columns(self) -> tuple: +# if self.table_data is not None: +# return list(self.table_data.keys()) +# else: +# return None +# +# @validator("table_rows") +# def validate_rows(cls, v): +# if isinstance(v, list): +# for val in v: +# if not isinstance(val, str): +# raise TypeError("Rows must be defined as strings") +# +# else: +# raise TypeError("Rows must be passed as list") +# +# return v +# +# @validator("table_data") +# def table_data_formatted(cls, v, values) -> dict: +# passed_rows = values.get("table_rows", None) +# # validate data... +# if not isinstance(v, dict): +# logger.exception( +# "Must provide dictionary representation of table structure, outer level columns, inner level rows." +# ) +# raise TypeError("Dictionary required") +# +# # check that rows are represented in structure +# for val in v.values(): +# if not isinstance(val, (dict, ArrayVariable)): +# logger.exception( +# "Rows are not represented in structure. Structure should map column title to either dictionary of row names and values or array variables." +# ) +# raise TypeError( +# "Rows are not represented in structure. Structure should map column title to either dictionary of row names and values or array variables." 
+# ) +# +# if isinstance(val, ArrayVariable): +# if passed_rows is None: +# logger.exception("Must pass table_rows when using array variables.") +# raise TypeError("Must pass table_rows when using array variables.") +# +# # shape must match length of passed rows +# elif val.shape[0] != len(passed_rows): +# raise TypeError( +# "Array first dimension must match passed rows length." +# ) +# +# # check row structures to make sure properly formatted +# for val in v.values(): +# +# # check row dictionary +# if isinstance(val, dict): +# if val.get("variable_type", None) is None: +# for row_val in val.values(): +# if not isinstance(row_val, (dict, ScalarVariable, float)): +# logger.exception( +# "Row dictionary must map row names to ScalarVariables or float." +# ) +# raise TypeError( +# "Row dictionary must map row names to ScalarVariables or float." +# ) +# +# # check that row keys align +# if isinstance(row_val, dict) and passed_rows is not None: +# row_rep = row_val.keys() +# for row in row_rep: +# if row not in passed_rows: +# raise TypeError( +# f"Row {row} not in row list passed during construction." 
+# ) +# return v +# +# @property +# def rows(self) -> tuple: +# if self.table_rows is not None: +# return self.table_rows +# else: +# struct_rows = [] +# for col, row_item in self.table_data.items(): +# if isinstance(row_item, dict): +# struct_rows += list(row_item.keys()) +# return list(set(struct_rows)) diff --git a/mkdocs.yml b/mkdocs.yml index 2a2f76b..0e1f7cb 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -1,24 +1,60 @@ site_name: lume-model site_url: https://slaclab.github.io/lume-model repo_url: https://github.com/slaclab/lume-model +repo_name: slaclab/lume-model + nav: - Home: index.md - Variables: variables.md - Models: models.md - - Utils: utils.md -theme: material + - Utilities: utils.md + +theme: + icon: + repo: fontawesome/brands/github + name: material + features: + - navigation.top + - navigation.tabs + - navigation.indexes + palette: + - media: "(prefers-color-scheme: light)" + scheme: default + primary: black + toggle: + icon: material/toggle-switch-off-outline + name: Switch to dark mode + - media: "(prefers-color-scheme: dark)" + scheme: slate + primary: black + toggle: + icon: material/toggle-switch + name: Switch to light mode + +extra: + generator: false + social: + - icon: fontawesome/brands/github + link: https://github.com/slaclab/lume-model + name: LUME-model + plugins: + - search - mkdocstrings: default_handler: python handlers: python: - selection: + options: + docstring_style: "google" inherited_members: false filters: - - "!^_" # exlude all members starting with _ + - "!^_" # exclude all members starting with _ - "^__init__$" # but always include __init__ modules and methods - rendering: + show_bases: true show_source: true + show_root_heading: false + show_root_toc_entry: false + markdown_extensions: - pymdownx.highlight - pymdownx.superfences diff --git a/requirements.txt b/requirements.txt index d7fdd15..58df03c 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,3 +1,3 @@ -pydantic +pydantic>2.3 numpy pyyaml diff --git 
a/tests/test_files/california_regression/torch_model.yml b/tests/test_files/california_regression/torch_model.yml index 7d2c354..b4888cf 100644 --- a/tests/test_files/california_regression/torch_model.yml +++ b/tests/test_files/california_regression/torch_model.yml @@ -42,9 +42,9 @@ input_variables: is_constant: false output_variables: MedHouseVal: {variable_type: scalar} -model: tests/test_files/california_regression/model.pt -input_transformers: [tests/test_files/california_regression/input_transformers_0.pt] -output_transformers: [tests/test_files/california_regression/output_transformers_0.pt] +model: model.pt +input_transformers: [input_transformers_0.pt] +output_transformers: [output_transformers_0.pt] output_format: tensor device: cpu fixed_model: true diff --git a/tests/test_files/california_regression/torch_module.yml b/tests/test_files/california_regression/torch_module.yml index dfdacb6..355caa8 100644 --- a/tests/test_files/california_regression/torch_module.yml +++ b/tests/test_files/california_regression/torch_module.yml @@ -46,9 +46,9 @@ model: is_constant: false output_variables: MedHouseVal: {variable_type: scalar} - model: tests/test_files/california_regression/model.pt - input_transformers: [tests/test_files/california_regression/input_transformers_0.pt] - output_transformers: [tests/test_files/california_regression/output_transformers_0.pt] + model: model.pt + input_transformers: [input_transformers_0.pt] + output_transformers: [output_transformers_0.pt] output_format: tensor device: cpu fixed_model: true diff --git a/tests/test_files/iris_classification/keras_model.yml b/tests/test_files/iris_classification/keras_model.yml index 077a472..27c556b 100644 --- a/tests/test_files/iris_classification/keras_model.yml +++ b/tests/test_files/iris_classification/keras_model.yml @@ -22,6 +22,6 @@ input_variables: is_constant: false output_variables: Species: {variable_type: scalar} -model: tests/test_files/iris_classification/model.keras +model: 
model.keras output_format: array output_transforms: [softmax] diff --git a/tests/test_variables.py b/tests/test_variables.py index 447a33c..ad53f0f 100644 --- a/tests/test_variables.py +++ b/tests/test_variables.py @@ -9,7 +9,7 @@ # ImageOutputVariable, # ArrayInputVariable, # ArrayOutputVariable, - TableVariable, + # TableVariable, ) @@ -292,107 +292,107 @@ def test_output_scalar_variable(variable_name, default, value_range): # ) -@pytest.mark.parametrize( - "rows,variables", - [ - ( - None, - { - "col1": { - "row1": ScalarInputVariable( - name="col1_row1", default=0,value_range=[-1, -1] - ), - "row2": ScalarInputVariable( - name="col1_row2", default=0,value_range=[-1, 1] - ), - }, - "col2": { - "row1": ScalarInputVariable( - name="col2_row1", default=0,value_range=[-1, -1] - ), - "row2": ScalarInputVariable( - name="col2_row2", default=0,value_range=[-1, 1] - ), - }, - }, - ), - pytest.param( - ["row1", "row2"], - { - "col1": { - "row1": ScalarInputVariable( - name="col1_row1", default=0, value_range=[-1, -1] - ), - "row2": 5, - }, - "col2": { - "row1": ScalarInputVariable( - name="col2_row1", default=0,value_range=[-1, -1] - ), - "row2": ScalarInputVariable( - name="col2_row2", default=0,value_range=[-1, 1] - ), - }, - }, - marks=pytest.mark.xfail, - ), - # pytest.param( - # None, - # { - # "col1": ArrayInputVariable( - # name="test", default=np.array([1, 2]), value_range=[0, 10] - # ), - # "col2": { - # "row1": ScalarInputVariable( - # name="col2_row1", default=0, value_range=[-1, -1] - # ), - # "row2": ScalarInputVariable( - # name="col2_row2", default=0, value_range=[-1, 1] - # ), - # }, - # }, - # marks=pytest.mark.xfail, - # ), - # ( - # ["row1", "row2"], - # { - # "col1": ArrayInputVariable( - # name="test", default=np.array([1, 2]), value_range=[0, 10] - # ), - # "col2": { - # "row1": ScalarInputVariable( - # name="col2_row1", default=0, value_range=[-1, -1] - # ), - # "row2": ScalarInputVariable( - # name="col2_row2", default=0, value_range=[-1, 1] - # 
), - # }, - # }, - # ), - # pytest.param( - # ["row1", "row2"], - # { - # "col1": ArrayInputVariable( - # name="test", default=np.array([1, 2, 3, 4]), value_range=[0, 10] - # ), - # "col2": { - # "row1": ScalarInputVariable( - # name="col2_row1", default=0, value_range=[-1, -1] - # ), - # "row2": ScalarInputVariable( - # name="col2_row2", default=0, value_range=[-1, 1] - # ), - # }, - # }, - # marks=pytest.mark.xfail, - # ), - ], -) -def test_variable_table(rows, variables): - if rows: - table_var = TableVariable(table_rows=rows, table_data=variables) - else: - table_var = TableVariable(table_data=variables) - - with pytest.raises(ValueError): - table_var = TableVariable(table_data=None) +# @pytest.mark.parametrize( +# "rows,variables", +# [ +# ( +# None, +# { +# "col1": { +# "row1": ScalarInputVariable( +# name="col1_row1", default=0,value_range=[-1, -1] +# ), +# "row2": ScalarInputVariable( +# name="col1_row2", default=0,value_range=[-1, 1] +# ), +# }, +# "col2": { +# "row1": ScalarInputVariable( +# name="col2_row1", default=0,value_range=[-1, -1] +# ), +# "row2": ScalarInputVariable( +# name="col2_row2", default=0,value_range=[-1, 1] +# ), +# }, +# }, +# ), +# pytest.param( +# ["row1", "row2"], +# { +# "col1": { +# "row1": ScalarInputVariable( +# name="col1_row1", default=0, value_range=[-1, -1] +# ), +# "row2": 5, +# }, +# "col2": { +# "row1": ScalarInputVariable( +# name="col2_row1", default=0,value_range=[-1, -1] +# ), +# "row2": ScalarInputVariable( +# name="col2_row2", default=0,value_range=[-1, 1] +# ), +# }, +# }, +# marks=pytest.mark.xfail, +# ), +# pytest.param( +# None, +# { +# "col1": ArrayInputVariable( +# name="test", default=np.array([1, 2]), value_range=[0, 10] +# ), +# "col2": { +# "row1": ScalarInputVariable( +# name="col2_row1", default=0, value_range=[-1, -1] +# ), +# "row2": ScalarInputVariable( +# name="col2_row2", default=0, value_range=[-1, 1] +# ), +# }, +# }, +# marks=pytest.mark.xfail, +# ), +# ( +# ["row1", "row2"], +# { +# "col1": 
ArrayInputVariable( +# name="test", default=np.array([1, 2]), value_range=[0, 10] +# ), +# "col2": { +# "row1": ScalarInputVariable( +# name="col2_row1", default=0, value_range=[-1, -1] +# ), +# "row2": ScalarInputVariable( +# name="col2_row2", default=0, value_range=[-1, 1] +# ), +# }, +# }, +# ), +# pytest.param( +# ["row1", "row2"], +# { +# "col1": ArrayInputVariable( +# name="test", default=np.array([1, 2, 3, 4]), value_range=[0, 10] +# ), +# "col2": { +# "row1": ScalarInputVariable( +# name="col2_row1", default=0, value_range=[-1, -1] +# ), +# "row2": ScalarInputVariable( +# name="col2_row2", default=0, value_range=[-1, 1] +# ), +# }, +# }, +# marks=pytest.mark.xfail, +# ), +# ], +# ) +# def test_variable_table(rows, variables): +# if rows: +# table_var = TableVariable(table_rows=rows, table_data=variables) +# else: +# table_var = TableVariable(table_data=variables) +# +# with pytest.raises(ValueError): +# table_var = TableVariable(table_data=None)
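
Note on the YAML fixture changes in this diff: the `model`, `input_transformers`, and `output_transformers` entries change from repository-rooted paths (`tests/test_files/california_regression/model.pt`) to bare file names (`model.pt`). The loader code is not part of this diff, so how those names are resolved is not shown here. A minimal sketch, assuming they are resolved relative to the directory containing the YAML file; `resolve_relative_to_config` is a hypothetical helper for illustration and not a lume-model function:

```python
from pathlib import Path


def resolve_relative_to_config(config_file: str, entry: str) -> Path:
    """Resolve a path listed in a config file against that file's directory.

    Hypothetical helper for illustration only; not part of lume-model.
    """
    entry_path = Path(entry)
    # Absolute entries are returned unchanged.
    if entry_path.is_absolute():
        return entry_path
    # Relative entries are joined to the directory that holds the config file.
    return Path(config_file).parent / entry_path


# Paths taken from the fixture shown in the diff above.
config_file = "tests/test_files/california_regression/torch_model.yml"

model_path = resolve_relative_to_config(config_file, "model.pt")
transformer_paths = [
    resolve_relative_to_config(config_file, name)
    for name in ["input_transformers_0.pt", "output_transformers_0.pt"]
]

print(model_path.as_posix())
for path in transformer_paths:
    print(path.as_posix())
```

Under that assumption, the fixtures no longer depend on the working directory from which the tests are launched, which is one plausible motivation for the change.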