Merge pull request #631 from BlockScience/0.4.2

0.4.2
BlockScience · Feb 3, 2025 · f2884fc
2 parents bf10869 + 70ca645
Showing 12 changed files with 397 additions and 6 deletions.
diff --git a/README.md b/README.md
@@ -12,7 +12,7 @@ One good example is the [wiring report](https://github.com/BlockScience/Predator
 
 ## Installing the library
 
-To install the library, simply pip install by running "pip install math_spec_mapping"
+To install the library, run "pip install math-spec-mapping". The PyPI package can be found [here](https://pypi.org/project/math-spec-mapping/).
 
 ## Why MSML?
 

diff --git a/dev/TypedDictionary.ipynb b/dev/TypedDictionary.ipynb
@@ -0,0 +1,147 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 31,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import TypedDict\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 33,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = TypedDict(\"EntityType\", {\"Words\": str, \"Total Length\": int})"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 39,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a.__repr__ = lambda x: \"A\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 48,
+   "metadata": {},
+   "outputs": [
+    {
+     "ename": "TypeError",
+     "evalue": "dict expected at most 1 argument, got 2",
+     "output_type": "error",
+     "traceback": [
+      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
+      "\u001b[0;32m/var/folders/y0/fwkpk2ps087b_2qxvhjstrfr0000gn/T/ipykernel_57262/3336926982.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      7\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 8\u001b[0;31m \u001b[0mTypedDict2\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"EntityType\"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m{\u001b[0m\u001b[0;34m\"Words\"\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"Total Length\"\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mint\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+      "\u001b[0;31mTypeError\u001b[0m: dict expected at most 1 argument, got 2"
+     ]
+    }
+   ],
+   "source": [
+    "class TypedDict2(TypedDict):\n",
+    "    def __init__(self, typename, fields):\n",
+    "        super().__init__(typename, fields)\n",
+    "    def __repr__(self) -> str:\n",
+    "        return f\"CustomTypedDict({super().__repr__()})\"\n",
+    "    \n",
+    "    \n",
+    "TypedDict2(\"EntityType\", {\"Words\": str, \"Total Length\": int})"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "\"Volume_per_Unit\"\n",
+    "\n",
+    "\"VolumeperUnit\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 49,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'Words': str, 'Total Length': int}"
+      ]
+     },
+     "execution_count": 49,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "TypedDict2({\"Words\": str, \"Total Length\": int})"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 50,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "class Test(TypedDict):\n",
+    "    Words: str\n",
+    "    Total_Length: int"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 47,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<class '__main__.Test'>\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(Test)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "BlockScience",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/docs/FAQ.md b/docs/FAQ.md
@@ -1,7 +1,7 @@
 ---
 title: FAQ
 layout: page
-nav_order: 5
+nav_order: 6
 ---
 
 The following are frequently asked questions with regards to MSML.

diff --git a/docs/cadCADBuilder.md b/docs/cadCADBuilder.md
@@ -0,0 +1,55 @@
+---
+title: cadCAD Builder
+layout: page
+nav_order: 5
+---
+
+MSML provides interfaces for creating cadCAD-style models that end users can run without digging into the underlying code. The idea is to give data scientists a layer for experimenting by changing only the starting state and parameters and then executing pre-packaged models.
+
+## Components
+
+### Inputs
+
+The following are required for building the cadCAD model:
+
+1. A math spec object with code bindings
+2. A list of blocks to execute at each timestep
+3. The state preparation and parameter preparation functions that should run automatically on experiment creation
+
+
+### Outputs
+
+The creation function returns the following:
+
+1. State Space: The nested, typed dictionary of all state types in the simulation that the data scientist needs to define
+2. Parameter Space: The dictionary of parameters that the data scientist needs to define
+3. Model: The executable model that can spawn experiments to run when given a valid state and a valid set of parameters
+
+## Creating a cadCAD Model in a Notebook
+
+The [template notebook on building cadCAD models](https://github.com/BlockScience/MSML-Template/blob/main/notebooks/Build%20cadCAD.ipynb) shows how to create a cadCAD model from an MSML model.
+
+## Using cadCAD Model Repositories
+
+A recommended design pattern is to have MSML developers create a batch of cadCAD models to be run within a cadCADModels folder [similar to the template](https://github.com/BlockScience/MSML-Template/tree/main/cadCADModels).
+
+The template also has an example of running the pre-built cadCAD models [here](https://github.com/BlockScience/MSML-Template/blob/main/notebooks/Pre-built%20cadCAD.ipynb).
+
+
+## Model Functionality
+
+### Experiment Creation
+
+- To create an experiment, call model.create_experiment(state, params), passing in a valid state dictionary and a valid parameter dictionary
+- The record_trajectory flag, if set to true, records the state after each step of the simulation
+- The use_deepcopy flag, if set to true, uses deepcopy for the trajectory recording
+
+
+### Running Experiments
+
+- Calling experiment.run(T), where T is the number of timesteps, advances the simulation by that many timesteps
+- If record_trajectory was set to true, the trajectories can be accessed from experiment.trajectories. Otherwise, the current state of the experiment is always available from experiment.state
+
+### Batch Experiments
+
+- Batch experiments can be created and run through model.create_batch_experiments(...). An example is shown in the pre-built cadCAD notebook.
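
The experiment workflow described in the new cadCADBuilder.md page can be illustrated with a toy stand-in. The sketch below does not import MSML: ToyModel, ToyExperiment, and grow are hypothetical names, and only create_experiment, run, state, trajectories, record_trajectory, and use_deepcopy mirror names from the text. Their defaults, and the shallow copy used when use_deepcopy is false, are assumptions made for illustration.

```python
from copy import deepcopy


class ToyExperiment:
    """Hypothetical stand-in for the experiment object described above (not MSML code)."""

    def __init__(self, state, params, step_fn, record_trajectory, use_deepcopy):
        self.state = state
        self.params = params
        self._step_fn = step_fn
        self._record = record_trajectory
        self._use_deepcopy = use_deepcopy
        # Filled only when record_trajectory is true.
        self.trajectories = []

    def _snapshot(self):
        # Assumption: without deepcopy, fall back to a shallow copy of the state.
        return deepcopy(self.state) if self._use_deepcopy else dict(self.state)

    def run(self, T):
        # Advance the simulation by T timesteps, recording after each step if asked.
        for _ in range(T):
            self._step_fn(self.state, self.params)
            if self._record:
                self.trajectories.append(self._snapshot())


class ToyModel:
    """Hypothetical stand-in for the pre-packaged model (not MSML code)."""

    def __init__(self, step_fn):
        self._step_fn = step_fn

    def create_experiment(self, state, params, record_trajectory=False, use_deepcopy=True):
        return ToyExperiment(state, params, self._step_fn, record_trajectory, use_deepcopy)


def grow(state, params):
    # One block executed per timestep: add a fixed number of births.
    state["population"] += params["births_per_step"]


model = ToyModel(grow)
experiment = model.create_experiment(
    {"population": 10}, {"births_per_step": 2}, record_trajectory=True
)
experiment.run(3)
print(experiment.state)
print(experiment.trajectories)
```

With record_trajectory left at false in this sketch, experiment.trajectories stays empty and only experiment.state reflects the progress of the run.
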
diff --git a/pyproject.toml b/pyproject.toml
@@ -3,7 +3,7 @@ requires = ["setuptools>=61.0"]
 build-backend = "setuptools.build_meta"
 [project]
 name = "math-spec-mapping"
-version = "0.4.1.2"
+version = "0.4.2"
 authors = [
   { name="Sean McOwen", email="[email protected]" },
 ]

diff --git a/research_notes/2025-01-27 Concept Specs.md b/research_notes/2025-01-27 Concept Specs.md
@@ -0,0 +1,132 @@
+# Concept Math Specs Ideation
+
+## Executive Summary
+
+- This research note defines the most parsimonious schemas and relations for a concept math spec
+- The idea is that this can be further enhanced and plugged into other interoperable workflows
+- For example, schemas for the spaces can be added at a later point and potentially have specified types
+
+## Proposed JSON Schema
+
+### Space
+
+1. Name: PrimaryKey, string
+2. Description: Optional, string
+
+### Block
+
+1. Name: PrimaryKey, string
+2. Description: Optional, string
+3. Domain: ForeignKeys to Space.Name, List[string]
+4. Codomain: ForeignKeys to Space.Name, List[string]
+
+### Wiring
+
+- Might need a name or something for IDs
+- Might want to consider allowing multiple target and source ports
+
+#### Option 1:
+
+1. Source: string, such as "BlockA-1" for second port of BlockA codomain
+2. Target: string, such as "BlockB-0" for first port of BlockB domain
+
+#### Option 2:
+
+1. SourceBlock: ForeignKey to Block.Name, string
+2. TargetBlock: ForeignKey to Block.Name, string
+3. Source Port: Related to SourceBlock.Codomain as the index, integer
+4. Target Port: Related to TargetBlock.Domain as the index, integer
+
+
+### Systems
+
+1. Name: PrimaryKey, string
+2. Description: Optional, string
+3. Wirings: ForeignKeys to Wiring, List[string]
+
+
+## Current MSML Wiring & Comparison
+
+- Currently a wiring needs two pieces of information: the type (either stack or parallel) and the components, a list of blocks or wirings (a recursive wiring builder makes sure the wirings get created in the right order)
+- For a stack block, each pair of adjacent blocks is validated to check that the domain and codomain map onto each other
+- Internally, all the port wirings are automatically built out for both mapping and execution
+- Wirings built this way could be automatically mapped to the proposed "Wiring" schema in concept specs, or the schema could be changed
+- The wirings are similar to a DAG
+- A stack block wiring has the domain of its first component and the codomain of its last component
+- A parallel block has the combined domains of its components as its domain and their combined codomains as its codomain
+- There are potential future wiring types, such as a looping block or a notation for multiple outputs of the same space, e.g. a block with a variable number (1 to N) of outputs that all follow the same codomain space schema
+
+
+## Proposed Relational Schema V2
+
+### Space
+
+1. ID: PrimaryKey
+2. Name: string
+3. Description: Optional, string
+
+### Block
+
+1. ID: PrimaryKey
+2. Name: string
+3. Description: Optional, string
+
+### Ports
+
+1. ID: PrimaryKey
+2. Name: string
+3. Index: Integer
+4. Space: ForeignKey to Space.Name, string
+
+### Terminals
+
+1. ID: PrimaryKey
+2. Name: string
+3. Index: Integer
+4. Space: ForeignKey to Space.Name, string
+
+## Proposed Schema V2
+
+### Space
+
+{"ID": PrimaryKey,
+"Name": string,
+"Description": string}
+
+### Block
+
+{"ID": PrimaryKey,
+"Name": string,
+"Description": string,
+"Domain": List[Space.ID],
+"Codomain": List[Space.ID]}
+
+## Concrete Block
+
+{"ID": PrimaryKey,
+"Name": string,
+"Description": string,
+"Parent": Block.ID}
+
+- Many concrete blocks may share the same abstract parent block (a many-to-one relationship)
+
+## Wire (Concrete Space)
+
+{"ID": PrimaryKey,
+"Parent": Space.ID,
+"SourceBlock": ConcreteBlock.ID,
+"TargetBlock": ConcreteBlock.ID,
+"SourceIndex": int,
+"TargetIndex": int}
+
+- Check matches up
+
+## System
+
+{"ConcreteBlocks": List[ConcreteBlock.ID],
+"Wires": List[Wire.ID]}
+
+- Check that:
+    - All of the domain ports are filled
+    - Each domain port gets only one input
+    - Terminals can connect to 0...N ports (each connection just receives a replica of the same output)
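
The System checks listed at the end of the research note can be sketched in code. This is a minimal illustration under stated assumptions, not part of MSML: the dataclasses, the validate_system function, and the Producer/Consumer example are all hypothetical, and "Check matches up" is interpreted here as requiring that the wire's parent space equal both the source terminal's space and the target port's space.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    id: str
    domain: tuple    # Space IDs, one per input port
    codomain: tuple  # Space IDs, one per output terminal


@dataclass(frozen=True)
class ConcreteBlock:
    id: str
    parent: str  # Block.ID


@dataclass(frozen=True)
class Wire:
    id: str
    parent: str  # Space.ID
    source_block: str
    target_block: str
    source_index: int
    target_index: int


def space_at(spaces, index):
    # Return the space at a port/terminal index, or None if the index is out of range.
    return spaces[index] if 0 <= index < len(spaces) else None


def validate_system(blocks, concrete_blocks, wires):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    abstract = {cb.id: blocks[cb.parent] for cb in concrete_blocks}
    inputs = Counter()
    for w in wires:
        # Assumed meaning of "Check matches up": spaces agree at both ends of the wire.
        if space_at(abstract[w.source_block].codomain, w.source_index) != w.parent:
            problems.append(f"{w.id} does not match its source terminal")
        if space_at(abstract[w.target_block].domain, w.target_index) != w.parent:
            problems.append(f"{w.id} does not match its target port")
        inputs[(w.target_block, w.target_index)] += 1
    for cb_id, block in abstract.items():
        for i in range(len(block.domain)):
            count = inputs[(cb_id, i)]
            if count == 0:
                problems.append(f"{cb_id} domain port {i} is unfilled")
            elif count > 1:
                problems.append(f"{cb_id} domain port {i} has {count} inputs")
    return problems


blocks = {
    "Producer": Block("Producer", (), ("Price",)),
    "Consumer": Block("Consumer", ("Price",), ()),
}
concrete = [
    ConcreteBlock("p1", "Producer"),
    ConcreteBlock("c1", "Consumer"),
    ConcreteBlock("c2", "Consumer"),
]
# One terminal fans out to two ports, which the note allows (0...N ports per terminal).
wires = [
    Wire("w1", "Price", "p1", "c1", 0, 0),
    Wire("w2", "Price", "p1", "c2", 0, 0),
]
print(validate_system(blocks, concrete, wires))
print(validate_system(blocks, concrete, wires[:1]))
```

Terminals are deliberately left unchecked for fan-out, since the note permits a terminal to feed any number of ports, while each domain port is required to have exactly one incoming wire.
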