
dylanabramson33/DAGFlow


DAGFlow

What is DAGFlow?

DAGFlow is a tool to create and manage directed acyclic graphs for training and deploying machine learning models.

Why did I create DAGFlow?

I noticed that I was duplicating a lot of functionality across my ML projects. With DAGFlow, I hope to create a framework that encourages modular processing/querying so components can be reused easily across projects. I also wanted a low-code solution that lets scientists without web development experience deploy ML pipelines.

How it works

The core object in DAGFlow is the "flow". A flow defines a sequence of data transformations carried out by "Nodes". Below is an example flow that takes an input dataframe with a SMILES field (a string representation of a molecule) and adds a column containing the bond adjacency matrix of each molecule:

from Flows import createflow
from Node import nodify
from rdkit import Chem
import pandas as pd

# Source node: reads a CSV file that contains a SMILES column.
@nodify(node_type='Source', fields={'SMILES': 'SMILES'})
def ChemCSVReader(inp: str) -> pd.DataFrame:
    df = pd.read_csv(inp)
    return df

# Parses each SMILES string into an RDKit molecule.
@nodify(requires=['SMILES'], adds=['MOLS'])
def ChemAddMol(inp: pd.DataFrame) -> pd.DataFrame:
    inp['MOLS'] = inp['SMILES'].apply(Chem.MolFromSmiles)
    return inp

# Computes the bond adjacency matrix of each molecule.
@nodify(requires=['MOLS'], adds=['GRAPH'])
def GetMoleculeGraph(inp: pd.DataFrame) -> pd.DataFrame:
    inp['GRAPH'] = inp['MOLS'].apply(Chem.rdmolops.GetAdjacencyMatrix)
    return inp

# Chain the nodes into a flow; each node consumes the previous node's output.
@createflow
def test_flow():
    df = ChemCSVReader('./solubility_data.csv')
    df = ChemAddMol(df)
    df = GetMoleculeGraph(df)
    return df


# Compile the flow, then run it.
test_flow.compile()
output = test_flow.run()
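
To make the `requires` and `adds` arguments more concrete, here is a minimal sketch of how a decorator of this kind could validate columns before and after a node runs. This is not DAGFlow's actual implementation: `check_columns` and `add_length` are hypothetical names, and a plain dict of columns stands in for the DataFrame so the example runs without pandas or RDKit. The checks only use `name in table`, which also works on a pandas DataFrame.

```python
import functools


def check_columns(requires=(), adds=()):
    """Hypothetical stand-in for a nodify-style decorator (not DAGFlow's code)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(table):
            # Fail early if an upstream node did not provide a needed column.
            missing = [c for c in requires if c not in table]
            if missing:
                raise KeyError(f"{func.__name__} is missing columns: {missing}")
            result = func(table)
            # Verify that the node added the columns it promised to add.
            not_added = [c for c in adds if c not in result]
            if not_added:
                raise KeyError(f"{func.__name__} did not add columns: {not_added}")
            return result
        return wrapper
    return decorator


@check_columns(requires=["SMILES"], adds=["LENGTH"])
def add_length(table):
    # Toy transformation: store the length of each SMILES string.
    table["LENGTH"] = [len(s) for s in table["SMILES"]]
    return table


table = add_length({"SMILES": ["CCO", "c1ccccc1"]})
```

Declaring inputs and outputs this way means a node placed in the wrong position fails with a clear message instead of a confusing error deep inside the transformation.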

GUI

No DAG library is complete without a slick GUI. DAGFlow includes DAGWeb, a Flask app for creating DAGs with a drag-and-drop UI. Here is the same flow as above represented in the GUI.
[Screenshot of the example flow in the DAGWeb GUI]

The GUI contains a basic type checker that ensures only nodes of compatible types can be linked together. Type annotations are currently required; they will become optional in the future.
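
As a rough illustration of annotation-based link checking, the sketch below compares the return annotation of an upstream function with the annotation of the downstream function's first parameter. It is an assumption about how such a check could work, not DAGWeb's actual code; `can_link` and the three example functions are hypothetical, and the comparison is exact equality, so subclasses and generic types are not handled.

```python
import inspect
from typing import get_type_hints


def can_link(upstream, downstream):
    """Return True if upstream's return type equals downstream's first input type."""
    out_type = get_type_hints(upstream).get("return")
    params = list(inspect.signature(downstream).parameters)
    if out_type is None or not params:
        return False  # a missing return annotation or input cannot be checked
    in_type = get_type_hints(downstream).get(params[0])
    if in_type is None:
        return False  # the first parameter has no annotation
    return out_type == in_type


def read_lines(path: str) -> list:
    with open(path) as handle:
        return handle.read().splitlines()


def count_items(items: list) -> int:
    return len(items)


def untyped(items):
    return items


ok = can_link(read_lines, count_items)        # list -> list
bad = can_link(count_items, read_lines)       # int -> str
unknown = can_link(read_lines, untyped)       # downstream lacks annotations
```

Treating missing annotations as "cannot link" mirrors the current requirement that annotations be present; a more permissive checker could instead accept unannotated nodes.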

About

My take on a directed acyclic graph library
